How to replace NaN and NaT with None in pandas 0.24.1

Posted 2024-09-30 22:10:35


I need to replace every NaN and NaT in a pandas.Series with None.

I tried this:

def replaceMissing(ser):
    return ser.where(pd.notna(ser), None)

But it doesn't work:

import pandas as pd

NaN = float('nan')
NaT = pd.NaT

floats1 = pd.Series((NaN, NaN, 2.71828, -2.71828))
floats2 = pd.Series((2.71828, -2.71828, 2.71828, -2.71828))
dates = pd.Series((NaT, NaT, pd.Timestamp("2019-07-09"), pd.Timestamp("2020-07-09")))


def replaceMissing(ser):
    return ser.where(pd.notna(ser), None)


print(pd.__version__)
print(80*"-")
print(replaceMissing(dates))
print(80*"-")
print(replaceMissing(floats1))
print(80*"-")
print(replaceMissing(floats2))

As you can see, NaT is not replaced:

0.24.1
--------------------------------------------------------------------------------
0          NaT
1          NaT
2   2019-07-09
3   2020-07-09
dtype: datetime64[ns]
--------------------------------------------------------------------------------
0       None
1       None
2    2.71828
3   -2.71828
dtype: object
--------------------------------------------------------------------------------
0    2.71828
1   -2.71828
2    2.71828
3   -2.71828
dtype: float64
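
One plausible reading of this output (my assumption, the post itself does not explain it): a datetime64[ns] Series has no way to store a Python None, so a None written into it ends up as NaT, while an object Series keeps it. A minimal sketch of that coercion, shown at construction time rather than through where:

```python
import pandas as pd

ts = pd.Timestamp("2019-07-09")

# With a datetime64[ns] dtype, the None is coerced to NaT.
dates = pd.Series([None, ts], dtype="datetime64[ns]")
print(dates.dtype)
print(dates.iloc[0] is pd.NaT)

# With an explicit object dtype, the None is kept as it is.
as_object = pd.Series([None, ts], dtype=object)
print(as_object.dtype)
print(as_object.iloc[0] is None)
```

If that reading is right, the dates Series has to become object before None can survive in it.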

Then I tried this extra step:

def replaceMissing(ser):
    ser = ser.where(pd.notna(ser), None)
    return ser.replace({pd.NaT: None})

But it still doesn't work. For some reason, it brings the NaNs back:

0.24.1
--------------------------------------------------------------------------------
0                   None
1                   None
2    2019-07-09 00:00:00
3    2020-07-09 00:00:00
dtype: object
--------------------------------------------------------------------------------
0        NaN
1        NaN
2    2.71828
3   -2.71828
dtype: float64
--------------------------------------------------------------------------------
0    2.71828
1   -2.71828
2    2.71828
3   -2.71828
dtype: float64

I also tried converting the series to object:

def replaceMissing(ser):
    return ser.astype("object").where(pd.notna(ser), None)

But now the last series is also object, even though it has no missing values:

0.24.1
--------------------------------------------------------------------------------
0                   None
1                   None
2    2019-07-09 00:00:00
3    2020-07-09 00:00:00
dtype: object
--------------------------------------------------------------------------------
0       None
1       None
2    2.71828
3   -2.71828
dtype: object
--------------------------------------------------------------------------------
0    2.71828
1   -2.71828
2    2.71828
3   -2.71828
dtype: object

I want it to stay float64. So I added infer_objects:

def replaceMissing(ser):
    return ser.astype("object").where(pd.notna(ser), None).infer_objects()

But it brings the NaNs back again:

0.24.1
--------------------------------------------------------------------------------
0                   None
1                   None
2    2019-07-09 00:00:00
3    2020-07-09 00:00:00
dtype: object
--------------------------------------------------------------------------------
0        NaN
1        NaN
2    2.71828
3   -2.71828
dtype: float64
--------------------------------------------------------------------------------
0    2.71828
1   -2.71828
2    2.71828
3   -2.71828
dtype: float64
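
The float result above is consistent with how infer_objects behaves on a small example: an object Series holding None and floats fits float64, and float64 can only represent the missing entry as NaN. A minimal sketch (the variable names are only illustrative):

```python
import pandas as pd

# An object Series that holds None next to a plain float.
mixed = pd.Series([None, 2.71828], dtype=object)

# infer_objects() looks for a more specific dtype. float64 fits,
# but float64 cannot store None, so the gap is NaN again.
inferred = mixed.infer_objects()

print(mixed.dtype, inferred.dtype)
print(inferred.isna().tolist())
```

So calling infer_objects after the replacement undoes the replacement for numeric data.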

I feel like there must be a simple way to do this. Does anyone know one?


1 answer

Answer 1 · posted 2024-09-30 22:10:35

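Your second solution works for me if the order of the two operations is swapped (tested in 0.24.2). The dtype still changes to object, though, because the result mixes types: None alongside floats or timestamps: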

def replaceMissing(ser):
    return ser.replace({pd.NaT: None}).where(pd.notna(ser), None)

print(pd.__version__)
print(80*"-")
print(replaceMissing(dates))
print(80*"-")
print(replaceMissing(dates).apply(type))
print(80*"-")
print(replaceMissing(floats1))
print(80*"-")
print(replaceMissing(floats1).apply(type))
print(80*"-")
print(replaceMissing(floats2))

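0.24.2
--------------------------------------------------------------------------------
0                   None
1                   None
2    2019-07-09 00:00:00
3    2020-07-09 00:00:00
dtype: object
--------------------------------------------------------------------------------
0                                   <class 'NoneType'>
1                                   <class 'NoneType'>
2    <class 'pandas._libs.tslibs.timestamps.Timesta...
3    <class 'pandas._libs.tslibs.timestamps.Timesta...
dtype: object
--------------------------------------------------------------------------------
0       None
1       None
2    2.71828
3   -2.71828
dtype: object
--------------------------------------------------------------------------------
0    <class 'NoneType'>
1    <class 'NoneType'>
2       <class 'float'>
3       <class 'float'>
dtype: object
--------------------------------------------------------------------------------
0    2.71828
1   -2.71828
2    2.71828
3   -2.71828
dtype: float64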
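
A variant that may be less sensitive to the pandas version, offered as a sketch and not as part of the answer above: check for missing values first and only cast to object when there are some. The function name replace_missing_if_any is made up for this example. It reuses the astype("object") plus where combination from the question.

```python
import pandas as pd


def replace_missing_if_any(ser):
    """Return ser with NaN/NaT swapped for None (hypothetical helper)."""
    missing = ser.isna()
    if not missing.any():
        # Nothing to replace, so the original dtype is kept.
        return ser
    # None needs an object container; cast first, then fill the gaps.
    return ser.astype(object).where(~missing, None)


floats1 = pd.Series([float("nan"), float("nan"), 2.71828, -2.71828])
floats2 = pd.Series([2.71828, -2.71828, 2.71828, -2.71828])
dates = pd.Series([pd.NaT, pd.NaT, pd.Timestamp("2019-07-09"),
                   pd.Timestamp("2020-07-09")])

for ser in (dates, floats1, floats2):
    result = replace_missing_if_any(ser)
    print(result.dtype, result.tolist())
```

A Series with gaps still ends up as object, since None cannot live in float64 or datetime64[ns], but a Series without gaps keeps its dtype.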
