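Suppose I have a pandas DataFrame df and I want to interpolate its missing values.
In the first case, I tried to interpolate the whole DataFrame at once, but I got a warning message and the interpolation did not take effect.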
[In] interpolateList = [x for x in xlsx_df.columns if x not in ['Date', 'Time', 'DateTime', 'Year', 'YearMonth']]
# interpolation
[In] xlsx_df[interpolateList].interpolate(method='linear', inplace=True) # axis: default 0, which means col by col
print('Whether there are any NaN value: ', xlsx_df.isnull().values.any())
[Out] Whether there are any NaN value: True
/home/usrname/.local/lib/python3.6/site-packages/ipykernel_launcher.py:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  after removing the cwd from sys.path.
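In the second case, I interpolated each column separately (so each one is a pandas Series), and it worked as I expected.
I double-checked the result with a visualization tool, and it looks fine.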
[In] interpolateList = [x for x in xlsx_df.columns if x not in ['Date', 'Time', 'DateTime', 'Year', 'YearMonth']]
# interpolation
[In] for col in interpolateList:
         xlsx_df[col].interpolate(method='linear', inplace=True) # axis: default 0, which means col by col
print('Whether there are any NaN value: ', xlsx_df.isnull().values.any())
[Out] Whether there are any NaN value: False
Why did case 1 fail? Is it because I selected the DataFrame's columns the wrong way?
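The problem is that you are trying to assign new values to a subset of the original DataFrame, as the warning message says: "A value is trying to be set on a copy of a slice from a DataFrame." Selecting columns with a list, as in xlsx_df[interpolateList], returns a copy, so the inplace interpolation modifies that temporary copy and xlsx_df itself is left unchanged.
You need to explicitly assign the result back to the slice of the DataFrame you want to redefine, like this:
xlsx_df[interpolateList] = xlsx_df[interpolateList].interpolate(method='linear')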
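To make the difference concrete, here is a minimal sketch that reproduces both behaviours. The sample data, the column names "a" and "b", and the helper variables nan_after_case1 and nan_after_fix are made up for illustration; only xlsx_df and interpolateList come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for xlsx_df (not from the original post).
xlsx_df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"],
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [10.0, 20.0, np.nan, 40.0],
})
interpolateList = [x for x in xlsx_df.columns
                   if x not in ['Date', 'Time', 'DateTime', 'Year', 'YearMonth']]

# Case 1: the inplace call acts on the temporary copy returned by
# xlsx_df[interpolateList], so xlsx_df itself keeps its NaN values.
# pandas may emit a warning on this line; that is the behaviour being shown.
xlsx_df[interpolateList].interpolate(method='linear', inplace=True)
nan_after_case1 = bool(xlsx_df.isnull().values.any())

# Fix: interpolate() returns a new DataFrame; assign it back to the same columns.
xlsx_df[interpolateList] = xlsx_df[interpolateList].interpolate(method='linear')
nan_after_fix = bool(xlsx_df.isnull().values.any())

print('NaN left after case 1:', nan_after_case1)
print('NaN left after the fix:', nan_after_fix)
```

Assigning the result back does not depend on whether a selection happens to be a view or a copy. The per-column inplace loop from the second case may also stop updating the original on newer pandas releases that use copy-on-write, so the assignment form is probably the safer habit.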