I have a DataFrame like this:
Code DIAG
Time
1999-12-01 00:00:01.870 None
1999-12-01 00:00:10.870 None
2000-01-01 09:10:09.870 None
2000-01-01 09:10:10.870 None
2000-01-01 09:00:10.940 None
2000-01-01 09:00:11.160 None
2000-01-01 09:00:11.640 None
2000-01-01 09:00:12.460 None
2010-01-01 09:00:34.910 1_19_1_4_0_0
2010-01-01 09:00:35.060 3_22_4_0_0_0
2010-01-01 09:00:35.120 6_22_10_3_0_0
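I want to backfill the missing values only within the hour before each data point, and change the label, so that the data looks like this: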
Code DIAG
Time
1999-12-01 00:00:01.870 None
1999-12-01 00:00:10.870 None
2000-01-01 09:10:09.870 1_19_1_4_0_0_H
2000-01-01 09:10:10.870 1_19_1_4_0_0_H
2000-01-01 09:00:10.940 1_19_1_4_0_0_H
2000-01-01 09:00:11.160 1_19_1_4_0_0_H
2000-01-01 09:00:11.640 1_19_1_4_0_0_H
2000-01-01 09:00:12.460 1_19_1_4_0_0_H
2010-01-01 09:00:34.910 1_19_1_4_0_0_H
2010-01-01 09:00:35.060 3_22_4_0_0_0
2010-01-01 09:00:35.120 6_22_10_3_0_0
I wrote this code:
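import pandas as pd

def FillData(dff):
    # Backfill missing values within the group
    s = dff.bfill()
    # Append '_H' to every non-null value in the group
    s.loc[s.notnull()] = s.astype('str') + '_H'
    return s

df = A['DIAG'].groupby(pd.Grouper(freq='H')).apply(FillData)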
The problem is that it produces the following output:
Code DIAG
Time
1999-12-01 00:00:01.870 None
1999-12-01 00:00:10.870 None
2000-01-01 09:10:09.870 None
2000-01-01 09:10:10.870 None
2000-01-01 09:00:10.940 None
2000-01-01 09:00:11.160 None
2000-01-01 09:00:11.640 None
2000-01-01 09:00:34.460 1_19_1_4_0_0_H
2010-01-01 09:00:34.910 1_19_1_4_0_0_H
2010-01-01 09:00:35.060 3_22_4_0_0_0_H
2010-01-01 09:00:35.120 6_22_10_3_0_0_H
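I see two main problems. The first is that groupby does not seem to group by hour (H), only by minute. The second is that it appends the label (_H) to every row. My main goal is to tag the rows in the 1 hour before data occurs with H, and the rows in the 1 week before data occurs with W.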
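A small check of how hourly grouping behaves may help here. As far as I understand, an hourly `pd.Grouper` puts rows into fixed clock-hour bins (08:00 to 09:00, 09:00 to 10:00, and so on) rather than into a window that ends at each event. The sketch below imitates those bins with `DatetimeIndex.floor`; the sample timestamps and the names `idx`, `diag`, `bins` and `filled` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: two missing rows followed by one event.
idx = pd.to_datetime([
    "2010-01-01 08:59:59.000",  # 36 s before the event, but in the 08:00 bin
    "2010-01-01 09:00:12.460",  # same clock hour as the event
    "2010-01-01 09:00:34.910",  # the event itself
])
diag = pd.Series([None, None, "1_19_1_4_0_0"], index=idx, name="DIAG")

# Assign each row to the clock hour it falls in.
bins = idx.floor("h")

# Backfill inside each clock-hour bin only.
filled = diag.groupby(bins).bfill()
print(filled)
```

With this data the 08:59:59 row stays missing even though it is well within an hour of the event, because the backfill cannot cross the 09:00 bin boundary.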
I would appreciate it if someone could help. I have spent a lot of time on this, but I cannot find a straightforward way to do it.
Thanks
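One possible direction, as a minimal sketch rather than a definitive solution: instead of grouping by calendar hour, measure for every missing row how far away the next non-missing value is, and label it by that gap. The function name `label_lead_up`, the sample data and the `windows` argument are assumptions made for this illustration. The sketch also assumes that the index can be sorted chronologically, that only previously missing rows get a suffix, and that the original event rows are left unchanged (the desired output above shows `_H` on the first event row as well, which this sketch does not reproduce).

```python
import pandas as pd

def label_lead_up(diag, windows):
    """Label missing rows by the time gap to the next non-missing value.

    diag    : Series of codes (None where missing) with a DatetimeIndex.
    windows : list of (suffix, pd.Timedelta) pairs, narrowest window first.
    """
    diag = diag.sort_index()
    missing = diag.isna()

    # Code and timestamp of the next non-missing row, for every row.
    next_code = diag.bfill()
    times = diag.index.to_series()
    next_time = times.where(~missing).bfill()
    gap = next_time - times  # NaT where no later event exists

    out = diag.copy()
    todo = missing.copy()
    for suffix, window in windows:
        # Missing rows not yet labeled that fall inside this window.
        hit = todo & (gap <= window)
        out.loc[hit] = (next_code.loc[hit] + "_" + suffix).to_numpy()
        todo = todo & ~hit
    return out

# Hypothetical sample data in chronological order.
idx = pd.to_datetime([
    "2009-12-01 00:00:01.870",  # more than a week before: stays missing
    "2009-12-28 12:00:00.000",  # within a week
    "2010-01-01 07:30:00.000",  # within a week, more than an hour
    "2010-01-01 08:30:10.940",  # within an hour
    "2010-01-01 09:00:12.460",  # within an hour
    "2010-01-01 09:00:34.910",
    "2010-01-01 09:00:35.060",
    "2010-01-01 09:00:35.120",
])
codes = [None] * 5 + ["1_19_1_4_0_0", "3_22_4_0_0_0", "6_22_10_3_0_0"]
diag = pd.Series(codes, index=idx, name="DIAG")

labeled = label_lead_up(
    diag,
    [("H", pd.Timedelta(hours=1)), ("W", pd.Timedelta(weeks=1))],
)
print(labeled)
```

Listing the narrowest window first means a row that is within the hour gets H and is not relabeled with W afterwards.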