Approximate interpolation within groups, based on another column

import numpy as np
import pandas as pd

testing = pd.DataFrame({
    'col': [1, np.nan, np.nan, 7, 1, np.nan, np.nan, 7],
    'col2': ['01-MAY-17 15:47:00', '01-MAY-17 15:57:00',
             '07-MAY-17 15:47:00', '07-MAY-17 22:07:00',
             '01-MAY-17 15:47:00', '01-MAY-17 15:57:00',
             '07-MAY-17 15:47:00', '07-MAY-17 22:07:00'],
    'Customer_id': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
})

testing['col2'] = pd.to_datetime(testing['col2'])
testing['index1'] = testing.index
testing = testing.set_index('col2')
testing.apply(lambda group: group.interpolate(method='slinear'))
test_int = testing.interpolate(method='slinear')
test_int['col2'] = test_int.index
test_int = test_int.set_index('index1')
test_int

1 Answer

User

#1 · Posted 2024-06-24 11:45:34

IIUC, once you call set_index on the date column, you can interpolate each group with method='index', for example:

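testing.col2 = pd.to_datetime(testing.col2)
# group_keys=False keeps col2 as the only index level, so reset_index()
# does not try to insert a second Customer_id column
print(testing.set_index('col2').groupby('Customer_id', group_keys=False)
             .apply(lambda x: x.interpolate(method='index')).reset_index())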
                 col2       col Customer_id
0 2017-05-01 15:47:00  1.000000           A
1 2017-05-01 15:57:00  1.006652           A
2 2017-05-07 15:47:00  6.747228           A
3 2017-05-07 22:07:00  7.000000           A
4 2017-05-01 15:47:00  1.000000           B
5 2017-05-01 15:57:00  1.006652           B
6 2017-05-07 15:47:00  6.747228           B
7 2017-05-07 22:07:00  7.000000           B
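
For reference, here is a self-contained sketch of the same idea that interpolates only the numeric column, using groupby(...).transform rather than apply. This variant is not from the answer above. It assumes the date strings follow the layout '%d-%b-%y %H:%M:%S', and the names indexed and result are just illustrative. Interpolating only col avoids running interpolate over the string column Customer_id.

```python
import numpy as np
import pandas as pd

testing = pd.DataFrame({
    'col': [1, np.nan, np.nan, 7, 1, np.nan, np.nan, 7],
    'col2': ['01-MAY-17 15:47:00', '01-MAY-17 15:57:00',
             '07-MAY-17 15:47:00', '07-MAY-17 22:07:00',
             '01-MAY-17 15:47:00', '01-MAY-17 15:57:00',
             '07-MAY-17 15:47:00', '07-MAY-17 22:07:00'],
    'Customer_id': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
})

# Assumption: every string follows day-month abbreviation-two digit year.
testing['col2'] = pd.to_datetime(testing['col2'], format='%d-%b-%y %H:%M:%S')

# Put the timestamps in the index so method='index' can weight by elapsed time.
indexed = testing.set_index('col2')

# Interpolate 'col' separately for each customer; transform returns a
# Series aligned with the original rows, so it can be assigned straight back.
indexed['col'] = (
    indexed.groupby('Customer_id')['col']
           .transform(lambda s: s.interpolate(method='index'))
)

result = indexed.reset_index()
print(result)
```

With this data the gap between the first and last timestamp of each customer is 9020 minutes, so the value 10 minutes in is 1 + 6 * 10 / 9020, roughly 1.006652, which agrees with the output shown above.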
