Unstacking a large pandas DataFrame with two indexes over a time window

| car_id | timestamp           | gas | odometer | temperature |
|--------|---------------------|-----|----------|-------------|
| aac43f | 2019-10-05 14:00:00 | 70  | 152042   | 87          |
| aac43f | 2019-10-05 15:00:00 | 63  | 152112   | 88          |
| aac43f | 2019-10-05 18:00:00 | 44  | 152544   | 93          |
| bg112  | 2019-08-22 09:00:00 | 90  | 1242     | 85          |
| bg112  | 2019-08-22 10:00:00 | 89  | 1270     | 85          |
| 32rre  | 2019-01-01 12:00:00 | 20  | 84752    | 74          |

| car_id | timestamp           | gas | gas_1h_ago | gas_2h_ago | odometer | o_1h   | o_2h | temperature | t_1h_ago | t_2h_ago |
|--------|---------------------|-----|------------|------------|----------|--------|------|-------------|----------|----------|
| aac43f | 2019-10-05 14:00:00 | 70  | NaN        | NaN        | 152042   | NaN    | NaN  | 87          | NaN      | NaN      |
| aac43f | 2019-10-05 15:00:00 | 63  | 70         | NaN        | 152112   | 152042 | NaN  | 88          | 87       | NaN      |
| aac43f | 2019-10-05 18:00:00 | 44  | NaN        | NaN        | 152544   | NaN    | NaN  | 93          | NaN      | NaN      |
| bg112  | 2019-08-22 09:00:00 | 90  | NaN        | NaN        | 1242     | NaN    | NaN  | 85          | NaN      | NaN      |
| bg112  | 2019-08-22 10:00:00 | 89  | 90         | NaN        | 1270     | 1242   | NaN  | 85          | 85       | NaN      |
| 32rre  | 2019-01-01 12:00:00 | 20  | NaN        | NaN        | 84752    | NaN    | NaN  | 74          | NaN      | NaN      |
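To experiment with these tables, the input can be rebuilt as a small DataFrame. This is only a minimal sketch: the values are copied from the first table, the variable name `df` is chosen to match the name used in the answer's code, and parsing `timestamp` with `pd.to_datetime` is an assumption about how the real data is stored.

```python
import pandas as pd

# Sample input copied from the first table in the question.
df = pd.DataFrame({
    "car_id": ["aac43f", "aac43f", "aac43f", "bg112", "bg112", "32rre"],
    # Assumption: timestamps are real datetimes, not strings.
    "timestamp": pd.to_datetime([
        "2019-10-05 14:00:00", "2019-10-05 15:00:00", "2019-10-05 18:00:00",
        "2019-08-22 09:00:00", "2019-08-22 10:00:00", "2019-01-01 12:00:00",
    ]),
    "gas": [70, 63, 44, 90, 89, 20],
    "odometer": [152042, 152112, 152544, 1242, 1270, 84752],
    "temperature": [87, 88, 93, 85, 85, 74],
})

print(df.dtypes)
```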

1 Answer

User

#1 · Posted on 2024-06-23 18:30:18

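You can use `pd.concat`.

Use `groupby` + `resample` to build hourly samples for each car. Use `sum` + `shift` to move each hour's value forward to the row x hours later. Use `reindex` to return the DataFrame to its original rows. Before concatenating, add a suffix to the columns for the given number of hours with `add_suffix`. Do this for each value of x hours, then join the results with `pd.concat`.

Finally, use `pd.concat` again to merge the resulting DataFrame with the original DataFrame. Use `set_index` + `sort_index` + `reset_index` to rearrange the columns.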

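```python
import pandas as pd

# Assumes df has a default RangeIndex, df['timestamp'] is a datetime column,
# and timestamps are unique across cars (reindex looks rows up by timestamp alone).
hours_ago = [1, 2]

# Build one lagged DataFrame per entry in hours_ago, then concatenate them
df_x_hours_ago = pd.concat(
    [
        (
            df.groupby('car_id')
              # Hourly bins per car; min_count=1 keeps empty hours as NaN.
              # 'h' replaces the alias 'H', which recent pandas versions deprecate.
              .apply(lambda x: x.resample('h', on='timestamp')
                                .sum(min_count=1, numeric_only=True)
                                .shift(hour))
              .reset_index(level='car_id', drop=True)
              .reindex(index=df['timestamp'])  # back to the original rows
              .add_suffix(f'_{hour}h_ago')
              .reset_index(drop=True)
        )
        for hour in hours_ago
    ],
    axis=1,
)

# Concatenate with the original DataFrame and order the columns
new_df = (
    pd.concat([df, df_x_hours_ago], axis=1)
      .set_index(['car_id', 'timestamp'])
      .sort_index(axis=1)
      .reset_index()
)
print(new_df)
```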

Output

```
   car_id           timestamp  gas  gas_1h_ago  gas_2h_ago  odometer  \
0  aac43f 2019-10-05 14:00:00   70         NaN         NaN    152042   
1  aac43f 2019-10-05 15:00:00   63        70.0         NaN    152112   
2  aac43f 2019-10-05 18:00:00   44         NaN         NaN    152544   
3   bg112 2019-08-22 09:00:00   90         NaN         NaN      1242   
4   bg112 2019-08-22 10:00:00   89        90.0         NaN      1270   
5   32rre 2019-01-01 12:00:00   20         NaN         NaN     84752   

   odometer_1h_ago  odometer_2h_ago  temperature  temperature_1h_ago  \
0              NaN              NaN           87                 NaN   
1         152042.0              NaN           88                87.0   
2              NaN              NaN           93                 NaN   
3              NaN              NaN           85                 NaN   
4           1242.0              NaN           85                85.0   
5              NaN              NaN           74                 NaN   

   temperature_2h_ago  
0                 NaN  
1                 NaN  
2                 NaN  
3                 NaN  
4                 NaN  
5                 NaN
```
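As a cross-check, the same lagged columns can be sketched without resampling: shift each reading's timestamp forward by x hours and left-merge it back onto the original rows by `car_id` and `timestamp`. This is a different technique from the answer's resample approach, not a restatement of it. The helper name `add_lagged_columns` is hypothetical, and the sketch assumes each (`car_id`, `timestamp`) pair occurs only once.

```python
import pandas as pd

# Sample input copied from the question's first table.
df = pd.DataFrame({
    "car_id": ["aac43f", "aac43f", "aac43f", "bg112", "bg112", "32rre"],
    "timestamp": pd.to_datetime([
        "2019-10-05 14:00:00", "2019-10-05 15:00:00", "2019-10-05 18:00:00",
        "2019-08-22 09:00:00", "2019-08-22 10:00:00", "2019-01-01 12:00:00",
    ]),
    "gas": [70, 63, 44, 90, 89, 20],
    "odometer": [152042, 152112, 152544, 1242, 1270, 84752],
    "temperature": [87, 88, 93, 85, 85, 74],
})


def add_lagged_columns(frame, hours_ago):
    """Hypothetical helper: append <col>_<h>h_ago columns via a self-merge."""
    keys = ["car_id", "timestamp"]
    value_cols = [c for c in frame.columns if c not in keys]
    result = frame
    for h in hours_ago:
        # Move every reading h hours forward so it lines up with the row
        # that should see it as "h hours ago".
        lagged = frame[keys + value_cols].copy()
        lagged["timestamp"] = lagged["timestamp"] + pd.Timedelta(hours=h)
        lagged = lagged.rename(columns={c: f"{c}_{h}h_ago" for c in value_cols})
        # A left join keeps every original row, in its original order.
        result = result.merge(lagged, on=keys, how="left")
    return result


new_df = (
    add_lagged_columns(df, [1, 2])
    .set_index(["car_id", "timestamp"])
    .sort_index(axis=1)
    .reset_index()
)
print(new_df)
```

Unlike the resample version, this matches timestamps exactly rather than summing readings into hourly bins, so readings that do not fall exactly x hours apart are left as NaN. It also does not require timestamps to be unique across cars, because the merge key includes `car_id`.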

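To get 0 instead of NaN for hours without a reading, remove `min_count=1` from the `sum` call. Rows with no earlier data for that car still come out as NaN.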
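The effect of `min_count` can be seen on a tiny series with a one-hour gap. This is an illustrative sketch with made-up variable names (`s`, `with_nan`, `with_zero`), using two gas readings two hours apart.

```python
import pandas as pd

# Two readings with an empty hour (15:00) between them.
s = pd.Series(
    [70, 44],
    index=pd.to_datetime(["2019-10-05 14:00:00", "2019-10-05 16:00:00"]),
)

# min_count=1: a bin needs at least one value, so the empty hour is NaN.
with_nan = s.resample("h").sum(min_count=1)

# Default min_count=0: the sum of an empty bin is 0.
with_zero = s.resample("h").sum()

print(with_nan)
print(with_zero)
```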
