如何在时间不是时间戳的情况下应用滚动函数并按组应用

2024-10-03 23:26:07 发布

您现在位置:Python中文网/ 问答频道 /正文

我想对一个数据帧应用滚动函数,这个数据帧不是由日期戳索引的,而是由实验开始后的相对星期索引的。 例如,这是我的起始数据帧:

df = pd.DataFrame([{'RelWeek': 3, 'id': 33, 'Val_1': 20,'Val_2':0.6,'Val_3':40,'TimeofDay':'MORN'}
,{'RelWeek': 3, 'id': 33, 'Val_1': 40,'Val_2':0.9,'Val_3':27,'TimeofDay':'NOON'}
,{'RelWeek': 3, 'id': 33, 'Val_1': 24,'Val_2':0.65,'Val_3':44,'TimeofDay':'EVE'}
,{'RelWeek': 4, 'id': 33, 'Val_1': 25,'Val_2':0.34,'Val_3':55,'TimeofDay':'MORN'}
,{'RelWeek': 4, 'id': 33, 'Val_1': 21,'Val_2':0.61,'Val_3':34,'TimeofDay':'NOON'}
,{'RelWeek': 4, 'id': 33, 'Val_1': 25,'Val_2':0.67,'Val_3':21,'TimeofDay':'EVE'}
,{'RelWeek': 5, 'id': 33, 'Val_1': 20,'Val_2':0.6,'Val_3':40,'TimeofDay':'MORN'}
,{'RelWeek': 5, 'id': 33, 'Val_1': np.nan,'Val_2':np.nan,'Val_3': np.nan,'TimeofDay':'NOON'}
,{'RelWeek': 5, 'id': 33, 'Val_1': 25,'Val_2':0.68,'Val_3':48,'TimeofDay':'EVE'}
,{'RelWeek': 3, 'id': 33, 'Val_1': 20,'Val_2':0.6,'Val_3':40,'TimeofDay':'MORN'}
,{'RelWeek': 3, 'id': 33, 'Val_1': 40,'Val_2':0.9,'Val_3':27,'TimeofDay':'NOON'}
,{'RelWeek': 2, 'id': 34, 'Val_1': 24,'Val_2':0.65,'Val_3':44,'TimeofDay':'EVE'}
,{'RelWeek': 2, 'id': 34, 'Val_1': 25,'Val_2':0.34,'Val_3':55,'TimeofDay':'MORN'}
,{'RelWeek': 2, 'id': 34, 'Val_1': 21,'Val_2':0.61,'Val_3':34,'TimeofDay':'NOON'}
,{'RelWeek': 3, 'id': 34, 'Val_1': 25,'Val_2':0.67,'Val_3':21,'TimeofDay':'EVE'}
,{'RelWeek': 3, 'id': 34, 'Val_1': 20,'Val_2':0.6,'Val_3':40,'TimeofDay':'MORN'}
,{'RelWeek': 3, 'id': 34, 'Val_1': 25,'Val_2':0.68,'Val_3':48,'TimeofDay':'EVE'}])

df

    RelWeek TimeofDay  Val_1  Val_2  Val_3  id
0         3      MORN   20.0   0.60   40.0  33
1         3      NOON   40.0   0.90   27.0  33
2         3       EVE   24.0   0.65   44.0  33
3         4      MORN   25.0   0.34   55.0  33
4         4      NOON   21.0   0.61   34.0  33
5         4       EVE   25.0   0.67   21.0  33
6         5      MORN   20.0   0.60   40.0  33
7         5      NOON    NaN    NaN    NaN  33
8         5       EVE   25.0   0.68   48.0  33
9         3      MORN   20.0   0.60   40.0  33
10        3      NOON   40.0   0.90   27.0  33
11        2       EVE   24.0   0.65   44.0  34
12        2      MORN   25.0   0.34   55.0  34
13        2      NOON   21.0   0.61   34.0  34
14        3       EVE   25.0   0.67   21.0  34
15        3      MORN   20.0   0.60   40.0  34
16        3       EVE   25.0   0.68   48.0  34


I want to do 3 things now

  1. groupby('id' ,Timeof Day')
  2. Take the max and min 'RelWeek for each id and iterate over RelWeek. It is important to note that relative week is different for each id . Implement a rolling function for Window= 3 or ( a flexible number) over Range of min to max Iter value
  3. the output of the rolling function should appear as - "Original column name_TimeofDay_function'(By function I mean the aggregate function applied to rolling window, for e.g Min, Max, Stdev, Mean etc )

输出:(下面是理想输出的外观就像。那个数字现在只是虚数,并不反映实际的平均值或标准偏差等。列的名称真正反映了列的内容应该是什么。任何替代的输出方式,只要它给我的信息,上述要求也工作

Output_df

      Val_1_EVE_Mean  Val_1_EVE_Stdev  Val_1_MORN_Mean  Val_1_MORN_Stdev  \
0            24.6             24.6             21.6              21.6   
1            24.6             24.6             21.6              21.6   
2            24.6             24.6             21.6              21.6   
3            24.6             24.6             21.6              21.6   
4            24.6             24.6             21.6              21.6   
5            24.6             24.6             21.6              21.6   

      Val_1_NOON_Mean  Val_1_NOON_Stdev  Val_2_EVE_Mean  Val_2_EVE_Stdev  \
0             33.6              33.6            24.6             24.6   
1             33.6              33.6            24.6             24.6   
2             33.6              33.6            24.6             24.6   
3             33.6              33.6            24.6             24.6   
4             33.6              33.6            24.6             24.6   
5             33.6              33.6            24.6             24.6   

   Val_2_MORN_Mean  Val_2_MORN_Stdev  Val_2_NOON_Mean  Val_2_NOON_Stdev  \
0             0.51              0.51             33.6              33.6   
1             0.51              0.51             33.6              33.6   
2             0.51              0.51             33.6              33.6   
3             0.51              0.51             33.6              33.6   
4             0.51              0.51             33.6              33.6   
5             0.51              0.51             33.6              33.6   

      Val_3_EVE_Mean  Val_3_EVE_Stdev  Val_3_MORN_Mean  Val_3_MORN_Stdev  \
0            24.6             24.6             21.6              21.6   
1            24.6             24.6             21.6              21.6   
2            24.6             24.6             21.6              21.6   
3            24.6             24.6             21.6              21.6   
4            24.6             24.6             21.6              21.6   
5            24.6             24.6             21.6              21.6   

   Val_3_NOON_Mean  Val_3_NOON_Stdev     Window  id  
0             33.6              33.6  [3, 4, 5]  33  
1             33.6              33.6  [4, 5, 6]  33  
2             33.6              33.6  [5, 6, 7]  33  
3             33.6              33.6  [2, 3, 4]  34  
4             33.6              33.6  [3, 4, 5]  34  
5             33.6              33.6  [4, 5, 6]  34  

提前感谢专家们能给我一些方向


Tags: theto数据iddfforfunctionval