Pandas: fill NaN based on a condition in other columns

Posted 2024-06-18 14:14:31


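I want to fill the None values in the activity_station column with station values taken from other rows. The data looks like this; I created a few extra columns (shift, code, day) to make the conditions easier to write.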

   Shift_id  activity_name  activity_id  activity_begin_time  activity_end_time  activity_station  shift  code  day
0  123       start          D01-MCK-DI   09:00                09:05              None              D01    MCK   DI
1  123       work           D01-MCK-DI   09:05                12:00              Za                D01    MCK   DI
2  123       drive          D01-MCK-DI   12:00                12:30              Ro                D01    MCK   DI
3  184       start          D01-MV-DI    09:00                09:05              None              D01    MV    DI
4  184       work           D01-MV-DI    09:05                12:00              Ca                D01    MV    DI
5  184       drive          D01-MV-DI    12:00                12:30              None              D01    MV    DI

Code to load the data, if needed:

    import pandas as pd

    # Note: missing stations are stored here as the string 'None'
    df = pd.DataFrame({
        'Shift_id': [123, 123, 123, 184, 184, 184],
        'activity_name': ['start', 'work', 'drive', 'start', 'work', 'drive'],
        'activity_id': ['D01-MCK-DI', 'D01-MCK-DI', 'D01-MCK-DI', 'D01-MV-DI', 'D01-MV-DI', 'D01-MV-DI'],
        'activity_begin_time': ['09:00', '09:05', '12:00', '09:00', '09:05', '12:00'],
        'activity_end_time': ['09:05', '12:00', '12:30', '09:05', '12:00', '12:30'],
        'activity_station': ['None', 'Za', 'Ro', 'None', 'Ca', 'None']})

    # Split activity_id into its shift, code and day parts
    df[['shift', 'code', 'day']] = df['activity_id'].str.split(pat="-", expand=True)
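One detail worth checking before any filling: the sample data stores the missing stations as the string 'None', which pandas does not count as missing. The snippet below is a minimal sketch of that check on a standalone Series; the names `stations` and `stations_clean` are made up for illustration and are not part of the original code.

```python
import numpy as np
import pandas as pd

# Same placeholder convention as the sample data: the *string* 'None'.
stations = pd.Series(['None', 'Za', 'Ro', 'None', 'Ca', 'None'],
                     name='activity_station')

# The string 'None' is an ordinary value, so isna() does not flag it.
n_missing_before = int(stations.isna().sum())

# Convert the placeholder into a real missing value (NaN).
stations_clean = stations.replace('None', np.nan)
n_missing_after = int(stations_clean.isna().sum())

print(n_missing_before, n_missing_after)
```

Methods such as ffill() or fillna() only act on real missing values, so they would leave the 'None' strings untouched unless this conversion is done first.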

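If an MV row has a None value in the activity_station column, I want to find the MCK row with the same shift and day and assign its activity_station value to the MV row.

I tried a few if-else return statements, but could not get them to work.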

The result should look like this:

   Shift_id  activity_name  activity_id  activity_begin_time  activity_end_time  activity_station  shift  code  day
0  123       start          D01-MCK-DI   09:00                09:05              None              D01    MCK   DI
1  123       work           D01-MCK-DI   09:05                12:00              Za                D01    MCK   DI
2  123       drive          D01-MCK-DI   12:00                12:30              Ro                D01    MCK   DI
3  184       start          D01-MV-DI    09:00                09:05              None              D01    MV    DI
4  184       work           D01-MV-DI    09:05                12:00              Ca                D01    MV    DI
5  184       drive          D01-MV-DI    12:00                12:30              Ro                D01    MV    DI

1 answer
Anonymous user
#1 · Posted 2024-06-18 14:14:31

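IIUC, you need one more grouping column to get the desired output. The grouping you describe is shift and day, but that still yields only a single group, so I assume you also intend to group by activity_name. If that is the case, you can use transform() after replacing the None values in the DataFrame with np.nan (i.e. NaN):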

    df['activity_station'] = df.groupby(['shift','day','activity_name'])['activity_station'].transform(lambda x: x.ffill())

This produces the desired output:

   Shift_id activity_name activity_id activity_begin_time activity_end_time  \
0       123         start  D01-MCK-DI               09:00             09:05   
1       123          work  D01-MCK-DI               09:05             12:00   
2       123         drive  D01-MCK-DI               12:00             12:30   
3       184         start   D01-MV-DI               09:00             09:05   
4       184          work   D01-MV-DI               09:05             12:00   
5       184         drive   D01-MV-DI               12:00             12:30   

  activity_station shift code day  
0              NaN   D01  MCK  DI  
1               Za   D01  MCK  DI  
2               Ro   D01  MCK  DI  
3              NaN   D01   MV  DI  
4               Ca   D01   MV  DI  
5               Ro   D01   MV  DI  
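
The forward fill relies on each MCK row appearing before its matching MV row within a group. As a hedged alternative, the sketch below applies the rule from the question more literally: it looks up the MCK station for each (shift, day, activity_name) and copies it only into MV rows that are missing a station. Matching on activity_name is the same assumption made above, and the names `mck`, `mck_station`, `needs_fill` and `result` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'activity_name': ['start', 'work', 'drive', 'start', 'work', 'drive'],
    'activity_id': ['D01-MCK-DI'] * 3 + ['D01-MV-DI'] * 3,
    'activity_station': ['None', 'Za', 'Ro', 'None', 'Ca', 'None'],
})
df[['shift', 'code', 'day']] = df['activity_id'].str.split('-', expand=True)

# Turn the 'None' strings into real missing values first.
df['activity_station'] = df['activity_station'].replace('None', np.nan)

keys = ['shift', 'day', 'activity_name']

# Lookup table: one MCK station per (shift, day, activity_name).
mck = (df.loc[df['code'] == 'MCK', keys + ['activity_station']]
         .drop_duplicates(keys)
         .rename(columns={'activity_station': 'mck_station'}))

# A left merge keeps every row of df in its original order.
merged = df.merge(mck, on=keys, how='left')

# Fill only MV rows whose own station is missing.
needs_fill = (merged['code'] == 'MV') & merged['activity_station'].isna()
merged.loc[needs_fill, 'activity_station'] = merged.loc[needs_fill, 'mck_station']

result = merged.drop(columns='mck_station')
print(result[['activity_name', 'code', 'activity_station']])
```

On the sample data this fills the MV drive row with Ro and leaves both start rows as NaN, because the MCK start row has no station to copy. Unlike the forward fill, it does not depend on row order and never writes into MCK rows.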
