Updating rows that share the same index

                             yellowCard  secondYellow  redCard
match_id          player_id                                   
1431183600x96x30  76921               X           NaN      NaN
                  76921             NaN             X        X
1431192600x162x32 71174               X           NaN      NaN

                             yellowCard  secondYellow  redCard
match_id          player_id                                   
1431183600x96x30  76921               X             X        X
1431192600x162x32 71174               X           NaN      NaN

2 Answers

User

Answer 1 · edited 2024-09-28 01:32:43

It looks like your df is multi-indexed on match_id and player_id, so I would perform a groupby on match_id and fill the NaN values twice, once with ffill and once with bfill:

In [184]:
df.groupby(level=0).fillna(method='ffill').groupby(level=0).fillna(method='bfill')

Out[184]:
                             yellowCard  secondYellow  redCard
match_id          player_id                                   
1431183600x96x30  76921               1             2        2
                  76921               1             2        2
1431192600x162x32 71174               3           NaN      NaN

I used the following code to build the frame above, with numbers standing in for the X values:

In [185]:
import io
import pandas as pd

# whitespace-separated sample; the first two columns become the MultiIndex
t="""match_id player_id yellowCard secondYellow redCard
1431183600x96x30  76921              1          NaN     NaN
1431183600x96x30  76921            NaN           2       2
1431192600x162x32 71174              3          NaN     NaN"""
df=pd.read_csv(io.StringIO(t), sep=r'\s+', index_col=[0,1])
df

Out[185]:
                             yellowCard  secondYellow  redCard
match_id          player_id                                   
1431183600x96x30  76921               1           NaN      NaN
                  76921             NaN             2        2
1431192600x162x32 71174               3           NaN      NaN

Edit: groupby objects have ffill and bfill methods, so this simplifies to:

In [189]:
df.groupby(level=0).ffill().groupby(level=0).bfill()

Out[189]:
                             yellowCard  secondYellow  redCard
match_id          player_id                                   
1431183600x96x30  76921               1             2        2
                  76921               1             2        2
1431192600x162x32 71174               3           NaN      NaN

You can then call drop_duplicates:

In [190]:
df.groupby(level=0).ffill().groupby(level=0).bfill().drop_duplicates()

Out[190]:
                             yellowCard  secondYellow  redCard
match_id          player_id                                   
1431183600x96x30  76921               1             2        2
1431192600x162x32 71174               3           NaN      NaN
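
The steps above can be put together as one script. This is only a minimal sketch, assuming a recent pandas and the same numeric stand-ins for the X values. Two things differ from the answer and are assumptions of this sketch: it groups on both index levels, so values cannot leak between different players in the same match, and it drops repeats by index key rather than with drop_duplicates, because drop_duplicates compares column values only and would also collapse rows from different matches that happen to hold identical values. The names raw, keys, filled and result are illustrative.

```python
import io

import pandas as pd

raw = """match_id player_id yellowCard secondYellow redCard
1431183600x96x30  76921              1          NaN     NaN
1431183600x96x30  76921            NaN           2       2
1431192600x162x32 71174              3          NaN     NaN"""

# The first two columns become the (match_id, player_id) MultiIndex.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", index_col=[0, 1])

# Assumption of this sketch: group on both levels, not only match_id.
keys = ["match_id", "player_id"]

# Fill forward, then backward, inside each (match_id, player_id) group.
filled = df.groupby(level=keys).ffill().groupby(level=keys).bfill()

# Keep the first row for every index key; later repeats are dropped.
result = filled[~filled.index.duplicated(keep="first")]
print(result)
```

After the two fills every row in a group carries the same values, so keeping the first row per key loses nothing.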

User

Answer 2 · edited 2024-09-28 01:32:43

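If you do

df.groupby([df.match_id, df.player_id]).min()

then min ignores NaNs by default (this form assumes match_id and player_id are ordinary columns). For a dataframe of the form in the example, where every inconsistency is between a NaN and a filled value, this will do the job.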
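
When match_id and player_id are index levels, as in the frame built in the first answer, the same idea can be written with level-based grouping. A minimal sketch, assuming numeric stand-ins for the X values; the names index and collapsed are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("1431183600x96x30", 76921),
        ("1431183600x96x30", 76921),
        ("1431192600x162x32", 71174),
    ],
    names=["match_id", "player_id"],
)
df = pd.DataFrame(
    {
        "yellowCard": [1, np.nan, 3],
        "secondYellow": [np.nan, 2, np.nan],
        "redCard": [np.nan, 2, np.nan],
    },
    index=index,
)

# min() skips NaN inside each group, so one row per key survives and
# a column stays NaN only when every row in the group is NaN there.
collapsed = df.groupby(level=["match_id", "player_id"]).min()
print(collapsed)
```

This only gives the intended result when the non-NaN values within a group agree; if two rows held different numbers, min would silently pick the smaller one.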

Edit

I assumed the X values are placeholders for floats. For strings, use a combination of ffill and bfill, as in EdChum's answer (which should be accepted).
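
For the string case, a small sketch with literal "X" placeholders may help. It assumes the X values are plain strings stored in object columns. The first part follows the ffill and bfill route; the second part uses GroupBy.first, which returns the first non-null entry per column and is an alternative not mentioned in either answer. All variable names are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [
        ("1431183600x96x30", 76921),
        ("1431183600x96x30", 76921),
        ("1431192600x162x32", 71174),
    ],
    names=["match_id", "player_id"],
)
df = pd.DataFrame(
    {
        "yellowCard": ["X", np.nan, "X"],
        "secondYellow": [np.nan, "X", np.nan],
        "redCard": [np.nan, "X", np.nan],
    },
    index=index,
)
keys = ["match_id", "player_id"]

# Route 1: fill both ways inside each group, then keep one row per key.
filled = df.groupby(level=keys).ffill().groupby(level=keys).bfill()
deduped = filled[~filled.index.duplicated(keep="first")]

# Route 2 (alternative): first non-null entry per column and group.
first_non_null = df.groupby(level=keys).first()

print(deduped)
print(first_non_null)
```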
