Concatenating the strings of one column across multiple index rows while keeping the other columns

>>> import pandas
>>> df1 = pandas.DataFrame({
...     "Name": ["Alice", "Marie", "Smith", "Mallory", "Bob", "Doe"],
...     "City": ["Seattle", None, None, "Portland", None, None],
...     "Age": [24, None, None, 26, None, None],
...     "Group": [1, 1, 1, 2, 2, 2]})
>>> df1
    Age      City  Group     Name
0  24.0   Seattle      1    Alice
1   NaN      None      1    Marie
2   NaN      None      1    Smith
3  26.0  Portland      2  Mallory
4   NaN      None      2      Bob
5   NaN      None      2      Doe

2 Answers

User

#1 · edited 2024-06-23 19:03:50

Try this:

In [29]: df1.groupby('Group').ffill().groupby(['Group','Age','City']).Name.apply(' '.join)
Out[29]:
Group  Age   City
1      24.0  Seattle     Alice Marie Smith
2      26.0  Portland      Mallory Bob Doe
Name: Name, dtype: object
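
On recent pandas versions, groupby('Group').ffill() may not return the grouping column, in which case the second groupby cannot find 'Group'. A minimal sketch of the same idea that avoids this, assuming it is acceptable to forward-fill only Age and City within each group (the name filled is just an illustrative variable, not part of the original answer):

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alice", "Marie", "Smith", "Mallory", "Bob", "Doe"],
    "City": ["Seattle", None, None, "Portland", None, None],
    "Age": [24, None, None, 26, None, None],
    "Group": [1, 1, 1, 2, 2, 2],
})

# Forward-fill Age and City inside each group; the Group column itself
# stays untouched because only the two selected columns are replaced.
filled = df1.copy()
filled[["Age", "City"]] = df1.groupby("Group")[["Age", "City"]].ffill()

# Group on all three key columns and join the names of each group.
result = filled.groupby(["Group", "Age", "City"])["Name"].apply(" ".join)
print(result)
```

The result should be a Series indexed by (Group, Age, City), like the output above.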

User

#2 · edited 2024-06-23 19:03:50

Use dropna and assign together with groupby

See the pandas docs for assign.

df1.dropna(subset=['Age', 'City']) \
   .assign(Name=df1.groupby('Group').Name.apply(' '.join).values)
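
The snippet above attaches the joined names by position, so it relies on the rows kept by dropna appearing in the same order as the groups. A hedged variant that aligns on the Group key instead, assuming each group has exactly one row with Age and City filled in (joined is an illustrative name, not from the original answer):

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alice", "Marie", "Smith", "Mallory", "Bob", "Doe"],
    "City": ["Seattle", None, None, "Portland", None, None],
    "Age": [24, None, None, 26, None, None],
    "Group": [1, 1, 1, 2, 2, 2],
})

# One joined string per group, indexed by the Group value.
joined = df1.groupby("Group")["Name"].apply(" ".join)

# Keep only the rows that carry Age and City, then look up the joined
# names by each row's Group value rather than by row position.
result = (
    df1.dropna(subset=["Age", "City"])
       .assign(Name=lambda d: d["Group"].map(joined))
)
print(result)
```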

Update
Use groupby and agg
I thought of this afterwards, and it feels much more satisfying

df1.groupby('Group').agg(dict(Age='first', City='first', Name=' '.join))

To get the exact output

# reindex_axis was removed in pandas 1.0; reindex(columns=...) restores the original column order
df1.groupby('Group').agg(dict(Age='first', City='first', Name=' '.join)) \
   .reset_index().reindex(columns=df1.columns)
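
For reference, a self-contained sketch of the same groupby and agg approach, written with named aggregation (assuming pandas 0.25 or later; this spelling is a substitute for the dict form above, not what the answer originally used):

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["Alice", "Marie", "Smith", "Mallory", "Bob", "Doe"],
    "City": ["Seattle", None, None, "Portland", None, None],
    "Age": [24, None, None, 26, None, None],
    "Group": [1, 1, 1, 2, 2, 2],
})

# 'first' takes the first non-null value in each group, so the single
# filled-in Age and City survive; the names are joined with spaces.
result = df1.groupby("Group", as_index=False).agg(
    Age=("Age", "first"),
    City=("City", "first"),
    Name=("Name", " ".join),
)

# Put the columns back in the order of the original frame.
result = result[list(df1.columns)]
print(result)
```

Passing as_index=False keeps Group as an ordinary column, so no reset_index is needed.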
