How to create a DataFrame directly from groupby

operatiejaar  spoedelectief
2017          E                5459
              S                1054
2018          E                6191
              S                1029
2019          E                6160
              S                1159

test = df_new.groupby(df_new['operatiejaar'])['spoedelectief'].value_counts().sort_index()
jaar_list = []
spel_list = []
totaal = []
for index, value in test.items():
    jaar_list.append(index[0])
    spel_list.append(index[1])
    totaal.append(value)
spel_jaar = pd.DataFrame(
    {'jaar': jaar_list,
     'spoedelectief': spel_list,
     'totaal': totaal
    })

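2 Answers

User

Answer 1 · edited 2024-09-28 05:25:48

You need to rename the Series before calling reset_index. Otherwise, at least in older pandas versions, the counts keep the name spoedelectief, which collides with the index level of the same name: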

test = (df_new.groupby(df_new['operatiejaar'])['spoedelectief']
              .value_counts()
              .rename('count')
              .sort_index()
              .reset_index())

Or use the name parameter of reset_index:

test = (df_new.groupby(df_new['operatiejaar'])['spoedelectief']
              .value_counts()
              .sort_index()
              .reset_index(name='count'))
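
To see that both variants produce the same frame, here is a minimal sketch. The contents of df_new are hypothetical stand-in values, not the asker's data, and the names counts, via_rename and via_name are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df_new; the values are invented for illustration.
df_new = pd.DataFrame({
    "operatiejaar": [2017, 2017, 2017, 2018, 2018, 2019],
    "spoedelectief": ["E", "E", "S", "E", "S", "E"],
})

counts = (df_new.groupby("operatiejaar")["spoedelectief"]
                .value_counts()
                .sort_index())

# Variant 1: rename the Series, then turn the index levels into columns.
via_rename = counts.rename("count").reset_index()

# Variant 2: let reset_index name the value column directly.
via_name = counts.reset_index(name="count")

print(via_name)
print(via_rename.equals(via_name))
```

Either way the result is a flat DataFrame with the columns operatiejaar, spoedelectief and count, so the manual loop from the question is not needed.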

User

Answer 2 · edited 2024-09-28 05:25:48

Two more options to consider:

Use to_frame to name the count column:

test = (
    df_new.groupby('operatiejaar')['spoedelectief']
    .value_counts().to_frame('totaal').reset_index()
)

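Alternatively, skip naming the Series and unstack the result into several columns, one for each value found by value_counts (here E and S), which is convenient for printing and plotting: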
```
# 'E' and 'S' counts become two columns
test2 = (
    df_new.groupby('operatiejaar')['spoedelectief']
    .value_counts().unstack()
)
test2.plot.bar()
```
Example (on small randomly generated data):
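
A minimal sketch of what such a run could look like. The seed, the row count and the E/S probabilities are assumptions made up for illustration, and long_form and wide_form are hypothetical names. Passing fill_value=0 to unstack is an extra precaution in case a year happens to contain only one of the two values.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 200                         # arbitrary small sample size

# Randomly generated stand-in for df_new (not the asker's data).
df_new = pd.DataFrame({
    "operatiejaar": rng.choice([2017, 2018, 2019], size=n),
    "spoedelectief": rng.choice(["E", "S"], size=n, p=[0.85, 0.15]),
})

grouped = df_new.groupby("operatiejaar")["spoedelectief"].value_counts()

# Option 1: long format, one row per (year, value) pair.
long_form = grouped.to_frame("totaal").reset_index()

# Option 2: wide format, one row per year and one column per value.
wide_form = grouped.unstack(fill_value=0)

print(long_form)
print(wide_form)
```

The long format matches the frame built by the loop in the question, while the wide format is the shape that plot.bar() expects when each year should get one bar per value.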

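Notes:

You can drop df_new[column_name] as the groupby argument and pass just column_name.
You usually do not need sort_index() (at least in recent pandas versions): by default groupby() sorts the group keys and value_counts() sorts by count, in descending order, within each group. Keep sort_index() only if you want the values inside each group ordered by label rather than by frequency.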
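
Both notes can be checked with a short sketch. The data below is hypothetical and deliberately chosen so that S is more frequent than E in 2018; by_name and by_series are made-up names.

```python
import pandas as pd

# Hypothetical data: rows are unordered, and S outnumbers E in 2018.
df_new = pd.DataFrame({
    "operatiejaar": [2018, 2018, 2018, 2017, 2017, 2017],
    "spoedelectief": ["S", "S", "E", "E", "E", "S"],
})

# Passing the column name or the column itself gives the same result.
by_name = df_new.groupby("operatiejaar")["spoedelectief"].value_counts()
by_series = df_new.groupby(df_new["operatiejaar"])["spoedelectief"].value_counts()
print(by_name.equals(by_series))

# Default order: years ascending, values by descending count within a year.
print(by_name.index.tolist())

# After sort_index(): years ascending, values in label order within a year.
print(by_name.sort_index().index.tolist())
```

In the asker's data E is more frequent than S in every year, so the default order and the sort_index() order coincide there.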
