Aggregate counts based on column identifiers

1 answer

User

#1 · Posted 2024-09-19 20:46:57

You can convert the columns to a MultiIndex, then use sum and unstack:

import pandas as pd

# Do this step if "id" is not already the index
# df = df.set_index('id')

df.columns = pd.MultiIndex.from_tuples(
    (f'att{a}', f'brand{b}') for _, a, b in df.columns.str.split('_'))

df.sum().unstack()

      brand1  brand2
att1     4.0     2.0
att2     4.0     5.0
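For readers who want to run the approach end to end, here is a minimal sketch. The original question's input is not shown in this answer, so the column names (`q_1_1` and so on, in the form prefix_att_brand) are an assumption; the values are chosen to match the tables printed in this answer.

```python
import numpy as np
import pandas as pd

# Hypothetical input: the "<prefix>_<att>_<brand>" column names are an
# assumption; the values mirror the intermediate table shown in this answer.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5],
        "q_1_1": [1, np.nan, 1, 1, 1],
        "q_1_2": [1, 1, np.nan, np.nan, np.nan],
        "q_2_1": [1, 1, np.nan, 1, 1],
        "q_2_2": [1, 1, 1, 1, 1],
    }
).set_index("id")

# Split each name into three parts, drop the prefix, and use the remaining
# two parts as the two levels of the column MultiIndex.
df.columns = pd.MultiIndex.from_tuples(
    (f"att{a}", f"brand{b}") for _, a, b in df.columns.str.split("_")
)

# Sum each column, then pivot the inner level (brand) into columns.
result = df.sum().unstack()
print(result)
```

If your real column names have a different number of underscore-separated parts, the three-way unpacking `_, a, b` will raise an error and needs adjusting.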

Let's take a closer look:

df.columns = pd.MultiIndex.from_tuples(
    (f'att{a}', f'brand{b}') for _, a, b in df.columns.str.split('_'))

This yields:

     att1          att2
   brand1 brand2 brand1 brand2
id                            
1     1.0    1.0    1.0      1
2     NaN    1.0    1.0      1
3     1.0    NaN    NaN      1
4     1.0    NaN    1.0      1
5     1.0    NaN    1.0      1
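The renaming step can also be looked at in isolation. The sketch below uses hypothetical column names (`q_1_1` and so on, an assumption, since the original input is not shown) and builds the tuples as a list so they can be inspected before creating the MultiIndex. Naming the levels is optional and is not part of the answer's code.

```python
import pandas as pd

# Hypothetical column names in "<prefix>_<att>_<brand>" form (an assumption).
cols = pd.Index(["q_1_1", "q_1_2", "q_2_1", "q_2_2"])

# .str.split returns one list of parts per column name.
parts = cols.str.split("_")

# Discard the prefix and label the remaining parts.
tuples = [(f"att{a}", f"brand{b}") for _, a, b in parts]

# Optional: give the two levels names so later steps can refer to them.
mi = pd.MultiIndex.from_tuples(tuples, names=["att", "brand"])
print(mi)
```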

From here, we sum every column (NaN values are skipped by default):

df.sum()

att1  brand1    4.0
      brand2    2.0
att2  brand1    4.0
      brand2    5.0

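Finally, reshape the result so that it looks like the expected output:

# In an interactive session, `_` holds the previous result (here, df.sum())
_.unstack()  # same as unstack(level=-1)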

      brand1  brand2
att1     4.0     2.0
att2     4.0     5.0
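To illustrate what the `level` argument controls, the sketch below rebuilds the summed Series by hand from the `df.sum()` output shown above, so it runs on its own, and unstacks it both ways.

```python
import pandas as pd

# The sums from the df.sum() step, rebuilt by hand so this runs standalone.
sums = pd.Series(
    [4.0, 2.0, 4.0, 5.0],
    index=pd.MultiIndex.from_tuples(
        [
            ("att1", "brand1"),
            ("att1", "brand2"),
            ("att2", "brand1"),
            ("att2", "brand2"),
        ]
    ),
)

by_brand = sums.unstack()       # last level (brand) becomes the columns
by_att = sums.unstack(level=0)  # first level (att) becomes the columns

print(by_brand)
print(by_att)
```

The second form is the transpose of the first, so the choice only affects which identifier ends up as rows and which as columns.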
