Counting the occurrences of each combination of equal elements in a DataFrame

   DK1  DK2  NO1  NO2  NO3  NO4
0   10   10   12   15   15   10
1   15   10   15   10   10   10
2   15   15   15   15   15   15
3   10   10   12   15   15   10
4   10   10   10   10   15   15

1 answer

User

#1 · Posted 2024-09-27 07:34:23

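Compare every column of the DataFrame with the first column using DataFrame.eq, then use DataFrame.dot to matrix-multiply the resulting boolean mask by the column names with a separator appended, count the resulting labels with Series.value_counts, and convert the result to a DataFrame: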

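# Flag every cell that equals the DK1 value in the same row
out = (df.eq(df['DK1'], axis=0)
         # Boolean-by-string dot product: concatenates the names of the True columns
         .dot(df.columns + ',')
         # Drop the trailing comma
         .str[:-1]
         # Count how often each combination occurs
         .value_counts()
         .rename_axis('Combo')
         .reset_index(name='Occurrence'))
print(out)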
                     Combo  Occurrence
0              DK1,DK2,NO4           2
1                  DK1,NO1           1
2          DK1,DK2,NO1,NO2           1
3  DK1,DK2,NO1,NO2,NO3,NO4           1
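
To see why the dot step yields labels, it may help to look at the intermediate boolean mask. The following is a minimal sketch, assuming the sample data from the question (rebuilt here so the snippet runs on its own). The names mask, labels and counts are illustrative, and the row-wise join is an explicit stand-in for the dot trick, not the answer's own code.

```python
import pandas as pd

# Sample data from the question, rebuilt so the snippet is self-contained.
df = pd.DataFrame({
    "DK1": [10, 15, 15, 10, 10],
    "DK2": [10, 10, 15, 10, 10],
    "NO1": [12, 15, 15, 12, 10],
    "NO2": [15, 10, 15, 15, 10],
    "NO3": [15, 10, 15, 15, 15],
    "NO4": [10, 10, 15, 10, 15],
})

# True where a cell equals the DK1 value in the same row.
mask = df.eq(df["DK1"], axis=0)
print(mask)

# The dot trick relies on bool * str in Python:
# True * "DK1," gives "DK1," and False * "DK1," gives "".
# An explicit equivalent: join the names of the columns flagged True in each row.
labels = mask.apply(lambda row: ",".join(row[row].index), axis=1)
print(labels)  # row 0 becomes "DK1,DK2,NO4"

# Count how many rows share each label.
counts = labels.value_counts()
print(counts)
```

The row-wise apply is slower than the dot product on large frames, but it makes the intent of each step easier to inspect.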

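EDIT: To collapse groups, build a dictionary that maps each complete group of column names to its common prefix, then call replace: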

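# Series of column names, indexed by the name with its digits stripped (the prefix)
s = df.columns.to_series()
s.index = s.index.str.replace(r'\d+', '', regex=True)

# Map each comma-joined group of columns to its prefix
d = s.groupby(level=0).agg(','.join).to_dict()
d = {v: k for k, v in d.items()}
print(d)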
{'DK1,DK2': 'DK', 'NO1,NO2,NO3,NO4': 'NO'}

out = (df.eq(df['DK1'], axis=0)
         .dot(df.columns + ',')
         .str[:-1]
         .value_counts()
         .rename_axis('Combo')
         .reset_index(name='Occurrence'))

# Replace each complete group in a label with its prefix
out['Combo'] = out['Combo'].replace(d, regex=True)
print(out)
        Combo  Occurrence
0      DK,NO4           2
1     DK1,NO1           1
2  DK,NO1,NO2           1
3       DK,NO           1
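
The string replacement above matches a group only when its column names sit next to each other in the label, which holds here because the columns are already ordered by prefix. If that ordering cannot be assumed, the collapsing can be done on the column names themselves. Below is a rough sketch of that idea; collapse_groups is a hypothetical helper written for illustration, not part of pandas or of the answer.

```python
import re


def collapse_groups(combo, columns):
    """Replace every complete same-prefix group in `combo` with its prefix.

    `combo` is a comma-separated label such as "DK1,DK2,NO4";
    `columns` is the full list of column names.
    """
    # Map each prefix (column name with digits stripped) to its member columns.
    groups = {}
    for col in columns:
        prefix = re.sub(r"\d+", "", col)
        groups.setdefault(prefix, []).append(col)

    present = combo.split(",")
    out = []
    seen = set()
    for col in present:
        prefix = re.sub(r"\d+", "", col)
        if all(member in present for member in groups[prefix]):
            # Whole group is present: emit the prefix once.
            if prefix not in seen:
                out.append(prefix)
                seen.add(prefix)
        else:
            # Incomplete group: keep the individual column name.
            out.append(col)
    return ",".join(out)


columns = ["DK1", "DK2", "NO1", "NO2", "NO3", "NO4"]
for combo in ["DK1,DK2,NO4", "DK1,NO1", "DK1,DK2,NO1,NO2", "DK1,DK2,NO1,NO2,NO3,NO4"]:
    print(combo, "->", collapse_groups(combo, columns))
```

Applied to the Combo column, this could be used as out['Combo'].map(lambda c: collapse_groups(c, df.columns)), assuming df still refers to the original data and out to the counted result.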
