For example, applying a custom function to data grouped by a string-valued key column

import numpy as np
import pandas as pd

# The function
def newEntropy(x):
    A = x
    pA = A / A.sum()
    Shannon2 = -np.nansum(pA * np.log2(pA))
    return Shannon2

# Make fake data
df = pd.DataFrame(np.random.rand(20,5), columns=list('abcde'))
df['group'] = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5]
df['group2'] = [6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10]

# Works
df.groupby(['group', 'group2']).apply(newEntropy)

# Having an index column that is a string causes failure
df['group2'] = df['group2'].astype('str')
df.groupby(['group', 'group2']).apply(newEntropy)

1 answer

User

#1 · Posted 2024-06-26 00:07:43

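Be specific about which columns the function is applied to. Select the numeric columns after the groupby and before calling apply: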

df.groupby(['group', 'group2'])[list('abcde')].apply(newEntropy)
Out[191]: 
group  group2
0      6         6.057044
       7        -0.000000
1      7         4.485942
2      7         4.879091
       8         3.727744
       9        -0.000000
3      9         4.751447
4      9        -0.000000
       10        8.993928
5      10        4.191522
dtype: float64
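The answer does not say why the original call fails. A likely explanation (my assumption, not stated above) is that in pandas versions where `apply` hands the grouping columns to the function along with the data, `A.sum()` on the string column `group2` concatenates strings, and the division `A / A.sum()` then raises a `TypeError`. Selecting only the numeric columns sidesteps this. A minimal self-contained sketch follows; the name `new_entropy`, the seeded generator, and the `.to_numpy()` conversion are my own choices rather than part of the original post:

```python
import numpy as np
import pandas as pd


def new_entropy(x):
    # Normalize each column by its own sum, then sum -p*log2(p) over all cells.
    p = x / x.sum()
    return -np.nansum((p * np.log2(p)).to_numpy())


# Fake data with the same group layout as the question (seeded for repeatability).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((20, 5)), columns=list("abcde"))
df["group"] = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5]
df["group2"] = [6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10]
df["group2"] = df["group2"].astype("str")  # string-valued grouping key

# Restrict apply to the numeric data columns so the string key never reaches the function.
value_cols = list("abcde")
result = df.groupby(["group", "group2"])[value_cols].apply(new_entropy)
print(result)
```

Groups that contain a single row come out as zero (displayed as `-0.000000`), because every normalized value is 1 and `log2(1)` is 0.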
