Python 2.7: DataFrame groupby and finding the percentage distribution of values within each group

{'driverRef': {0: 'vettel', 1: 'raikkonen', 2: 'rosberg', 4: 'hamilton', 6: 'ricciardo', 7: 'alonso', 14: 'haryanto'}, 'race': {0: 'Australian Grand Prix', 1: 'Australian Grand Prix', 2: 'Australian Grand Prix', 4: 'Australian Grand Prix', 6: 'Australian Grand Prix', 7: 'Australian Grand Prix', 14: 'Australian Grand Prix'}, 'stint': {0: 1.0, 1: 1.0, 2: 1.0, 4: 1.0, 6: 1.0, 7: 1.0, 14: 1.0}, 'total diff': {0: 125147.50728499777, 1: 281292.0366694695, 2: 166278.41312954266, 4: 64044.234019635056, 6: 648383.28046950256, 7: 400675.77449897071, 14: 2846411.2560531585}, 'tyre': {0: u'Super soft', 1: u'Super soft', 2: u'Super soft', 4: u'Super soft', 6: u'Super soft', 7: u'Super soft', 14: u'Super soft'}}

1 answer

User

#1 · Posted 2024-05-19 10:24:20

If I understand your requirement correctly, this may help:

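# Total of 'total diff' within each (race, tyre, stint) group
sums = df.groupby(['race', 'tyre', 'stint'])['total diff'].sum()
# Align the group totals to the rows via the shared index, then restore the columns
df = df.set_index(['race', 'tyre', 'stint']).assign(pct=sums).reset_index()
# Each row's share of its group total (a fraction between 0 and 1)
df['pct'] = df['total diff'] / df['pct']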

#                     race        tyre  stint  driverRef    total diff       pct
# 0  Australian Grand Prix  Super soft    1.0     vettel  1.251475e+05  0.027613
# 1  Australian Grand Prix  Super soft    1.0  raikkonen  2.812920e+05  0.062065
# 2  Australian Grand Prix  Super soft    1.0    rosberg  1.662784e+05  0.036688
# 3  Australian Grand Prix  Super soft    1.0   hamilton  6.404423e+04  0.014131
# 4  Australian Grand Prix  Super soft    1.0  ricciardo  6.483833e+05  0.143060
# 5  Australian Grand Prix  Super soft    1.0     alonso  4.006758e+05  0.088406
# 6  Australian Grand Prix  Super soft    1.0   haryanto  2.846411e+06  0.628037
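
As a possible alternative, the same fractions can be computed without the set_index/assign round trip by using `groupby(...).transform('sum')`, which returns the group total aligned to every original row. This is only a sketch, not the answer's original method; the DataFrame below is rebuilt from the sample data in the question, and the name `group_totals` is introduced here purely for illustration.

```python
import pandas as pd

# Sample data copied from the question (one race, one tyre, one stint).
df = pd.DataFrame({
    'driverRef': ['vettel', 'raikkonen', 'rosberg', 'hamilton',
                  'ricciardo', 'alonso', 'haryanto'],
    'race': ['Australian Grand Prix'] * 7,
    'tyre': ['Super soft'] * 7,
    'stint': [1.0] * 7,
    'total diff': [125147.50728499777, 281292.0366694695,
                   166278.41312954266, 64044.234019635056,
                   648383.28046950256, 400675.77449897071,
                   2846411.2560531585],
})

# transform('sum') broadcasts each group's total back onto its rows,
# so the result has the same length and index as df.
group_totals = df.groupby(['race', 'tyre', 'stint'])['total diff'].transform('sum')

# Fraction of the group total contributed by each row.
df['pct'] = df['total diff'] / group_totals
```

Multiplying `df['pct']` by 100 would give percentages rather than fractions, if that is the form you need.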
