Trying to create a DataFrame using the same values that appear in multiple columns

import pandas as pd
import numpy as np

df = pd.DataFrame([['Jazz', 'Clippers', 89, 100],
                   ['Clippers', 'Jazz', 101, 97],
                   ['Bucks', 'Jazz', 99, 112],
                   ['Jazz', 'Bucks', 109, 88]],
                  columns=['home_team', 'away_team', 'home_points', 'away_points'])
print(df)

  home_team away_team  home_points  away_points
0      Jazz  Clippers           89          100
1  Clippers      Jazz          101           97
2     Bucks      Jazz           99          112
3      Jazz     Bucks          109           88

Desired output:

  home_team away_team  home_points  away_points  home_cumulative_points  away_cumulative_points
0      Jazz  Clippers           89          100                      89                     100
1  Clippers      Jazz          101           97                     101                      97
2     Bucks      Jazz           99          112                      99                     209
3      Jazz     Bucks          109           88                     198                      88
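Reading the table, the running totals appear to be kept per team and per side: Jazz's home total in row 3 is 89 + 109 = 198, and its away total in row 2 is 97 + 112 = 209. That reading is an assumption drawn from the numbers, not something stated in the question. A minimal plain-Python sketch of the arithmetic, with the hypothetical names `games`, `totals`, and `rows`:

```python
from collections import defaultdict

# Same four games as in the DataFrame above.
games = [
    ('Jazz', 'Clippers', 89, 100),
    ('Clippers', 'Jazz', 101, 97),
    ('Bucks', 'Jazz', 99, 112),
    ('Jazz', 'Bucks', 109, 88),
]

# One running total per (team, side); side is 'home' or 'away'.
totals = defaultdict(int)
rows = []
for home_team, away_team, home_pts, away_pts in games:
    totals[(home_team, 'home')] += home_pts
    totals[(away_team, 'away')] += away_pts
    rows.append((totals[(home_team, 'home')], totals[(away_team, 'away')]))

print(rows)
# [(89, 100), (101, 97), (99, 209), (198, 88)]
```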

1 answer

User

#1 · Posted on 2024-09-28 22:21:29

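The idea is to select only the necessary columns, split the column names on _ to get a MultiIndex, and reshape with DataFrame.stack, so that cumsum can be applied to both points columns at once: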

cols = ['home_team', 'away_team', 'home_points', 'away_points']

df1 = df[cols].copy()
# 'home_team' -> ('home', 'team'), 'away_points' -> ('away', 'points'), ...
df1.columns = df1.columns.str.split('_', expand=True)
# after stacking, lev1 is the original row label and lev2 is home/away
df1 = df1.stack(0).rename_axis(['lev1','lev2'])
# running total per team, kept separate for home and away games
df1["cumulative_points"] = df1.groupby(["team", 'lev2'])["points"].cumsum()

df2 = df1.unstack()
df2.columns = df2.columns.map(lambda x: f'{x[1]}_{x[0]}')
print(df2)
      away_points  home_points away_team home_team  away_cumulative_points  \
lev1                                                                         
0             100           89  Clippers      Jazz                     100   
1              97          101      Jazz  Clippers                      97   
2             112           99      Jazz     Bucks                     209   
3              88          109     Bucks      Jazz                      88   

      home_cumulative_points  
lev1                          
0                         89  
1                        101  
2                         99  
3                        198

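Or assign the cumulative values back to the original DataFrame:

# lev2 holds home/away here, so take a cross-section on that level
df["home_cumulative_points"] = df1.xs('home', level='lev2')['cumulative_points']
df["away_cumulative_points"] = df1.xs('away', level='lev2')['cumulative_points']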
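To make the reshaping step easier to follow, here is a minimal sketch of the intermediate long layout produced by splitting the column names and stacking. The names `wide`, `long`, and `jazz` are illustrative only and are not part of the answer. The point is that each game contributes two rows, with the original row label as the first index level and home/away as the second.

```python
import pandas as pd

df = pd.DataFrame(
    [['Jazz', 'Clippers', 89, 100],
     ['Clippers', 'Jazz', 101, 97],
     ['Bucks', 'Jazz', 99, 112],
     ['Jazz', 'Bucks', 109, 88]],
    columns=['home_team', 'away_team', 'home_points', 'away_points'])

wide = df.copy()
# 'home_team' -> ('home', 'team'), 'away_points' -> ('away', 'points'), ...
wide.columns = wide.columns.str.split('_', expand=True)

# Move the home/away level from the columns into the row index.
long = wide.stack(0).rename_axis(['lev1', 'lev2'])

# Every appearance of one team, whether at home or away, is now a row.
jazz = long[long['team'] == 'Jazz']
print(jazz)
```

Because the side ends up in `lev2` in this layout, that is the level to group on together with `team`.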

Another approach is to reshape with concat and rename:

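# strip the 'home_' / 'away_' prefix: 'home_team' -> 'team'
f = lambda x: x.split('_')[1]
# keys=('home','away') becomes the outer index level
df1 = pd.concat([df[['home_team', 'home_points']].rename(columns=f),
                 df[['away_team', 'away_points']].rename(columns=f)], keys=('home','away'))
# here lev1 is home/away and lev2 is the original row label
df1 = df1.rename_axis(['lev1','lev2'])
df1["cumulative_points"]= df1.groupby(["team", 'lev1'])["points"].cumsum()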

df["home_cumulative_points"]= df1.loc['home', 'cumulative_points']
df["away_cumulative_points"]= df1.loc['away', 'cumulative_points']
print(df)
  home_team away_team  home_points  away_points  home_cumulative_points  \
0      Jazz  Clippers           89          100                      89   
1  Clippers      Jazz          101           97                     101   
2     Bucks      Jazz           99          112                      99   
3      Jazz     Bucks          109           88                     198   

   away_cumulative_points  
0                     100  
1                      97  
2                     209  
3                      88
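As a cross-check on the result above: since the totals are kept separately for home and away games, the same two columns can apparently be obtained without any reshaping by grouping each points column on its own team column. This is a sketch under that assumption, not the answer's method; `check` is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame(
    [['Jazz', 'Clippers', 89, 100],
     ['Clippers', 'Jazz', 101, 97],
     ['Bucks', 'Jazz', 99, 112],
     ['Jazz', 'Bucks', 109, 88]],
    columns=['home_team', 'away_team', 'home_points', 'away_points'])

check = df.copy()
# Running total of home points for each home team, in row order.
check['home_cumulative_points'] = check.groupby('home_team')['home_points'].cumsum()
# Running total of away points for each away team, in row order.
check['away_cumulative_points'] = check.groupby('away_team')['away_points'].cumsum()

print(check[['home_cumulative_points', 'away_cumulative_points']])
```

If the goal were a single total per team across both home and away games, this shortcut would not apply and a long layout like the one in the answer would be needed.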
