Summing values by year and name in Python

Published 2024-10-16 20:49:28


I have data that I want to sum by year and by player. The data looks like this:

Out[28]: 
      Year            Player                 Tm   Yds
0     1970       Larry Brown  Arizona Cardinals  1125
1     1970       Ron Johnson  Arizona Cardinals  1027
2     1970    MacArthur Lane  Arizona Cardinals   977
3     1970      Floyd Little  Arizona Cardinals   901
4     1970      Larry Csonka  Arizona Cardinals   874
   ...               ...                ...   ...
1270  2020       Gus Edwards  Arizona Cardinals   723
1271  2020      James Conner  Arizona Cardinals   721
1272  2020     David Johnson  Arizona Cardinals   691
1273  2020     Damien Harris  Arizona Cardinals   691
1274  2020  Devin Singletary  Arizona Cardinals   687

So each year a player's "Yds" should get larger, and I plan to plot them by year to see who has the most yards.

I tried the code below, but it only gives me the overall total for everyone:

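import collections
import pandas as pd

df = pd.read_csv('D:RunningBackYards_3.csv', 
                 usecols=['Player', 'Tm', 'Year', 'Yds'])
r = len(df)

# Sums the whole Yds column from row label 1 onward, with no grouping
print(df.loc[1:r,['Yds']].sum())

counter = collections.Counter()
for ii in df.Player:
    # update() with a string counts its characters, not the player name
    counter.update(ii)
    
result = dict(counter)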

Any help is much appreciated, thank you.


Tags: csv, data, df, counter, out, year, tm, ii
2 Answers

I think groupby is a better fit here. This should do it:

sum_df = df \
    .groupby(['Year', 'Player']) \
    .agg({'Yds': 'sum'})
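
To see what the groupby call produces, here is a minimal sketch on a small frame. The rows are invented for illustration and are not taken from the asker's CSV; only the column names match the question. The last step, picking the top player per year with idxmax, is one possible approach and is an addition not found in the answer above.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration only
df = pd.DataFrame({
    "Year":   [1970, 1970, 1970, 1971, 1971],
    "Player": ["A", "A", "B", "A", "B"],
    "Yds":    [100, 50, 120, 80, 200],
})

# One row per (Year, Player) with the summed yards
sum_df = df.groupby(["Year", "Player"]).agg({"Yds": "sum"})

# Flatten the MultiIndex so Year and Player are ordinary columns again
flat = sum_df.reset_index()

# For each year, keep the row holding the largest yard total
top_per_year = flat.loc[flat.groupby("Year")["Yds"].idxmax()]
print(top_per_year)
```

Note that this gives per-year totals. If a running total across years is what is needed, a cumulative sum is required in addition to the grouping.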

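Here is a corrected version of the code that works. It finds the row indices for each player, then the for loop sums the first entry, the first two entries, the first three entries, and so on: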

import pandas as pd

# Raw string so the backslash in the Windows path is not read as an escape
df = pd.read_csv(r'D:\RunningBackYards_3.csv',
                 usecols=['Player', 'Tm', 'Year', 'Yds'])

unique_player = set(df.Player)

# Start from empty lists ([[]] would leave a stray empty entry in row 0)
yr = []
plyr = []
yards = []

for ii in unique_player:
    # Row labels of every season recorded for this player
    idxPt = df.index[df['Player'] == ii]
    idx = 1
    for kk in idxPt:
        # Running total: sum of this player's first idx entries
        yards.append(df.Yds[idxPt[0:idx]].sum())
        yr.append(df.Year[kk])
        plyr.append(df.Player[kk])
        idx = idx + 1

# Build the result directly; no numpy transpose or pd.concat is needed
dffNew = pd.DataFrame({"Player": plyr,
                       "Year": yr,
                       "Yds": yards})
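
The same running total can likely be computed without explicit loops. The sketch below is an alternative, not this answer's own method: it uses groupby with cumsum on invented sample rows, and it assumes the seasons should be put in chronological order before accumulating. The column name CumYds and the wide table are illustrative additions.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration only
df = pd.DataFrame({
    "Year":   [1971, 1970, 1970, 1972, 1971],
    "Player": ["A", "A", "B", "A", "B"],
    "Yds":    [80, 100, 120, 60, 200],
})

# Sort so each player's seasons accumulate in chronological order
out = df.sort_values(["Player", "Year"]).reset_index(drop=True)

# Running total of yards within each player
out["CumYds"] = out.groupby("Player")["Yds"].cumsum()

# Wide table (one column per player) that could be passed to .plot()
wide = out.pivot(index="Year", columns="Player", values="CumYds")
print(out)
```

A player with no entry for a given year shows up as NaN in the wide table, so gaps stay visible if it is plotted.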
