Calculating head-to-head statistics with Pandas

Published 2024-06-28 20:29:31


I have a DataFrame that looks like this:

  home_team  away_team  score  home_goals  away_goals  winner
1  Arsenal    Chelsea    3-0        3          0       Arsenal
2  ManCity    Arsenal    1-1        1          1       draw
3  Chelsea    Arsenal    2-1        2          1       Chelsea
4  Arsenal    Chelsea    5-5        5          5       draw
5  Arsenal    ManCity    1-2        1          2       ManCity

My question is: how can I compute Arsenal's win-draw-loss (head-to-head) record against each other team?

A possible expected result could look like this:

   team      opponent  games_played  wins  draws  losses  goals_scored  goals_conceded
1  Arsenal   Chelsea        3          1     1      1          9              7
2  Arsenal   ManCity        2          0     1      1          2              3

Any help is much appreciated. Please note that the DataFrame is not real (in case any Premier League experts are lurking).


2 Answers

Try this code:

import pandas as pd

df_in = pd.read_csv('data.csv')
df_out = pd.DataFrame(columns = ['team', 'opponent', 'games_played', 'wins', 'draws', 'losses', 'goals_scored', 'goals_conceded'])

team = 'Arsenal'

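for index, row in df_in.iterrows():
    # Skip matches that do not involve the selected team; otherwise the
    # else branch below would treat them as away games for that team
    if team not in (row['home_team'], row['away_team']):
        continue
    if row['home_team'] == team: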
        opponent = row['away_team']
        if row['home_goals'] > row['away_goals']:
            win = 1
            draw = 0
            loss = 0
        elif row['home_goals'] < row['away_goals']:
            win = 0
            draw = 0
            loss = 1
        else:
            win = 0
            draw = 1
            loss = 0
        goals_scored = row['home_goals']
        goals_conceded = row['away_goals']
    else:
        opponent = row['home_team']
        if row['home_goals'] > row['away_goals']:
            win = 0
            draw = 0
            loss = 1
        elif row['home_goals'] < row['away_goals']:
            win = 1
            draw = 0
            loss = 0
        else:
            win = 0
            draw = 1
            loss = 0
        goals_scored = row['away_goals']
        goals_conceded = row['home_goals']

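    # Every row represents exactly one match
    games_played = 1

    # Create the opponent's row the first time it appears, otherwise add to it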
    if opponent not in df_out['opponent'].unique():
        match = pd.DataFrame({'team': team,
                              'opponent': opponent,
                              'games_played': games_played,
                              'wins': win,
                              'draws': draw,
                              'losses': loss,
                              'goals_scored': goals_scored,
                              'goals_conceded': goals_conceded},
                             index = [0])
        df_out = pd.concat([df_out, match], ignore_index = True)
    else:
        df_out.loc[df_out['opponent'] == opponent, 'games_played'] += games_played
        df_out.loc[df_out['opponent'] == opponent, 'wins'] += win
        df_out.loc[df_out['opponent'] == opponent, 'draws'] += draw
        df_out.loc[df_out['opponent'] == opponent, 'losses'] += loss
        df_out.loc[df_out['opponent'] == opponent, 'goals_scored'] += goals_scored
        df_out.loc[df_out['opponent'] == opponent, 'goals_conceded'] += goals_conceded

This code loads the data as df_in and builds df_out with the required statistics.
Output:

      team opponent games_played wins draws losses goals_scored goals_conceded
0  Arsenal  Chelsea            3    1     1      1            9              7
1  Arsenal  ManCity            2    0     1      1            2              3

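First, you need to duplicate the data and flip the home and away teams to get the team/opponent style statistics you want.

This is because every match has to be counted twice, once from each side (for a decided match, once for the winner and once for the loser). Copy the df, swap the home and away fields in the copy, then use pd.concat to stack the two DataFrames.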
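The duplicate-and-flip step could be sketched as follows. This is a minimal illustration rather than the answerer's exact code: the sample rows are taken from the question, and the names `matches`, `home_view`, `away_view`, `long_df`, and the renamed columns `team`, `opponent`, `goals_scored`, `goals_conceded` are assumptions chosen to match the expected output. The `score` column is left out because the string would read backwards in the flipped copy.

```python
import pandas as pd

# Sample matches from the question (score column omitted on purpose)
matches = pd.DataFrame({
    "home_team": ["Arsenal", "ManCity", "Chelsea", "Arsenal", "Arsenal"],
    "away_team": ["Chelsea", "Arsenal", "Arsenal", "Chelsea", "ManCity"],
    "home_goals": [3, 1, 2, 5, 1],
    "away_goals": [0, 1, 1, 5, 2],
    "winner": ["Arsenal", "draw", "Chelsea", "draw", "ManCity"],
})

# Home side's point of view: the home team is "team"
home_view = matches.rename(columns={
    "home_team": "team", "away_team": "opponent",
    "home_goals": "goals_scored", "away_goals": "goals_conceded",
})

# Flipped copy: the away team is "team", so goals swap roles as well
away_view = matches.rename(columns={
    "away_team": "team", "home_team": "opponent",
    "away_goals": "goals_scored", "home_goals": "goals_conceded",
})

# pd.concat aligns on column names, so every match now appears twice,
# once for each participant
long_df = pd.concat([home_view, away_view], ignore_index=True)
print(long_df)
```

After this step, filtering `long_df` on `team == "Arsenal"` returns all of Arsenal's matches regardless of venue.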

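Now you can aggregate.

Use the home team, the away team, and the winner as the grouping dimensions. In this step, count the rows and sum the goals. Use df.groupby(dimensions).agg(metrics).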
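A possible shape for this aggregation is sketched below, using named aggregation. It assumes the matches have already been stacked into a frame with the illustrative columns `team`, `opponent`, `goals_scored`, and `goals_conceded` (these names, and `long_df` and `summary`, are not from the original answer); the sample rows come from the question.

```python
import pandas as pd

matches = pd.DataFrame({
    "home_team": ["Arsenal", "ManCity", "Chelsea", "Arsenal", "Arsenal"],
    "away_team": ["Chelsea", "Arsenal", "Arsenal", "Chelsea", "ManCity"],
    "home_goals": [3, 1, 2, 5, 1],
    "away_goals": [0, 1, 1, 5, 2],
    "winner": ["Arsenal", "draw", "Chelsea", "draw", "ManCity"],
})

# Stack the original rows and a flipped copy (one row per team per match)
as_home = {"home_team": "team", "away_team": "opponent",
           "home_goals": "goals_scored", "away_goals": "goals_conceded"}
as_away = {"away_team": "team", "home_team": "opponent",
           "away_goals": "goals_scored", "home_goals": "goals_conceded"}
long_df = pd.concat(
    [matches.rename(columns=as_home), matches.rename(columns=as_away)],
    ignore_index=True,
)

# Dimensions: team, opponent, winner. Metrics: row count and goal sums.
summary = long_df.groupby(["team", "opponent", "winner"]).agg(
    games_played=("goals_scored", "size"),
    goals_scored=("goals_scored", "sum"),
    goals_conceded=("goals_conceded", "sum"),
)
print(summary)
```

The grouping keys end up in a MultiIndex here, which is why the index has to be flattened before `winner` can be compared against the other columns.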

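Next, reset the index so that the grouping keys become ordinary columns again and the winner column can be used once more. Do this with df.reset_index(inplace=True).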
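To illustrate why the reset is needed, here is a small sketch on three of Arsenal's home matches from the question, already written in an assumed team/opponent layout (the column and variable names are illustrative). It also shows `as_index=False`, which is an alternative to calling `reset_index` afterwards.

```python
import pandas as pd

# Three rows from the question, seen from Arsenal's side
long_df = pd.DataFrame({
    "team": ["Arsenal", "Arsenal", "Arsenal"],
    "opponent": ["Chelsea", "Chelsea", "ManCity"],
    "winner": ["Arsenal", "draw", "ManCity"],
    "goals_scored": [3, 5, 1],
})

keys = ["team", "opponent", "winner"]
grouped = long_df.groupby(keys).agg(goals_scored=("goals_scored", "sum"))

# After groupby the keys live in the index, not in the columns
print(list(grouped.columns))   # only the aggregated metric

# reset_index turns the keys back into regular columns
flat = grouped.reset_index()
print(list(flat.columns))

# Alternative: keep the keys as columns from the start
flat_alt = long_df.groupby(keys, as_index=False).agg(
    goals_scored=("goals_scored", "sum")
)
```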

Once you have that, you can create the new columns win, loss, and draw by comparing the winner with the home team column or with the static string 'draw'.
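One way to build these columns is sketched below. It assumes the stacked and aggregated frame uses the illustrative column names `team`, `opponent`, and `games_played`, where `team` plays the role of the home team column after flipping. Because each row may stand for several matches after the first aggregation, the flags are weighted by `games_played` instead of being plain 0/1 values; that weighting is a design choice of this sketch, not something the answer spells out.

```python
import pandas as pd

matches = pd.DataFrame({
    "home_team": ["Arsenal", "ManCity", "Chelsea", "Arsenal", "Arsenal"],
    "away_team": ["Chelsea", "Arsenal", "Arsenal", "Chelsea", "ManCity"],
    "home_goals": [3, 1, 2, 5, 1],
    "away_goals": [0, 1, 1, 5, 2],
    "winner": ["Arsenal", "draw", "Chelsea", "draw", "ManCity"],
})

as_home = {"home_team": "team", "away_team": "opponent",
           "home_goals": "goals_scored", "away_goals": "goals_conceded"}
as_away = {"away_team": "team", "home_team": "opponent",
           "away_goals": "goals_scored", "home_goals": "goals_conceded"}
long_df = pd.concat(
    [matches.rename(columns=as_home), matches.rename(columns=as_away)],
    ignore_index=True,
)

flat = (
    long_df.groupby(["team", "opponent", "winner"])
    .agg(games_played=("goals_scored", "size"))
    .reset_index()
)

# Compare the winner with the team column or the literal string "draw"
is_win = flat["winner"] == flat["team"]
is_draw = flat["winner"] == "draw"

# Keep the match count where the condition holds, 0 elsewhere
flat["wins"] = flat["games_played"].where(is_win, 0)
flat["draws"] = flat["games_played"].where(is_draw, 0)
flat["losses"] = flat["games_played"].where(~is_win & ~is_draw, 0)
print(flat)
```

Note that this relies on no team being literally named "draw".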

You can now group the df again and sum the win/loss/draw columns.
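Putting the steps of this answer together, a complete pipeline might look like the sketch below. The function name `head_to_head`, the intermediate variable names, and the output column names are assumptions chosen to match the expected result in the question; the sample rows are the ones from the question.

```python
import pandas as pd

matches = pd.DataFrame({
    "home_team": ["Arsenal", "ManCity", "Chelsea", "Arsenal", "Arsenal"],
    "away_team": ["Chelsea", "Arsenal", "Arsenal", "Chelsea", "ManCity"],
    "home_goals": [3, 1, 2, 5, 1],
    "away_goals": [0, 1, 1, 5, 2],
    "winner": ["Arsenal", "draw", "Chelsea", "draw", "ManCity"],
})


def head_to_head(df: pd.DataFrame, team: str) -> pd.DataFrame:
    """Head-to-head record of `team` against each opponent (hypothetical helper)."""
    # Step 1: duplicate the data and flip home/away
    as_home = {"home_team": "team", "away_team": "opponent",
               "home_goals": "goals_scored", "away_goals": "goals_conceded"}
    as_away = {"away_team": "team", "home_team": "opponent",
               "away_goals": "goals_scored", "home_goals": "goals_conceded"}
    long_df = pd.concat(
        [df.rename(columns=as_home), df.rename(columns=as_away)],
        ignore_index=True,
    )

    # Step 2: aggregate by team, opponent and winner, then reset the index
    grouped = (
        long_df.groupby(["team", "opponent", "winner"])
        .agg(
            games_played=("goals_scored", "size"),
            goals_scored=("goals_scored", "sum"),
            goals_conceded=("goals_conceded", "sum"),
        )
        .reset_index()
    )

    # Step 3: derive wins/draws/losses from the winner column
    is_win = grouped["winner"] == grouped["team"]
    is_draw = grouped["winner"] == "draw"
    grouped["wins"] = grouped["games_played"].where(is_win, 0)
    grouped["draws"] = grouped["games_played"].where(is_draw, 0)
    grouped["losses"] = grouped["games_played"].where(~is_win & ~is_draw, 0)

    # Step 4: group again, this time without the winner, and sum everything
    metrics = ["games_played", "wins", "draws", "losses",
               "goals_scored", "goals_conceded"]
    result = grouped.groupby(["team", "opponent"], as_index=False)[metrics].sum()
    return result[result["team"] == team].reset_index(drop=True)


arsenal = head_to_head(matches, "Arsenal")
print(arsenal)
```

Dropping the final filter on `team` would give the same table for every team at once, since the stacked frame already contains both sides of each match.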
