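I want to compute points for football teams. I have the points for each match, and I can get the cumulative sum of home points and of away points separately. What I don't know is how to get each team's total points (home and away points combined).
This is what I have so far: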
import pandas as pd

df = pd.DataFrame([
    ["Gothenburg", "Malmo", 2018, 1, 1],
    ["Malmo", "Gothenburg", 2018, 1, 1],
    ["Malmo", "Gothenburg", 2018, 0, 3],
    ["Gothenburg", "Malmo", 2018, 1, 1],
    ["Gothenburg", "Malmo", 2018, 0, 3],
    ["Gothenburg", "Malmo", 2018, 1, 1],
    ["Gothenburg", "Malmo", 2018, 0, 3],
    ["Malmo", "Gothenburg", 2018, 0, 3],
    ["Gothenburg", "Malmo", 2018, 1, 1],
    ["Malmo", "Gothenburg", 2018, 0, 3],
    ["Malmo", "Gothenburg", 2018, 1, 1],
    ["Malmo", "Gothenburg", 2018, 0, 3],
])
df.columns = ["H_team", "A_team", "Year", "H_points", "A_points"]

# Cumulative sum of home / away points per team and year, shifted by one row
# so that each row only sees points from earlier matches
df["H_cumsum"] = df.groupby(["H_team", "Year"])["H_points"].transform(
    lambda x: x.cumsum().shift())
df["A_cumsum"] = df.groupby(["A_team", "Year"])["A_points"].transform(
    lambda x: x.cumsum().shift())
print(df)
        H_team      A_team  Year  H_points  A_points  H_cumsum  A_cumsum
0   Gothenburg       Malmo  2018         1         1       NaN       NaN
1        Malmo  Gothenburg  2018         1         1       NaN       NaN
2        Malmo  Gothenburg  2018         0         3       1.0       1.0
3   Gothenburg       Malmo  2018         1         1       1.0       1.0
4   Gothenburg       Malmo  2018         0         3       2.0       2.0
5   Gothenburg       Malmo  2018         1         1       2.0       5.0
6   Gothenburg       Malmo  2018         0         3       3.0       6.0
7        Malmo  Gothenburg  2018         0         3       1.0       4.0
8   Gothenburg       Malmo  2018         1         1       3.0       9.0
9        Malmo  Gothenburg  2018         0         3       1.0       7.0
10       Malmo  Gothenburg  2018         1         1       1.0      10.0
11       Malmo  Gothenburg  2018         0         3       2.0      11.0
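This table gives me each team's cumulative home points and cumulative away points, shifted by one row. But I need the total points from home and away matches combined: H_cumsum and A_cumsum should each include the points from all of that team's previous matches, both home and away.
Expected output: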
row 0: Malmo = NaN, Gothenburg = NaN
row 1: Gothenburg = 1, Malmo = 1
row 2: Malmo = 1 + 1 = 2, Gothenburg = 1 + 1 = 2
row 3: Gothenburg = 1 + 1 + 3 = 5, Malmo = 1 + 1 + 0 = 2
row 4: Gothenburg = 1 + 1 + 3 + 1 = 6, Malmo = 1 + 1 + 0 + 1 = 3
And so on...
The last row (row 11) should be:
H_cumsum (team Malmo) = 12, A_cumsum (team Gothenburg) = 15
I found a solution using stack, but it is not a good one:
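The points under the Total/Away and Total/Home columns are correct. However, with all the extra unnecessary columns it becomes very hard to keep an overview of the table. (Each row has another 10 columns that are not shown in this example, so it gets very cluttered.)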
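For reference, a reshaping approach does not have to leave helper columns behind. The following is a minimal sketch under stated assumptions, not the stack-based attempt mentioned above (that code is not shown here): it builds a temporary long table with one row per match and team, computes the running total there, and writes only `H_cumsum` and `A_cumsum` back. The names `home`, `away`, `long`, `match`, `side` and `before` are illustrative, not from the original post.

```python
import pandas as pd

df = pd.DataFrame(
    [
        ["Gothenburg", "Malmo", 2018, 1, 1],
        ["Malmo", "Gothenburg", 2018, 1, 1],
        ["Malmo", "Gothenburg", 2018, 0, 3],
        ["Gothenburg", "Malmo", 2018, 1, 1],
        ["Gothenburg", "Malmo", 2018, 0, 3],
        ["Gothenburg", "Malmo", 2018, 1, 1],
        ["Gothenburg", "Malmo", 2018, 0, 3],
        ["Malmo", "Gothenburg", 2018, 0, 3],
        ["Gothenburg", "Malmo", 2018, 1, 1],
        ["Malmo", "Gothenburg", 2018, 0, 3],
        ["Malmo", "Gothenburg", 2018, 1, 1],
        ["Malmo", "Gothenburg", 2018, 0, 3],
    ],
    columns=["H_team", "A_team", "Year", "H_points", "A_points"],
)

# Temporary long table: one row per (match, team), tagged with the side played.
cols = ["team", "Year", "points"]
home = df.rename(columns={"H_team": "team", "H_points": "points"})[cols]
away = df.rename(columns={"A_team": "team", "A_points": "points"})[cols]
long = (
    pd.concat([home.assign(side="H"), away.assign(side="A")])
    .rename_axis("match")          # the original row label identifies the match
    .reset_index()
    .sort_values("match", kind="mergesort")
    .reset_index(drop=True)
)

# Points collected before each match, per team and year (NaN for the first match).
long["before"] = long.groupby(["team", "Year"])["points"].transform(
    lambda s: s.cumsum().shift()
)

# Write back only the two result columns; assignment aligns on the match label.
for side, col in [("H", "H_cumsum"), ("A", "A_cumsum")]:
    df[col] = long.loc[long["side"] == side].set_index("match")["before"]

print(df)
```

Because the intermediate table lives in its own variable, any other columns of `df` stay untouched. The sketch assumes the rows of `df` are already in chronological order.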
Desired output:
This seems to compute correctly on my side. It is a bit long-hand, though.
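The code that remark refers to is not reproduced here. As a hedged illustration of what a long-hand version could look like, the sketch below walks through the matches in order and keeps a running total per (team, year) in a dictionary. The function name `running_totals` and the variable `totals` are hypothetical, and the rows are assumed to be in chronological order.

```python
import pandas as pd


def running_totals(df):
    """Return two lists: home and away team's points from all earlier matches."""
    totals = {}  # (team, year) -> points collected so far, home and away combined
    h_before, a_before = [], []
    for row in df.itertuples(index=False):
        h_key, a_key = (row.H_team, row.Year), (row.A_team, row.Year)
        # A team with no earlier match gets NaN, like cumsum().shift() would give.
        h_before.append(totals.get(h_key, float("nan")))
        a_before.append(totals.get(a_key, float("nan")))
        # Only now add the points of the current match.
        totals[h_key] = totals.get(h_key, 0) + row.H_points
        totals[a_key] = totals.get(a_key, 0) + row.A_points
    return h_before, a_before


# First four matches from the question, same column layout.
df = pd.DataFrame(
    [
        ["Gothenburg", "Malmo", 2018, 1, 1],
        ["Malmo", "Gothenburg", 2018, 1, 1],
        ["Malmo", "Gothenburg", 2018, 0, 3],
        ["Gothenburg", "Malmo", 2018, 1, 1],
    ],
    columns=["H_team", "A_team", "Year", "H_points", "A_points"],
)
df["H_cumsum"], df["A_cumsum"] = running_totals(df)
print(df)
```

An explicit loop is slower than a vectorised groupby on large tables, but it adds no intermediate columns and makes the order of operations (read the total, then add the current match) easy to check by hand.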