Cumulative sum of points per team across the winning and losing columns?

Posted 2024-09-28 17:22:35


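I have several sports datasets and want a cumulative sum of goals keyed on the same text value (the team name) across two columns, not just one. Specifically, this: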

Year    team.Win    points.Win  team.Lose   points.Lose
1982    SUNY Cortland   2   Boston College  0
1982    Massachusetts   3   Rochester (NY)  1
1982    Princeton       1   George Mason    0
1982    Harvard         3   Brown           1
1982    Connecticut     2   SUNY Cortland   0
1982    UCF             2   Massachusetts   1
1982    North Carolina  4   Princeton       0
1982    Mo.-St. Louis   2   Harvard         1
1982    UCF             3   Connecticut     1
1982    North Carolina  2   Mo.-St. Louis   1
1982    Connecticut     2   Mo.-St. Louis   1
1982    North Carolina  2   UCF             0 

should become

Year    team.Win    points.Win  team.Lose   points.Lose
1982    SUNY Cortland   2   Boston College          0
1982    Massachusetts   3   Rochester (NY)          1
1982    Princeton       1   George Mason            0
1982    Harvard         3   Brown                   1
1982    Connecticut     2   SUNY Cortland           2
1982    UCF             2   Massachusetts           4
1982    North Carolina  4   Princeton               1
1982    Mo.-St. Louis   2   Harvard                 4
1982    UCF             5   Connecticut             3
1982    North Carolina  6   Mo.-St. Louis           3
1982    Connecticut     5   Mo.-St. Louis           4
1982    North Carolina  8   UCF                     5

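This is my first time using stackoverflow! Sorry if the formatting doesn't meet the site's standards. I'm asking because I'd rather not run code that stacks all the numbers into a single column and accumulates from there. I just want to do it with the table as it is.

I have a background in R, but I have been developing my Python skills. I prefer the former, and I am somewhat familiar with dplyr.

Edit: I don't want this grouped by year! Just go through all the rows and cumulatively sum the goals by the text value (the team).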


1 answer
User
Answer #1 · Posted 2024-09-28 17:22:35

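Here is one way to do it in R with the tidyverse.

The approach reshapes the data into long format so that cumulative points can be computed per team (counting the points from both wins and losses).

After that, it pivots the data back to wide format. I hope this helps.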

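library(tidyverse)

df %>%
  # Remember the original row order so each game can be rebuilt later
  mutate(rn = row_number()) %>%
  # Long format: one row per team per game; "team.Win" -> team + outcome "Win"
  pivot_longer(cols = c(-Year, -rn), names_to = c(".value", "outcome"), names_pattern = "(\\w+)\\.(\\w+)") %>%
  # Running total per team, covering both wins and losses
  group_by(team) %>%
  mutate(cum_points = cumsum(points)) %>%
  ungroup() %>%
  # Back to wide format: one row per game
  pivot_wider(id_cols = c(Year, rn), names_from = outcome, values_from = c(team, cum_points), names_sep = ".") %>%
  select(Year, ends_with("Win"), ends_with("Lose"))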

Output

# A tibble: 12 x 5
    Year team.Win       cum_points.Win team.Lose      cum_points.Lose
   <int> <chr>                   <int> <chr>                    <int>
 1  1982 SUNY_Cortland               2 Boston_College               0
 2  1982 Massachusetts               3 Rochester_(NY)               1
 3  1982 Princeton                   1 George_Mason                 0
 4  1982 Harvard                     3 Brown                        1
 5  1982 Connecticut                 2 SUNY_Cortland                2
 6  1982 UCF                         2 Massachusetts                4
 7  1982 North_Carolina              4 Princeton                    1
 8  1982 Mo.-St._Louis               2 Harvard                      4
 9  1982 UCF                         5 Connecticut                  3
10  1982 North_Carolina              6 Mo.-St._Louis                3
11  1982 Connecticut                 5 Mo.-St._Louis                4
12  1982 North_Carolina              8 UCF                          5
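
Since the question also mentions Python, the same "accumulate per team across both columns" idea can be sketched there without any libraries. This is only a minimal sketch, not part of the original answer: the function name `cumulative_points` and the tuple layout of `games` are assumptions made for illustration, and the rows are assumed to be in game order already.

```python
from collections import defaultdict


def cumulative_points(games):
    """Replace each game's points with the team's running total so far.

    `games` is an iterable of
    (year, win_team, win_points, lose_team, lose_points) tuples in game order.
    This layout is an assumption made for this sketch.
    """
    totals = defaultdict(int)  # running total per team, wins and losses combined
    result = []
    for year, win_team, win_pts, lose_team, lose_pts in games:
        totals[win_team] += win_pts
        totals[lose_team] += lose_pts
        result.append(
            (year, win_team, totals[win_team], lose_team, totals[lose_team])
        )
    return result


# Sample rows copied from the question
games = [
    (1982, "SUNY Cortland", 2, "Boston College", 0),
    (1982, "Massachusetts", 3, "Rochester (NY)", 1),
    (1982, "Princeton", 1, "George Mason", 0),
    (1982, "Harvard", 3, "Brown", 1),
    (1982, "Connecticut", 2, "SUNY Cortland", 0),
    (1982, "UCF", 2, "Massachusetts", 1),
    (1982, "North Carolina", 4, "Princeton", 0),
    (1982, "Mo.-St. Louis", 2, "Harvard", 1),
    (1982, "UCF", 3, "Connecticut", 1),
    (1982, "North Carolina", 2, "Mo.-St. Louis", 1),
    (1982, "Connecticut", 2, "Mo.-St. Louis", 1),
    (1982, "North Carolina", 2, "UCF", 0),
]

for row in cumulative_points(games):
    print(row)
```

The totals are keyed on the team name only, so they keep accumulating across years, which matches the edit in the question. A single pass with one dictionary also avoids the reshape to long format and back.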
