How do I merge rows in a DataFrame based on a column value?


I have a dataset shaped like the one below, where each row represents one team in a specific match identified by <code>gameID</code>.

<pre class="lang-none prettyprint-override"><code>    gameID  Won/Lost  Home  Away  metric2  metric3  metric4  team1  team2  team3  team4
2017020001         1     1     0       10       10       10      1      0      0      0
2017020001         0     0     1       10       10       10      0      1      0      0
</code></pre>

What I want to do is create a function that takes the rows sharing the same <code>gameID</code> and joins them. As you can see in the data sample above, the two rows represent a single game, split into the home team (row 1) and the away team (row 2). I want these two rows combined into a single row.

^{pr2}$

How do I get this result?

Edit: I caused too much confusion, so I'm posting my code so that you can better grasp the problem I'm trying to solve.

<pre><code>import numpy as np
import pandas as pd
import requests
import json
from sklearn import preprocessing
from sklearn.preprocessing import OneHotEncoder

results = []
for game_id in range(2017020001, 2017020010, 1):
    url = 'https://statsapi.web.nhl.com/api/v1/game/{}/boxscore'.format(game_id)
    r = requests.get(url)
    game_data = r.json()
    # One row per team: first the home side, then the away side.
    for homeaway in ['home','away']:
        game_dict = game_data.get('teams').get(homeaway).get('teamStats').get('teamSkaterStats')
        game_dict['team'] = game_data.get('teams').get(homeaway).get('team').get('name')
        game_dict['homeaway'] = homeaway
        game_dict['game_id'] = game_id
        results.append(game_dict)

df = pd.DataFrame(results)
# 1 for the team with the most goals in its game, 0 otherwise.
# transform keeps the original index, so the result aligns with df.
df['Won/Lost'] = (df['goals'] == df.groupby('game_id')['goals'].transform('max')).astype(int)
df["faceOffWinPercentage"] = df["faceOffWinPercentage"].astype('float')
df["powerPlayPercentage"] = df["powerPlayPercentage"].astype('float')
df["team"] = df["team"].astype('category')
df = pd.get_dummies(df, columns=['homeaway'])
df = pd.get_dummies(df, columns=['team'])
</code></pre>


1 answer

Anonymous, 1 day ago

This assumes that each <code>gameID</code> has exactly two rows and that you want to group by that ID. (It also assumes I understood the question.)

Improved solution

Given a DataFrame <code>df</code> such as

<pre><code>       gameID  Won/Lost  Home  Away  metric2  metric3  metric4  team1  team2  team3  team4
0  2017020001         1     1     0       10       10       10      1      0      0      0
1  2017020001         0     0     1       10       10       10      0      1      0      0
2  2017020002         1     1     0       10       10       10      1      0      0      0
3  2017020002         0     0     1       10       10       10      0      1      0      0
</code></pre>

you can use <code>pd.merge</code> (plus a little data munging) as follows:

^{pr2}$

(I kept the prefix on <code>Won/Lost</code> because it indicates that this is a statistic for the home team. Also, if anyone knows a more elegant way to add the prefixes without having to rename <code>gameID</code> afterwards, please leave a comment.)

<hr/>

Original attempt

After grouping, you can apply the following function

<pre><code>def munge(group):
    # Boolean mask selecting the home-team row of this game.
    is_home = group.Home == 1
    wonlost = group.loc[is_home, 'Won/Lost'].reset_index(drop=True)
    # Keep only the stat columns, from 'metric2' to the last column.
    group = group.loc[:, 'metric2':]
    home = group[is_home].add_prefix('h_').reset_index(drop=True)
    away = group[~is_home].add_prefix('a_').reset_index(drop=True)
    # Put the home and away stats side by side in a single row.
    return pd.concat([wonlost, home, away], axis=1)
</code></pre>

... like this:

<pre><code>>>> df.groupby('gameID').apply(munge).reset_index(level=1, drop=True)
            Won/Lost  h_metric2  h_metric3  h_metric4  h_team1  h_team2  h_team3  h_team4  a_metric2  a_metric3  a_metric4  a_team1  a_team2  a_team3  a_team4
gameID
2017020001         1         10         10         10        1        0        0        0         10         10         10        0        1        0        0
2017020002         1         10         10         10        1        0        0        0         10         10         10        0        1        0        0
</code></pre>
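The code for the <code>pd.merge</code> approach did not survive on this page, so here is a minimal sketch of how it could look. It is a reconstruction and not the original code. It assumes exactly one home row and one away row per <code>gameID</code>, and the helper name <code>merge_home_away</code> is made up for illustration.

```python
import pandas as pd

# Sample data matching the example table in the answer.
df = pd.DataFrame({
    'gameID':   [2017020001, 2017020001, 2017020002, 2017020002],
    'Won/Lost': [1, 0, 1, 0],
    'Home':     [1, 0, 1, 0],
    'Away':     [0, 1, 0, 1],
    'metric2':  [10, 10, 10, 10],
    'metric3':  [10, 10, 10, 10],
    'metric4':  [10, 10, 10, 10],
    'team1':    [1, 0, 1, 0],
    'team2':    [0, 1, 0, 1],
    'team3':    [0, 0, 0, 0],
    'team4':    [0, 0, 0, 0],
})


def merge_home_away(df):
    """Hypothetical helper: join the home and away row of each game."""
    # Everything except the key, the home/away flags and the result is a stat.
    stat_cols = df.columns.drop(['gameID', 'Home', 'Away', 'Won/Lost'])

    # Split into home and away rows; Won/Lost is kept for the home side only.
    home = df.loc[df['Home'] == 1, ['gameID', 'Won/Lost', *stat_cols]]
    away = df.loc[df['Home'] == 0, ['gameID', *stat_cols]]

    # Prefix every column, then undo the prefix on the join key.
    home = home.add_prefix('h_').rename(columns={'h_gameID': 'gameID'})
    away = away.add_prefix('a_').rename(columns={'a_gameID': 'gameID'})

    # validate raises if a gameID has more than one home or away row.
    return pd.merge(home, away, on='gameID', validate='one_to_one')


merged = merge_home_away(df)
print(merged)
```

The result has one row per <code>gameID</code>, with the home columns prefixed by <code>h_</code> and the away columns by <code>a_</code>. Passing <code>validate='one_to_one'</code> is optional, but it makes the merge fail loudly if the two-rows-per-game assumption does not hold.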
