<p>This assumes that each <code>gameID</code> has exactly two rows and that you want to group by that ID. (It also assumes that I have understood the question.)</p>
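<p>Since everything below relies on that two-rows-per-game assumption, it may be worth checking it first. The following is only a minimal sketch: the helper name <code>check_two_rows_per_game</code> and the tiny <code>sample</code> frame are hypothetical and not part of the question.</p>

```python
import pandas as pd

def check_two_rows_per_game(df: pd.DataFrame) -> pd.Series:
    """Return the row counts of all gameIDs that do not have exactly two rows."""
    counts = df.groupby('gameID').size()
    return counts[counts != 2]

# Hypothetical sample: the second game is missing its away row.
sample = pd.DataFrame({
    'gameID': [2017020001, 2017020001, 2017020002],
    'Home':   [1, 0, 1],
    'Away':   [0, 1, 0],
})

bad = check_two_rows_per_game(sample)
print(bad)  # an empty result would mean the assumption holds
```

<p>If <code>bad</code> is not empty, the offending games probably need to be fixed or dropped before reshaping.</p>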
<p><strong>Improved solution</strong></p>
<p>Given a DataFrame <code>df</code> such as</p>
<pre><code>       gameID  Won/Lost  Home  Away  metric2  metric3  metric4  team1  team2  team3  team4
0  2017020001         1     1     0       10       10       10      1      0      0      0
1  2017020001         0     0     1       10       10       10      0      1      0      0
2  2017020002         1     1     0       10       10       10      1      0      0      0
3  2017020002         0     0     1       10       10       10      0      1      0      0
</code></pre>
<p>you can use <code>pd.merge</code> (plus some data munging) as follows:</p>
^{pr2}$
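<p>(I kept the prefix on <code>Won/Lost</code> because it indicates that this is the home team's result. Also, if anyone knows a more elegant way to add the prefixes without having to rename <code>gameID</code> back afterwards, please leave a comment.)</p>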
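<p>For illustration, a merge along these lines could look roughly like the sketch below. It is a reconstruction under assumptions, not necessarily the exact code meant above: the variable names <code>home</code>, <code>away</code> and <code>result</code> are my own, and the frame is simply the sample data from the top of this answer.</p>

```python
import pandas as pd

# Sample data matching the frame shown above.
df = pd.DataFrame({
    'gameID':   [2017020001, 2017020001, 2017020002, 2017020002],
    'Won/Lost': [1, 0, 1, 0],
    'Home':     [1, 0, 1, 0],
    'Away':     [0, 1, 0, 1],
    'metric2':  [10, 10, 10, 10],
    'metric3':  [10, 10, 10, 10],
    'metric4':  [10, 10, 10, 10],
    'team1':    [1, 0, 1, 0],
    'team2':    [0, 1, 0, 1],
    'team3':    [0, 0, 0, 0],
    'team4':    [0, 0, 0, 0],
})

is_home = df['Home'] == 1

# The home rows keep Won/Lost; the away rows only contribute their stats.
home = df[is_home].drop(columns=['Home', 'Away'])
away = df[~is_home].drop(columns=['Home', 'Away', 'Won/Lost'])

# add_prefix renames every column, including the key, so rename gameID back.
home = home.add_prefix('h_').rename(columns={'h_gameID': 'gameID'})
away = away.add_prefix('a_').rename(columns={'a_gameID': 'gameID'})

# One row per game: home stats on the left, away stats on the right.
result = pd.merge(home, away, on='gameID')
print(result)
```

<p>One way to avoid the rename step might be to call <code>set_index('gameID')</code> before <code>add_prefix</code>, since <code>add_prefix</code> only touches column labels, and then merge with <code>left_index=True, right_index=True</code>.</p>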
<hr/>
<p><strong>Original attempt</strong></p>
<p>After grouping, you can apply the following function</p>
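<pre><code>import pandas as pd

def munge(group):
    # Boolean mask marking the home team's row within this game
    is_home = group.Home == 1
    # Result from the home team's point of view
    wonlost = group.loc[is_home, 'Won/Lost'].reset_index(drop=True)
    # Keep only the stat columns (metric2 through the last column)
    group = group.loc[:, 'metric2':]
    home = group[is_home].add_prefix('h_').reset_index(drop=True)
    away = group[~is_home].add_prefix('a_').reset_index(drop=True)
    # Put the home and away stats side by side in a single row
    return pd.concat([wonlost, home, away], axis=1)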
</code></pre>
<p>... like this:</p>
<pre><code>>>> df.groupby('gameID').apply(munge).reset_index(level=1, drop=True)
            Won/Lost  h_metric2  h_metric3  h_metric4  h_team1  h_team2  h_team3  h_team4  a_metric2  a_metric3  a_metric4  a_team1  a_team2  a_team3  a_team4
gameID
2017020001         1         10         10         10        1        0        0        0         10         10         10        0        1        0        0
2017020002         1         10         10         10        1        0        0        0         10         10         10        0        1        0        0
</code></pre>
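<p>To try the group-and-apply approach end to end, a self-contained run might look like the sketch below. The reduced frame <code>small</code> (one metric and two team columns only) and the name <code>wide</code> are assumptions made to keep the example short; <code>munge</code> is the function from above.</p>

```python
import pandas as pd

def munge(group):
    # Same logic as above: one output row per game, home stats then away stats.
    is_home = group.Home == 1
    wonlost = group.loc[is_home, 'Won/Lost'].reset_index(drop=True)
    group = group.loc[:, 'metric2':]
    home = group[is_home].add_prefix('h_').reset_index(drop=True)
    away = group[~is_home].add_prefix('a_').reset_index(drop=True)
    return pd.concat([wonlost, home, away], axis=1)

# Hypothetical reduced version of the sample data.
small = pd.DataFrame({
    'gameID':   [2017020001, 2017020001, 2017020002, 2017020002],
    'Won/Lost': [1, 0, 1, 0],
    'Home':     [1, 0, 1, 0],
    'Away':     [0, 1, 0, 1],
    'metric2':  [10, 10, 10, 10],
    'team1':    [1, 0, 1, 0],
    'team2':    [0, 1, 0, 1],
})

# apply() adds an inner index level (always 0 here), which is dropped again.
wide = small.groupby('gameID').apply(munge).reset_index(level=1, drop=True)
print(wide)
```

<p>The result should have <code>gameID</code> as its index and one row per game, with the <code>h_</code> columns coming before the <code>a_</code> columns.</p>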