<p>pandas has supported missing values in <code>groupby</code> keys since version <code>1.1</code> (<a href="https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.1.0.html#allow-na-in-groupby-key" rel="nofollow noreferrer">link</a>), via the <code>dropna=False</code> parameter.</p>
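<p>A minimal sketch of that built-in route, assuming pandas 1.1 or later. The sample frame below is hypothetical: the column names follow this answer, but the values are made up for illustration.</p>

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not taken from the question).
df = pd.DataFrame({
    "User": ["ABC", "ABC", "AAA", "AAA", "XYZ", "XYZ"],
    "Col1ToSum": [10, 30, 20, 40, 10, 10],
    "Col2ToSum": [100, 550, 5, 15, 10, 10],
    "ColToKeep": [1.015, 1.015, np.nan, np.nan, 1.1, 1.5],
})

# dropna=False keeps the rows whose group key is NaN,
# so no placeholder string is needed.
out = (df.groupby(["User", "ColToKeep"], sort=False, dropna=False)
         [["Col1ToSum", "Col2ToSum"]]
         .sum()
         .reset_index())
print(out)
```

<p>Here the two <code>AAA</code> rows with a missing <code>ColToKeep</code> are summed into a single group instead of being dropped.</p>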
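<p>The first idea is to create a new helper column <code>new</code> in which missing values are replaced by some string, e.g. <code>miss</code>, then group by <code>User</code> and <code>new</code>, aggregate with <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.GroupBy.agg.html" rel="nofollow noreferrer"><code>GroupBy.agg</code></a> using <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.GroupBy.first.html" rel="nofollow noreferrer"><code>GroupBy.first</code></a> for <code>ColToKeep</code>, and finally remove the helper level with the first <code>reset_index</code>:</p>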
<pre><code>df = (df.assign(new=df['ColToKeep'].fillna('miss'))
        .groupby(['User', 'new'], sort=False)
        .agg({'Col1ToSum': 'sum', 'Col2ToSum': 'sum', 'ColToKeep': 'first'})
        .reset_index(level=1, drop=True)
        .reset_index())
print(df)
  User  Col1ToSum  Col2ToSum  ColToKeep
0  ABC         40        650      1.015
1  ABA        180        100      2.240
2  AAA         60         20        NaN
3  BBB         10         15        NaN
4  XYZ         10         10      1.100
5  XYZ         10         10      1.500
</code></pre>
<p>Another idea is to fill <code>ColToKeep</code> itself, group by it, and then replace <code>miss</code> back with <code>NaN</code>:</p>
<pre><code>import numpy as np

df = (df.assign(ColToKeep=df['ColToKeep'].fillna('miss'))
        .groupby(['User', 'ColToKeep'], sort=False)[['Col1ToSum', 'Col2ToSum']].sum()
        .reset_index()
        .replace({'ColToKeep': {'miss': np.nan}}))
print(df)
  User  ColToKeep  Col1ToSum  Col2ToSum
0  ABC      1.015         40        650
1  ABA      2.240        180        100
2  AAA        NaN         60         20
3  BBB        NaN         10         15
4  XYZ      1.100         10         10
5  XYZ      1.500         10         10
</code></pre>