<p>Sample data:</p>
<pre><code>import pandas as pd

df = pd.DataFrame({
    'CODE': ['000', '111', '111', '222', '222', '333'],
    'NAME': ['help', 'foo', 'bar', 'bla', 'booo', 'nyaa'],
    'ALT_NAME': ['zzz', 'foo 1', 'bar', 'bl', 'bo', 'rrr'],
})
print(df)
</code></pre>
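<p>Output (older pandas versions sort the dict keys alphabetically; newer versions keep insertion order, so the column order may differ):</p>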
<pre><code>   ALT_NAME  CODE  NAME
0       zzz   000  help
1     foo 1   111   foo
2       bar   111   bar
3        bl   222   bla
4        bo   222  booo
5       rrr   333  nyaa
</code></pre>
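<p>In my opinion, the best approach is to aggregate every column into lists with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.DataFrameGroupBy.agg.html" rel="nofollow noreferrer"><code>GroupBy.agg</code></a>, after first turning the <code>CODE</code> column into the <code>index</code> with <code>set_index</code>. A final <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reset_index.html" rel="nofollow noreferrer"><code>reset_index</code></a> with <code>drop=True</code> discards the group keys:</p>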
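<pre><code># Copy CODE into the index (drop=False keeps the column) and clear the index name
df1 = (df.set_index('CODE', drop=False)
         .rename_axis(None)
         .groupby('CODE')              # group by the CODE column
         .agg(list)                    # collect each remaining column into a list
         .reset_index(drop=True))      # discard the CODE group keys
print(df1)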
</code></pre>
<p>Output:</p>
<pre><code>       ALT_NAME         NAME
0         [zzz]       [help]
1  [foo 1, bar]   [foo, bar]
2      [bl, bo]  [bla, booo]
3         [rrr]       [nyaa]
</code></pre>
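<p>As a minimal sketch of the same idea, assuming a recent pandas version: grouping on the <code>CODE</code> column directly should give the same lists without the <code>set_index</code> step. The names <code>lists</code>, <code>single</code> and <code>pair</code> are illustrative only.</p>

```python
import pandas as pd

df = pd.DataFrame({
    'CODE': ['000', '111', '111', '222', '222', '333'],
    'NAME': ['help', 'foo', 'bar', 'bla', 'booo', 'nyaa'],
    'ALT_NAME': ['zzz', 'foo 1', 'bar', 'bl', 'bo', 'rrr'],
})

# Group on the CODE column; every other column is collected into a list per group.
lists = df.groupby('CODE').agg(list).reset_index(drop=True)

# A group with a single row still yields a one-element list.
single = lists.loc[0, 'NAME']   # group '000'
pair = lists.loc[1, 'NAME']     # group '111'
print(lists)
```

<p>Here <code>agg(list)</code> wraps single-row groups in a list as well, so every cell of the result has the same type.</p>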
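<p>If single-row groups should stay scalars rather than one-element lists, add an <code>if-else</code> to a lambda function. Grouping by the index (<code>level=0</code>) also keeps <code>CODE</code> as a column in the result:</p>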
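<pre><code>df1 = (df.set_index('CODE', drop=False)
         .rename_axis(None)
         .groupby(level=0)             # group by the index, so CODE is aggregated too
         # keep a single-row group as a scalar, otherwise build a list
         .agg(lambda x: list(x) if len(x) > 1 else x.iloc[0])
         .reset_index(drop=True))
print(df1)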
</code></pre>
<p>Output:</p>
<pre><code>       ALT_NAME        CODE         NAME
0           zzz         000         help
1  [foo 1, bar]  [111, 111]   [foo, bar]
2      [bl, bo]  [222, 222]  [bla, booo]
3           rrr         333         nyaa
</code></pre>
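<p>The conditional variant can also be sketched with a named function instead of a lambda. This is only an illustration: <code>collapse</code> and <code>out</code> are hypothetical names, and it assumes you want <code>CODE</code> kept as a plain key column rather than turned into lists.</p>

```python
import pandas as pd

df = pd.DataFrame({
    'CODE': ['000', '111', '111', '222', '222', '333'],
    'NAME': ['help', 'foo', 'bar', 'bla', 'booo', 'nyaa'],
    'ALT_NAME': ['zzz', 'foo 1', 'bar', 'bl', 'bo', 'rrr'],
})

def collapse(values):
    """Hypothetical helper: return the scalar for a single row, a list otherwise."""
    return values.iloc[0] if len(values) == 1 else list(values)

# CODE is the group key here, so reset_index() brings it back as an ordinary column.
out = df.groupby('CODE').agg(collapse).reset_index()
print(out)
```

<p>Note that the result mixes strings and lists in one column, so downstream code has to check the type of each cell.</p>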