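<p><strong>1. Aggregating one column</strong></p>
<p>I think you need <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.apply.html" rel="nofollow noreferrer"><code>GroupBy.apply</code></a> with <code>','.join</code>. To change the column order afterwards, select the columns with double brackets <code>[[]]</code>:</p>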
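<pre><code># Join the names within each city, then turn the group key back into a column
df = df1.groupby(["City"])['Name'].apply(','.join).reset_index()
# Double brackets select a list of columns, which also sets their order
df = df[['Name','City']]
print(df)
                    Name      City
0        Mallory,Mallory  Portland
1  Alice,Bob,Mallory,Bob   Seattle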
</code></pre>
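<p>Use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.transform.html" rel="nofollow noreferrer"><code>GroupBy.transform</code></a> instead when the aggregated values should go into a new column of the original DataFrame, with one value per original row.</p>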
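<p>A minimal sketch of the <code>transform</code> approach, assuming the same Name/City data used in the other examples here. The new column name <code>Names</code> is made up for illustration and is not from the original answer:</p>

```python
import pandas as pd

df1 = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
})

# transform keeps the original row count: the joined string for a city
# is repeated on every row that belongs to that city.
# "Names" is a hypothetical column name chosen for this sketch.
df1["Names"] = df1.groupby("City")["Name"].transform(",".join)
print(df1)
```

<p>Unlike <code>apply</code> followed by <code>reset_index</code>, this does not collapse the frame to one row per city, so it fits when the joined value is needed next to each original row.</p>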
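<p><strong>2. Aggregating several columns</strong></p>
<p>To aggregate more than one column, use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.DataFrameGroupBy.agg.html" rel="nofollow noreferrer"><code>DataFrameGroupBy.agg</code></a>. Either list the columns to join in <code>[]</code>, or leave the selection out to join all string columns:</p>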
<pre><code>import pandas as pd

df1 = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "Name2": ["Alice1", "Bob1", "Mallory1", "Mallory1", "Bob1", "Mallory1"],
})
print(df1)
       City     Name     Name2
0   Seattle    Alice    Alice1
1   Seattle      Bob      Bob1
2  Portland  Mallory  Mallory1
3   Seattle  Mallory  Mallory1
4   Seattle      Bob      Bob1
5  Portland  Mallory  Mallory1
# Select several columns with a list (double brackets), then join each one per city
df = df1.groupby('City')[['Name', 'Name2']].agg(','.join).reset_index()
print(df)
       City                   Name                      Name2
0  Portland        Mallory,Mallory          Mallory1,Mallory1
1   Seattle  Alice,Bob,Mallory,Bob  Alice1,Bob1,Mallory1,Bob1
</code></pre>
<p>To aggregate all columns instead:</p>
<pre><code># No column selection: every non-key column is joined, so all of them must hold strings
df = df1.groupby('City').agg(','.join).reset_index()
print(df)
       City                   Name                      Name2
0  Portland        Mallory,Mallory          Mallory1,Mallory1
1   Seattle  Alice,Bob,Mallory,Bob  Alice1,Bob1,Mallory1,Bob1
</code></pre>
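<p><code>','.join</code> only accepts strings, so joining every column breaks down once a numeric column is present. One possible workaround, sketched here under the assumption that you want numeric values joined as text too, is to cast the frame to <code>str</code> first. The variable name <code>joined</code> is illustrative only:</p>

```python
import pandas as pd

df1 = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "Numbers": [1, 5, 4, 3, 2, 1],
})

# Cast every column to str so that ','.join works on the numeric column as well
joined = df1.astype(str).groupby("City").agg(",".join).reset_index()
print(joined)
```

<p>The numbers become text after this step, so it only suits cases where a joined string is the desired result for those columns.</p>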
<hr/>
<pre><code>df1 = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "Name2": ["Alice1", "Bob1", "Mallory1", "Mallory1", "Bob1", "Mallory1"],
    "Numbers": [1, 5, 4, 3, 2, 1],
})
print(df1)
       City     Name     Name2  Numbers
0   Seattle    Alice    Alice1        1
1   Seattle      Bob      Bob1        5
2  Portland  Mallory  Mallory1        4
3   Seattle  Mallory  Mallory1        3
4   Seattle      Bob      Bob1        2
5  Portland  Mallory  Mallory1        1
# A dict maps each column to its own aggregation function
df = df1.groupby('City').agg({'Name': ','.join,
                              'Name2': ','.join,
                              'Numbers': 'max'}).reset_index()
print(df)
       City                   Name                      Name2  Numbers
0  Portland        Mallory,Mallory          Mallory1,Mallory1        4
1   Seattle  Alice,Bob,Mallory,Bob  Alice1,Bob1,Mallory1,Bob1        5
</code></pre>
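<p>The dict-based call above can also be written with named aggregation, which is available in pandas 0.25 and later and lets you pick the output column names. This is a sketch of that alternative, not part of the original answer. The output names <code>Names</code> and <code>MaxNumber</code> are hypothetical:</p>

```python
import pandas as pd

df1 = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"],
    "Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
    "Numbers": [1, 5, 4, 3, 2, 1],
})

# Each keyword is an output column: (input column, aggregation function).
# "Names" and "MaxNumber" are illustrative names chosen for this sketch.
out = (
    df1.groupby("City")
       .agg(Names=("Name", ",".join), MaxNumber=("Numbers", "max"))
       .reset_index()
)
print(out)
```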