<p>After saving the sample data as a CSV file, we can do the following:</p>
<pre><code>import pandas as pd

def grouping_Cols_by_Cols(DF, grouping_Columns, num_Columns):
    # work on a copy so the caller's DataFrame is not modified
    DF = DF.copy()
    # numerical columns can mess us up, so convert all columns' values
    # to strings (the trailing space acts as a separator)
    for column_Name in DF.columns.tolist():
        DF[column_Name] = DF[column_Name].map(str) + ' '
    # summing strings concatenates them within each group
    DF = DF.groupby(by=grouping_Columns).sum()
    # NOW, convert the numerical string columns back to numbers ...
    for num_Col in num_Columns:
        num_Col_i = DF.columns.tolist().index(num_Col)
        for i in range(len(DF)):
            String = DF[num_Col].iloc[i]
            # add up the space-separated numbers (avoids eval on file contents)
            DF.iat[i, num_Col_i] = pd.to_numeric(String.split()).sum()
    return DF

###############################################################
### Operations Section
###############################################################
df = pd.read_csv("UnCombinedData.csv")
grouping_Columns = ['ID', 'Name']
num_Columns = ['NUM']
df = grouping_Cols_by_Cols(df, grouping_Columns, num_Columns)
print(df)
</code></pre>
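<p>For comparison, here is a minimal sketch of the same idea using <code>groupby(...).agg(...)</code> instead of the string-concatenation approach above. This is a swapped-in technique, not the function from this answer. The sample rows and the <code>Note</code> column are invented for illustration, since the contents of <code>UnCombinedData.csv</code> are not shown here.</p>

```python
import pandas as pd

# Hypothetical sample standing in for UnCombinedData.csv
df = pd.DataFrame({
    'ID': [1, 1, 2],
    'Name': ['Ann', 'Ann', 'Bob'],
    'NUM': [10, 5, 7],
    'Note': ['x', 'y', 'z'],   # assumed text column, not in the original post
})

# Sum the numeric column and join the text column with spaces, per group
combined = df.groupby(['ID', 'Name']).agg({
    'NUM': 'sum',
    'Note': ' '.join,
})
print(combined)
```

<p>With this form the numeric columns never become strings, so no conversion back to numbers is needed afterwards.</p>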
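<p>With a little more work, the function defined above could automatically detect which columns contain numbers and add them to the list of numerical columns.</p>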
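<p>One possible way to do that detection is sketched below. The helper name <code>detect_num_Columns</code> and the sample data are hypothetical. It assumes the check runs on the original DataFrame, before any columns have been converted to strings, and that the grouping columns should never be treated as numerical columns.</p>

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

def detect_num_Columns(DF, grouping_Columns):
    """Hypothetical helper: names of numeric columns that are not grouping keys."""
    return [column_Name for column_Name in DF.columns
            if column_Name not in grouping_Columns
            and is_numeric_dtype(DF[column_Name])]

# Invented sample data; 'Note' is an assumed text column
df = pd.DataFrame({
    'ID': [1, 1, 2],
    'Name': ['Ann', 'Ann', 'Bob'],
    'NUM': [10, 5, 7],
    'Note': ['x', 'y', 'z'],
})

num_Columns = detect_num_Columns(df, ['ID', 'Name'])
print(num_Columns)
```

<p>Note that <code>is_numeric_dtype</code> also returns <code>True</code> for boolean columns, so those may need to be excluded separately if summing them is not wanted.</p>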
<p>I think this is similar to, but not exactly the same as, the problems and challenges encountered in <a href="https://stackoverflow.com/questions/47515586/concatenate-several-columns-across-more-than-one-row-in-pandas/47516549?noredirect=1#comment82125486_47516549">this post</a>.</p>