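<p>One idea is to convert the boolean comparison mask to integers with <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.view.html" rel="nofollow noreferrer"><code>Series.view</code></a>, store it in a <code>new</code> column, and then aggregate that column with <code>size</code> and <code>sum</code>, passing a list of tuples to name the output columns:</p>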
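<pre><code># Compare the two columns row by row and reinterpret the boolean mask as int8 (1 = match, 0 = no match).
# Series.view is deprecated since pandas 2.2; .astype('i1') yields the same 0/1 values.
df1['new'] = (df1['PredictedFeature'] == df2['PredictedFeature']).view('i1')

# size counts the rows in each group, sum counts the matching rows.
df = (df1.groupby("PredictedFeature")['new']
         .agg([('inputCsvOccured', 'size'), ('outputcsvmatched', 'sum')])
         .reset_index())
print(df)
   PredictedFeature  inputCsvOccured  outputcsvmatched
0              2000                2                 1
1              2100                3                 1
2              2200                3                 1
</code></pre>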
<p>Solution for pandas 0.25+, using named aggregation:</p>
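<pre><code>import pandas as pd

# Same 0/1 match flag as above.
df1['new'] = (df1['PredictedFeature'] == df2['PredictedFeature']).view('i1')

# Each keyword names an output column; NamedAgg picks the source column and the function.
df = (df1.groupby("PredictedFeature")
         .agg(inputCsvOccured=pd.NamedAgg(column='new', aggfunc='size'),
              outputcsvmatched=pd.NamedAgg(column='new', aggfunc='sum'))
         .reset_index())
</code></pre>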
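<p>For anyone who wants to try this end to end, here is a minimal self-contained sketch. The contents of <code>df1</code> and <code>df2</code> are made-up sample data (the real frames come from the question's CSV files), and it assumes both frames have the same length and index so the comparison lines up row by row. It also swaps <code>.view('i1')</code> for <code>.astype('i1')</code>, which should behave the same here and avoids the deprecation warning on newer pandas versions.</p>

```python
import pandas as pd

# Hypothetical sample data standing in for the input and output CSV files.
# Both frames are assumed to share the same default index (row-aligned).
df1 = pd.DataFrame({"PredictedFeature": [2000, 2000, 2100, 2100, 2100, 2200, 2200, 2200]})
df2 = pd.DataFrame({"PredictedFeature": [2000, 2100, 2100, 2000, 2200, 2200, 2100, 2000]})

# 1 where the two frames agree on a row, 0 otherwise.
# astype('i1') is used here instead of view('i1') (an assumption-free swap for 0/1 flags).
df1["new"] = (df1["PredictedFeature"] == df2["PredictedFeature"]).astype("i1")

# Named aggregation: size counts rows per group, sum counts matching rows.
result = (
    df1.groupby("PredictedFeature")
       .agg(inputCsvOccured=pd.NamedAgg(column="new", aggfunc="size"),
            outputcsvmatched=pd.NamedAgg(column="new", aggfunc="sum"))
       .reset_index()
)
print(result)
```

<p>With this sample data the groups 2000, 2100 and 2200 have 2, 3 and 3 rows, and one match each. If the two frames are not row-aligned, the element-wise comparison raises an error or compares the wrong rows, so a merge on a key column would be needed first.</p>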