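<p>Let's try <code>groupby</code> with <code>transform</code> to count how often each combination of values occurs per <code>id</code>, then use <code>sort_values</code> and <code>drop_duplicates</code> to keep the most common combination, with ties going to the latest <code>creation_date</code>.</p>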
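<pre><code># Count how often each (id, string_col_A, string_col_B) combination occurs
df['help'] = df.groupby(['id','string_col_A','string_col_B'])['string_col_A'].transform('count')
# Sort so the most frequent, most recent row is last per id, keep it, then drop the helper columns
out = df.sort_values(['help','creation_date'],na_position='first').drop_duplicates('id',keep='last').drop(columns=['help','creation_date'])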
out
Out[122]:
      id string_col_A string_col_B
3  x21ab       STR_X4       STR_Y4
5  x11aa       STR_X3       STR_Y3
0  x12ga       STR_X1       STR_Y1
</code></pre>
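<p>For a version that can be run end to end, here is a minimal sketch of the same approach. The sample <code>df</code> below is hypothetical (the question's actual data is not reproduced here), and it uses the <code>columns=</code> keyword of <code>drop</code>, since passing the axis positionally is no longer accepted in recent pandas versions.</p>

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df = pd.DataFrame({
    "id": ["x12ga", "x12ga", "x12ga", "x21ab", "x21ab", "x11aa"],
    "string_col_A": ["STR_X1", "STR_X1", "STR_X2", "STR_X4", "STR_X5", "STR_X3"],
    "string_col_B": ["STR_Y1", "STR_Y1", "STR_Y2", "STR_Y4", "STR_Y5", "STR_Y3"],
    "creation_date": pd.to_datetime([
        "2020-01-01", "2020-01-02", "2020-01-03",
        "2020-01-05", "2020-01-04", "2020-01-06",
    ]),
})

keys = ["id", "string_col_A", "string_col_B"]

# For every row, the number of rows sharing its (id, A, B) combination.
df["help"] = df.groupby(keys)["string_col_A"].transform("count")

out = (
    # Ascending sort: the most frequent combination, and within equal
    # counts the most recent creation_date, ends up last for each id.
    df.sort_values(["help", "creation_date"], na_position="first")
      # Keep only that last row per id.
      .drop_duplicates("id", keep="last")
      # Remove the helper and date columns from the result.
      .drop(columns=["help", "creation_date"])
)
print(out)
```

<p>With this sample, <code>x12ga</code> resolves to <code>STR_X1</code>/<code>STR_Y1</code> because that pair occurs twice, while <code>x21ab</code> has two pairs that each occur once, so the one with the later <code>creation_date</code> (<code>STR_X4</code>/<code>STR_Y4</code>) is kept.</p>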