<p>I would map each row to the first value it fuzzy-matches, then group on that label:</p>
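<pre><code>import numpy as np
from fuzzywuzzy import fuzz  # assumption: fuzzywuzzy (thefuzz exposes the same fuzz.ratio)

def get_similarity(df, ind, col):
    # Score every value in `col` against the value at index `ind` (0-100).
    mapped = [fuzz.ratio(x, df[col].loc[ind]) for x in df[col]]
    cond = np.array(mapped) >= 70
    # The first sufficiently similar value becomes the group label.
    label = df[col][cond].iloc[0]
    return label
</code></pre>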
<p>Use it like this:</p>
<pre><code>df.groupby(lambda x: get_similarity(df, x, 'a'))['b'].sum()
</code></pre>
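<p>To see what the grouping does without installing anything, here is a rough standard-library sketch of the same idea. It is not the pandas code above: <code>difflib.SequenceMatcher</code> stands in for <code>fuzz.ratio</code> (the scores are similar but not guaranteed identical), a list of tuples stands in for the DataFrame, and the names <code>ratio</code>, <code>group_sum</code> and the sample rows are made up for illustration.</p>

```python
from difflib import SequenceMatcher


def ratio(a, b):
    # Similarity score on a 0-100 scale; a stand-in for fuzz.ratio.
    return 100 * SequenceMatcher(None, a, b).ratio()


def group_sum(rows, threshold=70):
    """Sum each value under the first key (in row order) similar to its own key."""
    keys = [key for key, _ in rows]
    totals = {}
    for key, value in rows:
        # Every key matches itself with a score of 100, so next() always finds a label.
        label = next(k for k in keys if ratio(k, key) >= threshold)
        totals[label] = totals.get(label, 0) + value
    return totals


# Hypothetical sample data: (value of column 'a', value of column 'b').
rows = [("apple", 1), ("appel", 2), ("banana", 3), ("bananna", 4)]
print(group_sum(rows))
```

<p>Like the <code>groupby</code> version, this compares each row against the others, so the work grows roughly quadratically with the number of rows.</p>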