<p>首先创建一个组表:</p>
<pre><code>df3 = (pd.merge(df1['gems'].str.split(',\s+').explode().reset_index(),
df2.unstack().reset_index(level=0),
left_on='gems', right_on=0, how='left'
)
.pivot_table(index='index',
columns=['level_0'],
values='gems',
aggfunc=list)
)
</code></pre>
<p>输出:</p>
<pre><code>level_0 02C 1C 34C
index
0 NaN [A1K] NaN
1 [Bn2, B3] [B1] [B3]
2 NaN NaN [CD1]
5 NaN NaN [KF]
</code></pre>
<p>然后生成计数并将所有内容与原始表连接:</p>
<pre><code>pd.concat([df1,
pd.concat([df3.add_suffix('Group').applymap(lambda x: ','.join(x) if isinstance(x, list) else x),
df3.fillna('').applymap(len)],
axis=1).sort_index(axis=1)
], axis=1)
</code></pre>
<p>输出:</p>
<pre><code> values gems 02C 02CGroup 1C 1CGroup 34C 34CGroup
0 Cricket A1K, A2M, JA3, AN4 0.0 NaN 1.0 A1K 0.0 NaN
1 Soccer B1, A1, Bn2, B3 2.0 Bn2, B3 1.0 B1 1.0 B3
2 Football CD1, A1 0.0 NaN 0.0 NaN 1.0 CD1
3 Tennis KWS, KQM NaN NaN NaN NaN NaN NaN
4 Badminton JP, CVK NaN NaN NaN NaN NaN NaN
5 Chess KF, GF 0.0 NaN 0.0 NaN 1.0 KF
</code></pre>
<p>编辑:字符串联接和计数的备选方案</p>
<pre><code>df3 = (pd.merge(df1['gems'].str.split(',\s+').explode().reset_index(),
df2.unstack().reset_index(level=0),
left_on='gems', right_on=0, how='left'
)
.pivot_table(index='index',
columns=['level_0'],
values='gems',
aggfunc=', '.join)
)
pd.concat([df1,
pd.concat([df3.add_suffix('Group'),
df3.applymap(lambda x: x.count(',')+1 if isinstance(x, str) else 0)],
axis=1).sort_index(axis=1)
], axis=1)
</code></pre>