<p>I have two DataFrames:</p>
<pre><code>import pandas as pd

data = {
    'values': ['Cricket', 'Soccer', 'Football', 'Tennis', 'Badminton', 'Chess'],
    'gems': ['A1K, A2M, JA3, AN4', 'B1, A1, Bn2, B3', 'CD1, A1', 'KWS, KQM', 'JP, CVK', 'KF, GF']
}
df1 = pd.DataFrame(data)
</code></pre>
<p>df1</p>
<pre><code>      values                gems
0    Cricket  A1K, A2M, JA3, AN4
1     Soccer     B1, A1, Bn2, B3
2   Football             CD1, A1
3     Tennis            KWS, KQM
4  Badminton             JP, CVK
5      Chess              KF, GF
</code></pre>
<p>The second DataFrame:</p>
<pre><code>data2 = {
    '1C': ['B1', 'K1', 'A1K', 'J1', 'A4'],
    '02C': ['Bn2', 'B3', 'JK', 'ZZ', 'ko'],
    '34C': ['KF', 'CD1', 'B3', 'ji', 'HU']
}
df2 = pd.DataFrame(data2)
</code></pre>
<p>df2</p>
<pre><code>    1C  02C  34C
0   B1  Bn2   KF
1   K1   B3  CD1
2  A1K   JK   B3
3   J1   ZZ   ji
4   A4   ko   HU
</code></pre>
<p>For each column of <code>df2</code>, I want to check which items in <code>df1['gems']</code> appear in that column, and report both the count and the overlapping items. The expected output is:</p>
<pre><code>      values                gems  1C  1CGroup  02C  02CGroup  34C  34CGroup
0    Cricket  A1K, A2M, JA3, AN4   1      A1K    0        NA    0        NA
1     Soccer     B1, A1, Bn2, B3   1       B1    2   Bn2, B3    1        B3
2   Football             CD1, A1   0       NA    0        NA    1       CD1
3     Tennis            KWS, KQM   0       NA    0        NA    0        NA
4  Badminton             JP, CVK   0       NA    0        NA    0        NA
5      Chess              KF, GF   0       NA    0        NA    1        KF
</code></pre>
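<p>For reference, here is a minimal sketch of one way this could be computed. It is not necessarily the most efficient approach. It assumes the items in <code>gems</code> are always separated by commas (with optional spaces), that matching is exact and case-sensitive, and that the literal string <code>'NA'</code> is an acceptable placeholder for "no overlap". The names <code>gem_lists</code>, <code>wanted</code>, <code>matches</code> and <code>result</code> are made up for illustration.</p>

```python
import pandas as pd

df1 = pd.DataFrame({
    'values': ['Cricket', 'Soccer', 'Football', 'Tennis', 'Badminton', 'Chess'],
    'gems': ['A1K, A2M, JA3, AN4', 'B1, A1, Bn2, B3', 'CD1, A1', 'KWS, KQM', 'JP, CVK', 'KF, GF'],
})
df2 = pd.DataFrame({
    '1C': ['B1', 'K1', 'A1K', 'J1', 'A4'],
    '02C': ['Bn2', 'B3', 'JK', 'ZZ', 'ko'],
    '34C': ['KF', 'CD1', 'B3', 'ji', 'HU'],
})

# Split each comma-separated 'gems' string into a list of trimmed items.
gem_lists = df1['gems'].str.split(',').apply(lambda items: [g.strip() for g in items])

result = df1.copy()
for col in df2.columns:
    # Set of values in this df2 column, for fast membership tests.
    wanted = set(df2[col])
    # Items of each df1 row that also occur in this df2 column,
    # kept in the order they appear in 'gems'.
    matches = gem_lists.apply(lambda items: [g for g in items if g in wanted])
    result[col] = matches.apply(len)
    # Assumption: the string 'NA' stands in for "no overlapping items".
    result[col + 'Group'] = matches.apply(lambda m: ', '.join(m) if m else 'NA')

print(result)
```

<p>With the sample data, the Soccer row gets <code>B1</code> in <code>1CGroup</code>, since <code>B1</code> is the only Soccer item present in column <code>1C</code>. If real missing values are preferred over the <code>'NA'</code> string, the placeholder could be swapped for <code>pd.NA</code>.</p>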