<p>Thanks to <code>set</code>, you can simply use vanilla Python:</p>
<pre><code>In [129]: df
Out[129]:
  Ligand_hit Ligand_miss
0     M00001      M00005
1     M00002      M00001
2     M00003      M00007
3     M00004      M00003

In [130]: pd.concat([df, pd.Series(list(set(df['Ligand_miss'].values) - set(df['Ligand_hit'].values)))], ignore_index=True, axis=1)
Out[130]:
        0       1       2
0  M00001  M00005  M00007
1  M00002  M00001  M00005
2  M00003  M00007     NaN
3  M00004  M00003     NaN
</code></pre>
<p>Some explanation:</p>
<ul>
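<li><p><code>set(df['Ligand_miss'].values)</code> and <code>set(df['Ligand_hit'].values)</code> collect the unique values of each of the two columns.</p></li>
<li><p><code>set(...) - set(...)</code> computes the set difference, i.e. the values found in <code>Ligand_miss</code> but not in <code>Ligand_hit</code> (the "unique" values you asked for).</p></li>
<li><p><a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html" rel="nofollow noreferrer"><code>pd.concat</code></a> merges the result into the original DataFrame as a new column; because the result is shorter than the DataFrame, the remaining rows are filled with <code>NaN</code>.</p></li>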
</ul>
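<p>For reference, the steps above can be sketched as a standalone script. This is only a minimal sketch: the column name <code>Ligand_unique</code> is a hypothetical label that is not part of the original answer, and the result is sorted because a Python <code>set</code> has no guaranteed order.</p>

```python
import pandas as pd

# Same sample data as in the session above.
df = pd.DataFrame({
    "Ligand_hit": ["M00001", "M00002", "M00003", "M00004"],
    "Ligand_miss": ["M00005", "M00001", "M00007", "M00003"],
})

# Values that appear in Ligand_miss but never in Ligand_hit.
# Sets are unordered, so sort to get a reproducible column.
only_miss = sorted(set(df["Ligand_miss"]) - set(df["Ligand_hit"]))

# axis=1 aligns on the index; rows beyond the shorter Series become NaN.
# "Ligand_unique" is a hypothetical column name chosen for illustration.
result = pd.concat([df, pd.Series(only_miss, name="Ligand_unique")], axis=1)
print(result)
```

<p>Naming the Series, instead of passing <code>ignore_index=True</code>, keeps the original column labels in the combined frame.</p>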