<p>This answer deliberately takes the long way around, to illustrate how <code>groupby</code> works.</p>
<p><strong>First, create a function that tests for the string you want:</strong></p>
<pre><code>def contains_str(x, string = '_Lh'):
    # True when the substring occurs in x, False otherwise
    return string in x
</code></pre>
<p><strong>Next, iterate over your groups and apply this function:</strong></p>
<pre><code>keep_dict = {}
for label, group_df in df.groupby('col1'):
    # keep is True if any row of this group contains the string
    keep = group_df['col2'].apply(contains_str).any()
    keep_dict[label] = keep
print(keep_dict)
# {'G1': True, 'G2': False, 'G3': False, 'G4': True}
</code></pre>
<blockquote>
<p>Feel free to print individual items in the operation to understand their role.</p>
</blockquote>
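<p>To make the loop easier to follow, here is a minimal, self-contained sketch that prints the per-row flags for each group. The sample <code>df</code> is an assumption: its G1 and G4 rows match the output shown in this answer, while the G2 and G3 rows are invented for illustration.</p>

```python
import pandas as pd

# Hypothetical sample data: the G1 and G4 rows match the output in this answer;
# the G2 and G3 rows are made up, since the original df is not shown.
df = pd.DataFrame({
    'col1': ['G1', 'G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G4', 'G4', 'G4'],
    'col2': ['OP2', 'OP0', 'OPP', 'OPL_Lh', 'AB', 'CD', 'EF', 'TUI', 'TYUI', 'TR_Lh'],
})

def contains_str(x, string='_Lh'):
    # True when the substring occurs in x
    return string in x

keep_dict = {}
for label, group_df in df.groupby('col1'):
    # One boolean per row of the current group
    flags = group_df['col2'].apply(contains_str)
    print(label, flags.tolist())
    # The group is kept if at least one row matched
    keep_dict[label] = bool(flags.any())

print(keep_dict)
```

<p>Each iteration hands you the group label and the sub-frame for that label, so the dictionary ends up with one entry per distinct value of <code>col1</code>.</p>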
<p><strong>Finally, map that dictionary onto your current df:</strong></p>
<pre><code>df_final = df[df['col1'].map(keep_dict)].reset_index(drop=True)
print(df_final)
#   col1    col2
# 0   G1     OP2
# 1   G1     OP0
# 2   G1     OPP
# 3   G1  OPL_Lh
# 4   G4     TUI
# 5   G4    TYUI
# 6   G4   TR_Lh
</code></pre>
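<p>The mapping step may be clearer if you look at the intermediate boolean mask. The sketch below assumes the same hypothetical sample data (G2 and G3 rows invented) and hard-codes <code>keep_dict</code> from the output shown earlier.</p>

```python
import pandas as pd

# Hypothetical sample data; only the G1 and G4 rows come from the answer's output.
df = pd.DataFrame({
    'col1': ['G1', 'G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G4', 'G4', 'G4'],
    'col2': ['OP2', 'OP0', 'OPP', 'OPL_Lh', 'AB', 'CD', 'EF', 'TUI', 'TYUI', 'TR_Lh'],
})

# Result of the groupby loop, copied from the printed dictionary
keep_dict = {'G1': True, 'G2': False, 'G3': False, 'G4': True}

# map() looks up each row's group label in the dictionary,
# producing one True/False value per row of df
mask = df['col1'].map(keep_dict)
print(mask.tolist())

# Boolean indexing keeps only the rows whose mask value is True
df_final = df[mask].reset_index(drop=True)
print(df_final)
```

<p>Because the mask has one entry per row, whole groups are kept or dropped together.</p>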
<hr/>
<p><strong>You can condense these steps with the following code:</strong></p>
<pre><code>keep_dict = df.groupby('col1', as_index=True)['col2'].apply(lambda arr: any([contains_str(x) for x in arr])).to_dict()
print(keep_dict)
# {'G1': True, 'G2': False, 'G3': False, 'G4': True}
</code></pre>
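<p>If you do not need the intermediate dictionary, one possible further shortcut is <code>groupby(...).filter(...)</code>, which keeps every row of each group for which the function returns True. This is a sketch of an alternative, not the approach used above, and it again assumes the hypothetical sample data (G2 and G3 rows invented).</p>

```python
import pandas as pd

# Hypothetical sample data; only the G1 and G4 rows come from the answer's output.
df = pd.DataFrame({
    'col1': ['G1', 'G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G4', 'G4', 'G4'],
    'col2': ['OP2', 'OP0', 'OPP', 'OPL_Lh', 'AB', 'CD', 'EF', 'TUI', 'TYUI', 'TR_Lh'],
})

# filter() calls the function once per group and keeps the whole group
# when it returns True; regex=False treats '_Lh' as a plain substring
df_final = (
    df.groupby('col1')
      .filter(lambda g: g['col2'].str.contains('_Lh', regex=False).any())
      .reset_index(drop=True)
)
print(df_final)
```

<p>The trade-off is that you lose the explicit <code>keep_dict</code>, which is handy when you want to inspect which groups were dropped.</p>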
<blockquote>
<p>I hope this both answers your Q and explains what's taking place "behind the scenes" in groupby operations.</p>
</blockquote>