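<p>Based on your description, let's name the two columns <code>name</code> and <code>value</code>. First, build the list of names that have at least one value containing a <code>+</code> or <code>-</code> sign. Then build the list of names that have at least one value without those signs. Next, take the intersection of the two lists, i.e. the final list of names that appear in both. Finally, filter the original DataFrame to the rows whose name is in that final list:</p>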
<pre><code>import pandas as pd
import io
data = """Cluster1 NP_075076
Cluster1 AMN16433
Cluster1 YP_063711
Cluster1 KQ976470.1:66008-66163(-):Cattus_sylvestris
Cluster1 AJP07295
Cluster1 AMN15329
Cluster2 YP_00999
Cluster2 YP_00989
Cluster2 YP_00971
Cluster2 YP_00988
Cluster2 AJP07295
Cluster3 KI976478.1:66021-66123(-):Canis_lupus
Cluster3 AJP07232
Cluster3 AJP07212
Cluster3 AZ976430.1:66045-66190(+):Cavia_porsellus
Cluster4 AHHYUIIY
Cluster5 AZ976490:66042-66190(-):Felis_porsellus
Cluster5 AA976490:66021-66130(+):Felis_porsellus"""
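# Raw string, so "\s" is passed to the regex engine instead of being parsed as a string escape
df = pd.read_csv(io.StringIO(data), sep=r"\s+", header=None)
df.columns = ["name", "value"]
# Names with at least one value containing "+" or "-"
list1 = df.loc[df.value.str.contains("[+-]")].name.unique()
# Names with at least one value containing neither sign
list2 = df.loc[~df.value.str.contains("[+-]")].name.unique()
# Names that appear in both lists
final_list = set(list1).intersection(set(list2))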
>>> df.loc[df.name.isin(final_list)]
name value
0 Cluster1 NP_075076
1 Cluster1 AMN16433
2 Cluster1 YP_063711
3 Cluster1 KQ976470.1:66008-66163(-):Cattus_sylvestris
4 Cluster1 AJP07295
5 Cluster1 AMN15329
11 Cluster3 KI976478.1:66021-66123(-):Canis_lupus
12 Cluster3 AJP07232
13 Cluster3 AJP07212
14 Cluster3 AZ976430.1:66045-66190(+):Cavia_porsellus
</code></pre>
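<p>The same filter can probably also be written as a single group-wise mask instead of two lists and a set intersection. The following is only a sketch under a few assumptions: the sample data is a shortened version of the data above, the variable names <code>has_strand</code>, <code>mixed</code> and <code>kept_names</code> are made up for illustration, and the pattern <code>\([+-]\)</code> is a stricter choice of mine that matches only the parenthesised strand marker, whereas <code>[+-]</code> above also matches the hyphen inside a coordinate range such as <code>66008-66163</code>.</p>

```python
import io

import pandas as pd

# Shortened sample (assumption: same two-column, whitespace-separated layout as above)
data = """Cluster1 NP_075076
Cluster1 KQ976470.1:66008-66163(-):Cattus_sylvestris
Cluster2 YP_00999
Cluster2 AJP07295
Cluster3 KI976478.1:66021-66123(-):Canis_lupus
Cluster3 AJP07232
Cluster4 AHHYUIIY
Cluster5 AZ976490:66042-66190(-):Felis_porsellus
Cluster5 AA976490:66021-66130(+):Felis_porsellus"""

df = pd.read_csv(io.StringIO(data), sep=r"\s+", header=None, names=["name", "value"])

# True for rows whose value carries a "(+)" or "(-)" strand marker
has_strand = df["value"].str.contains(r"\([+-]\)")

# Per name: at least one row has a marker, but not every row does
grouped = has_strand.groupby(df["name"])
mixed = grouped.transform("any") & ~grouped.transform("all")

result = df.loc[mixed]
kept_names = sorted(result["name"].unique())
print(kept_names)
```

<p>With this sample, only <code>Cluster1</code> and <code>Cluster3</code> mix both kinds of value, so only their rows should remain. The row order and the original index are preserved, as in the <code>isin</code> version.</p>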