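<p>You can do this easily with <code>pandas</code>.</p>
<p>First, split each string on the <code>|</code> delimiter into a list, then keep only the rows whose list has more than two entries, since the requirement is Baseball plus two other sports. Rows with fewer entries end up as empty strings.</p>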
<pre><code>In [4]: df['Gym'] = df['Gym'].str.split('|').apply(lambda x: ' '.join([i for i in x if len(x)>2]))
In [5]: df
Out[5]:
Total Gym
0 40 Football Baseball Hockey Running Basketball Sw...
1 37
2 61 Basketball Baseball Ballet
3 12 Swimming Ballet Cycling Basketball Volleyball ...
4 78
5 29 Baseball Tennis Ballet Cycling Basketball Foot...
6 31
7 54 Tennis Football Ballet Cycling Running Swimmin...
8 33 Baseball Hockey Swimming Cycling
9 17 Football Hockey Volleyball
</code></pre>
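<p>To see what the split-and-filter step does in isolation, here is a minimal sketch on a made-up three-row Series (the values are hypothetical, not the data from the question). Note that the condition tests the length of the list, so a row is either kept whole or blanked out.</p>

```python
import pandas as pd

# Hypothetical sample values, only for illustration.
gym = pd.Series(["Football|Baseball|Hockey", "Tennis|Golf", "Baseball"])

# Split each string into a list of sports.
parts = gym.str.split("|")

# Keep the row (joined with spaces) only if it lists more than two sports.
kept = parts.apply(lambda x: " ".join(x) if len(x) > 2 else "")

print(kept.tolist())  # ['Football Baseball Hockey', '', '']
```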
<p>Use <code>str.contains</code> to search the <code>Gym</code> column for the string <code>Baseball</code> and keep only the matching rows.</p>
<pre><code>In [6]: df = df.loc[df['Gym'].str.contains('Baseball')]
In [7]: df
Out[7]:
Total Gym
0 40 Football Baseball Hockey Running Basketball Sw...
2 61 Basketball Baseball Ballet
3 12 Swimming Ballet Cycling Basketball Volleyball ...
5 29 Baseball Tennis Ballet Cycling Basketball Foot...
7 54 Tennis Football Ballet Cycling Running Swimmin...
8 33 Baseball Hockey Swimming Cycling
</code></pre>
<p>Count the number of sports in each remaining row.</p>
<pre><code>In [8]: df['Count'] = df['Gym'].str.split().str.len()
</code></pre>
<p>Then select the row of the DataFrame that holds the maximum value in the <code>Total</code> column.</p>
<pre><code>In [9]: df.loc[df['Total'].idxmax()]
Out[9]:
Total 61
Gym Basketball Baseball Ballet
Count 3
Name: 2, dtype: object
</code></pre>
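<p>The steps above can also be combined into one short script. This is only a sketch under assumptions: the sample DataFrame is invented, and it differs from the session above in two ways. It tests for <code>Baseball</code> as an exact list entry rather than a substring, and it works on an explicit <code>.copy()</code> so that adding the <code>Count</code> column does not modify a slice of the original frame.</p>

```python
import pandas as pd

# Hypothetical sample data; the original DataFrame is not shown in the answer.
df = pd.DataFrame({
    "Total": [40, 37, 61, 12],
    "Gym": [
        "Football|Baseball|Hockey|Running",
        "Tennis|Golf",
        "Basketball|Baseball|Ballet",
        "Swimming|Ballet|Cycling",
    ],
})

# Split each entry into a list of sports.
sports = df["Gym"].str.split("|")

# Keep rows that list Baseball and more than two sports in total.
mask = sports.apply(lambda x: len(x) > 2 and "Baseball" in x)

# Copy the filtered rows so the new column is added to an independent frame.
result = df.loc[mask].copy()
result["Count"] = sports[mask].str.len()

# Row with the largest Total among the qualifying rows.
best = result.loc[result["Total"].idxmax()]
print(best)
```

<p>With this sample data, <code>best</code> is the row with index 2, which has a <code>Total</code> of 61 and a <code>Count</code> of 3.</p>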