<p>Here is a solution using <code>itertools.chain</code> and <code>collections.Counter</code>:</p>
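<pre><code>import pandas as pd
from collections import Counter
from itertools import chain

s = pd.Series(['This is an example #tag1',
               'This too is an example #tag1 #tag2',
               'Yup, still an example #tag1 #tag1 #tag3'])

# Build one set of tags per row (the leading '#' is stripped).
# Using a set means a tag repeated within a row is counted only once.
tags = s.map(lambda x: {i[1:] for i in x.split() if i.startswith('#')})

# Flatten the per-row sets and count how many rows contain each tag.
res = Counter(chain.from_iterable(tags))
print(res)
# Counter({'tag1': 3, 'tag2': 1, 'tag3': 1})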
</code></pre>
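<p>Note that the set comprehension counts each tag at most once per row. If every occurrence should be counted instead, a list can be used in place of the set. The sketch below illustrates the difference; <code>count_tags</code> and its <code>per_row_unique</code> parameter are hypothetical names introduced here for illustration, not part of pandas.</p>

```python
import pandas as pd
from collections import Counter
from itertools import chain

s = pd.Series(['This is an example #tag1',
               'This too is an example #tag1 #tag2',
               'Yup, still an example #tag1 #tag1 #tag3'])

def count_tags(series, per_row_unique=True):
    """Count hashtags in a Series of strings (hypothetical helper).

    per_row_unique=True  -> a tag repeated within one row counts once
    per_row_unique=False -> every occurrence is counted
    """
    container = set if per_row_unique else list
    tags = (container(w[1:] for w in text.split() if w.startswith('#'))
            for text in series)
    return Counter(chain.from_iterable(tags))

unique_counts = count_tags(s)                      # rows containing each tag
all_counts = count_tags(s, per_row_unique=False)   # total occurrences

print(unique_counts['tag1'], all_counts['tag1'])
```

<p>The two results differ only for <code>tag1</code>, which appears twice in the third row.</p>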
<p><strong>Performance benchmarking</strong></p>
<p>For a large series, <code>collections.Counter</code> is about 2x faster than <code>pd.Series.str.extractall</code>.</p>
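<p>The original benchmark code is not available here, so the following is only a rough sketch of how such a comparison might be set up. The function names, the regular expression <code>#(?P&lt;tag&gt;\w+)</code>, the series size, and the repeat count are all assumptions made for illustration, and timings will vary with the machine and the pandas version.</p>

```python
import timeit
from collections import Counter
from itertools import chain

import pandas as pd

# Assumed test data: the three sample strings repeated to form a larger series.
s = pd.Series(['This is an example #tag1',
               'This too is an example #tag1 #tag2',
               'Yup, still an example #tag1 #tag1 #tag3'] * 1000)

def counter_approach(series):
    # Same logic as the answer: one set of tags per row, then count.
    tags = series.map(lambda x: {i[1:] for i in x.split() if i.startswith('#')})
    return dict(Counter(chain.from_iterable(tags)))

def extractall_approach(series):
    # One row per regex match; the first index level is the original row label.
    found = series.str.extractall(r'#(?P<tag>\w+)')
    rows = found.index.get_level_values(0)
    # Drop tags repeated within a row so both approaches count the same thing.
    unique = found.assign(row=rows).drop_duplicates()
    return unique['tag'].value_counts().to_dict()

res_counter = counter_approach(s)
res_extract = extractall_approach(s)

t_counter = timeit.timeit(lambda: counter_approach(s), number=5)
t_extract = timeit.timeit(lambda: extractall_approach(s), number=5)
print(f'Counter:    {t_counter:.4f} s')
print(f'extractall: {t_extract:.4f} s')
```

<p>Both functions return the same counts on this data, so any difference in timing reflects the method rather than the result.</p>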