<p>You can use <code>itertools.chain</code> to flatten the lists in the "bigrams" column into a single sequence, then count the items with <code>pd.value_counts</code>.</p>
<pre><code>import pandas as pd

# Sample data: each row holds a list of bigram strings
df = pd.DataFrame({'bigrams': [['(a, b)', '(c, d)'], ['(a, b)'], ['(a, b)', '(e, f)']]})
df
#             bigrams
# 0  [(a, b), (c, d)]
# 1          [(a, b)]
# 2  [(a, b), (e, f)]

pd.__version__
# '0.24.1'
</code></pre>
<p/>
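<pre><code>from itertools import chain

n = 2  # number of most frequent bigrams to return
# Flatten the per-row lists, count each bigram, and keep the n most common labels
pd.value_counts(list(chain.from_iterable(df['bigrams']))).index[:n].tolist()
# ['(a, b)', '(e, f)']
# Note: '(c, d)' and '(e, f)' both occur once, so which one comes second may vary.
</code></pre>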
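<p>If you also want the counts, not just the labels, the same idea can be wrapped in a small function. This is only a sketch: <code>top_bigrams</code> is a hypothetical helper name, not part of pandas, and it calls <code>Series.value_counts</code> rather than the top-level <code>pd.value_counts</code>, which, as far as I know, emits a deprecation warning in recent pandas releases.</p>

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({'bigrams': [['(a, b)', '(c, d)'], ['(a, b)'], ['(a, b)', '(e, f)']]})


def top_bigrams(frame, n=2, column='bigrams'):
    """Hypothetical helper: n most frequent items in a list-valued column."""
    # Flatten the per-row lists into one flat Series
    flat = pd.Series(list(chain.from_iterable(frame[column])))
    # value_counts sorts by frequency in descending order by default
    return flat.value_counts().head(n)


top = top_bigrams(df, n=2)
print(top)
```

<p>The result is a Series indexed by bigram, so <code>top.index.tolist()</code> gives the labels and <code>top.tolist()</code> gives the counts. Here '(a, b)' comes first with a count of 3; the second entry is one of the two bigrams that occur once.</p>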