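<p>The standard library has a tool for counting objects: <a href="https://docs.python.org/2/library/collections.html#collections.Counter" rel="nofollow"><code>collections.Counter</code></a>.
With the help of the <a href="https://docs.python.org/2/library/itertools.html#recipes" rel="nofollow"><code>pairwise</code> recipe</a> from the <code>itertools</code> documentation, a bigram counter script can look like this:</p>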
<pre><code>from collections import Counter
from itertools import tee

# Function from the "recipes" section of the standard itertools documentation.
# Written for Python 3; on Python 2, use itertools.izip instead of zip.
def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)

text = [{'ideology': 3.4, 'ID': '50555',
         'reviews': 'Politician from CA-21, very liberal and aggressive'},
        {'ideology': 1.5, 'ID': '10223',
         'reviews': 'Retired politician'}]

c = Counter()
for l in text:
    # Count every pair of adjacent whitespace-separated words in the review.
    c.update(pairwise(l['reviews'].split()))
print(c.items())
</code></pre>