<p>Iterating over an <code>nltk.Text</code> object yields strings, one per token. If you want to apply the same operation to every string in the sequence, <code>map()</code> is probably a good idea.</p>
<pre><code>>>> from nltk.book import *
>>> text1_lowered = list(map(str.lower, text1))
>>> text1_lowered.count('whale')
1226
>>> text1_lowered.count('Whale')
0
>>> text1.count('Whale') + text1.count('whale')
1188
</code></pre>
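<p>If you need case-insensitive counts for more than one word, a single pass with <code>collections.Counter</code> avoids rescanning the list on every <code>.count()</code> call. This is a minimal sketch: the <code>tokens</code> list is a hypothetical stand-in for <code>text1</code>, so it runs without downloading the NLTK corpora.</p>

```python
from collections import Counter

# Hypothetical stand-in for text1: any iterable of word strings works here.
tokens = ["Whale", "whale", "WHALE", "the", "The", "whale"]

# Lowercase every token once, then tally all words in a single pass.
lowered_counts = Counter(map(str.lower, tokens))

print(lowered_counts["whale"])  # 4
print(lowered_counts["the"])    # 2
```

<p>With the real corpus you would pass <code>text1</code> in place of <code>tokens</code>.</p>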
<p>To solve the mystery of where the other "whale" tokens come from, that is, why the lowercased count is 1226 rather than 1188:</p>
<pre><code>>>> from collections import Counter
>>> Counter([word for word in text1 if word.lower() == 'whale'])
Counter({'whale': 906, 'Whale': 282, 'WHALE': 38})
</code></pre>
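<p>The same breakdown can be collected for every word at once by grouping the original spellings under their lowercase form. The sketch below assumes a hypothetical <code>tokens</code> list in place of <code>text1</code>, and the name <code>variants</code> is only illustrative.</p>

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for text1.
tokens = ["Whale", "whale", "WHALE", "the", "The", "whale"]

# Map each lowercase form to a Counter of the spellings actually seen.
variants = defaultdict(Counter)
for token in tokens:
    variants[token.lower()][token] += 1

print(variants["whale"])                # per-spelling counts for "whale"
print(sum(variants["whale"].values()))  # 4, the case-insensitive total
```

<p>Summing the values for one key gives the case-insensitive total, so any gap like 1188 versus 1226 shows up directly as an unexpected spelling in the inner <code>Counter</code>.</p>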
<hr/>
<p>As for @axiom's idea of generating every possible upper/lower-case combination of "whale", see <a href="https://stackoverflow.com/questions/20063721/string-manipulation-in-python-all-upper-and-lower-case-derivatives-of-a-word">String manipulation in Python (All upper and lower case derivatives of a word)</a>:</p>
<pre><code>>>> from itertools import product
>>> cRaZySpe3K = lambda s: [''.join(x) for x in product(*[{c.upper(), c} for c in s.lower()])]
>>> cRaZySpe3K('whale')
['WHALe', 'WHALE', 'WHAle', 'WHAlE', 'WHaLe', 'WHaLE', 'WHale', 'WHalE', 'WhALe', 'WhALE', 'WhAle', 'WhAlE', 'WhaLe', 'WhaLE', 'Whale', 'WhalE', 'wHALe', 'wHALE', 'wHAle', 'wHAlE', 'wHaLe', 'wHaLE', 'wHale', 'wHalE', 'whALe', 'whALE', 'whAle', 'whAlE', 'whaLe', 'whaLE', 'whale', 'whalE']
>>> {whale:text1.count(whale) for whale in cRaZySpe3K('whale')}
{'WHALe': 0, 'WHALE': 38, 'WHAle': 0, 'WHAlE': 0, 'WHaLe': 0, 'WHaLE': 0, 'WHale': 0, 'WHalE': 0, 'WhALe': 0, 'WhALE': 0, 'WhAle': 0, 'WhAlE': 0, 'WhaLe': 0, 'WhaLE': 0, 'Whale': 282, 'WhalE': 0, 'wHALe': 0, 'wHALE': 0, 'wHAle': 0, 'wHAlE': 0, 'wHaLe': 0, 'wHaLE': 0, 'wHale': 0, 'wHalE': 0, 'whALe': 0, 'whALE': 0, 'whAle': 0, 'whAlE': 0, 'whaLe': 0, 'whaLE': 0, 'whale': 906, 'whalE': 0}
</code></pre>
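<p>The lambda above builds each character's choices from a set, so the order of the generated variants is not guaranteed to be the same between runs. If a stable order matters, one possible rewrite as a named function is sketched below; <code>case_variants</code> is a hypothetical name, and the sketch assumes every character maps to a single character under <code>upper()</code> and <code>lower()</code>.</p>

```python
from itertools import product

def case_variants(word):
    """Return every upper/lower-case spelling of word in a stable order."""
    # sorted(set) gives a fixed order per position and collapses characters
    # that have no case distinction (digits, punctuation) to one choice.
    choices = [sorted({c.lower(), c.upper()}) for c in word]
    return ["".join(combo) for combo in product(*choices)]

variants = case_variants("whale")
print(len(variants))              # 32, i.e. 2 ** 5
print(variants[0], variants[-1])  # WHALE whale
```

<p>The number of variants doubles with each letter, so for long words it is cheaper to lowercase the tokens once and count, as in the first snippet of this answer.</p>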