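<p>I agree with @bereal that you should use <code>Counter</code>. I know you said you don't want "imports, dict or zips", so feel free to ignore this answer. However, one of Python's major advantages is its standard library: whenever you have <code>list</code> available, you also have <code>dict</code>, <code>collections.Counter</code> and <code>re</code>.</p>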
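<p>Your code gives me the impression that you want to write Python in the same style you would use in C or Java. I suggest you embrace Python's own idioms a bit more. Code written that way may look unfamiliar at first and takes time to get used to, but you will learn more from it.</p>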
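<p>To illustrate the difference in style, here is a minimal sketch that counts words twice: once with C/Java-like index bookkeeping over parallel lists, and once with <code>collections.Counter</code>. The names <code>words</code>, <code>seen</code> and <code>counts</code> are made up for this illustration and are not taken from your code.</p>

```python
from collections import Counter

# Hypothetical sample input, for illustration only
words = ["a", "b", "a", "c", "a", "b"]

# C/Java-like style: two parallel lists and explicit index handling
seen = []    # distinct words, in order of first appearance
counts = []  # counts[i] is the number of occurrences of seen[i]
for i in range(len(words)):
    word = words[i]
    if word in seen:
        counts[seen.index(word)] += 1
    else:
        seen.append(word)
        counts.append(1)

# Pythonic style: Counter does the bookkeeping
counter = Counter(words)

print(counts[seen.index("a")], counter["a"])
```

<p>Both variants produce the same counts; the second one is shorter and also avoids the repeated linear searches through <code>seen</code>.</p>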
<p>It would help to know what you are trying to achieve. Are you learning Python? Are you solving this specific problem? Why can't you use imports, dict or zips?</p>
<p>So here is a suggestion that uses only built-in functionality (nothing third-party), tested with Python 2:</p>
<pre><code>#!/usr/bin/python
import re # String matching
import collections # collections.Counter basically solves your problem
def loadwords(s):
    """Find the words in a long string.

    Words are separated by whitespace. Common punctuation (. , ! ?) is
    replaced by spaces before splitting.
    """
    return (s
            .replace(".", " ")
            .replace(",", " ")
            .replace("!", " ")
            .replace("?", " ")
            .lower()).split()
def loadwords_re(s):
    """Find the words in a long string.

    Words are separated by whitespace. Only the letters a-z and ' are
    kept; every other character is replaced by a space.
    """
    return (re.sub(r"[^a-z']", " ", s.lower())
            .split())
# You may want to read this from a file instead
sourcefile_words = loadwords_re("""this is a sentence. This is another sentence.
Let's write many sentences here.
Here comes another sentence.
And another one.
In English, we use plenty of "a" and "the". A whole lot, actually.
""")
# Sets are really fast for answering the question: "is this element in the set?"
# You may want to read this from a file instead
keywords = set(loadwords_re("""
of and a i the
"""))
# Count every word in sourcefile_words (the keyword set is not used here)
wordcount_all = collections.Counter(sourcefile_words)
# Look up word counts like this (Counter is a dict subclass)
count_this = wordcount_all["this"] # returns 2
count_a = wordcount_all["a"] # returns 3
# Only count words that are in the keywords set
wordcount_keywords = collections.Counter(word
                                         for word in sourcefile_words
                                         if word in keywords)
count_and = wordcount_keywords["and"] # returns 2
# The counted keywords are 'a', 'and', 'of' and 'the'; their order is not
# guaranteed (keys() gives a list in Python 2, a view in Python 3)
all_counted_keywords = wordcount_keywords.keys()
</code></pre>
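<p>If you also want the keywords ordered by frequency, <code>Counter.most_common()</code> does that. The following is only a sketch: the helper name <code>count_keywords</code> and the sample sentence are made up for this example, and the tokenizing is the same regular expression as in <code>loadwords_re</code> above.</p>

```python
import collections
import re


def count_keywords(text, keywords):
    """Return (word, count) pairs for the keywords found in text.

    Hypothetical helper for illustration; pairs are sorted from most
    frequent to least frequent.
    """
    # Same tokenizing as loadwords_re: keep only a-z and ', split on whitespace
    words = re.sub(r"[^a-z']", " ", text.lower()).split()
    counter = collections.Counter(w for w in words if w in keywords)
    return counter.most_common()


ranking = count_keywords("The cat and the dog and the bird.",
                         {"the", "and", "a"})
print(ranking)  # [('the', 3), ('and', 2)]
```

<p>Keywords that never occur in the text (here <code>a</code>) simply do not show up in the result.</p>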