Counting words in a text file with Python

    def repeatedWords():
        with open(fname) as f:
            wordcount = {}
            for word in word_list:
                for word in f.read().split():
                    if word not in wordcount:
                        wordcount[word] = 1
                    else:
                        wordcount[word] += 1
        for k, v in wordcount.items():
            print k, v

    word_list = ['Emma', 'Woodhouse', 'father', 'Taylor', 'Miss', 'been', 'she', 'her']
    repeatedWords('file.txt')

2 answers

User

#1 · edited 2024-10-01 09:27:02

The best way to handle this is to use the get method of Python dictionaries. It could look like this:

def repeatedWords(fname):
    wordcount = {}
    # Example list of words that should not be counted
    nonwordlist = ['father', 'Miss', 'been']
    with open(fname) as f:
        # Read the file once and split it on whitespace
        for word in f.read().split():
            if word not in nonwordlist:
                wordcount[word] = wordcount.get(word, 0) + 1
    return wordcount


# Put these outside the function repeatedWords
wordcount = repeatedWords('file.txt')
for k, v in wordcount.items():
    print(k, v)

The print statement should then output each word followed by its count, one pair per line.

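What the line wordcount[word] = wordcount.get(word, 0) + 1 does is first look up word in the dictionary wordcount. If the word is already there, it gets the current value and adds 1 to it. If the word is not there, the value defaults to 0 and 1 is added, which records the first occurrence of that word with a count of 1.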
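A small sketch of that behaviour on an in-memory list, so it can be tried without a file. The helper name count_words and the sample words are made up for illustration and are not part of the original answer.

```python
def count_words(words):
    """Count occurrences of each word using dict.get with a default of 0."""
    wordcount = {}
    for word in words:
        # get() returns the current count, or 0 if the word is not a key yet
        wordcount[word] = wordcount.get(word, 0) + 1
    return wordcount


# Hypothetical sample data, standing in for f.read().split()
sample = ['Emma', 'Woodhouse', 'Emma', 'her', 'Emma', 'her']
counts = count_words(sample)
print(counts)
```

The standard library's collections.Counter(sample) would give the same counts; the get form is shown here because it is the one the answer describes.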

User

#2 · edited 2024-10-01 09:27:02

So you only need the frequency of the specific words in the list (Emma, Woodhouse, father)? If so, this code might help (try running it):

    word_list = ['Emma','Woodhouse','father','Taylor','Miss','been','she','her']
    # I'm using this example text in place of the file you are using
    text = 'This is an example text. It will contain words you are looking for, like Emma, Emma, Emma, Woodhouse, Woodhouse, Father, Father, Taylor,Miss,been,she,her,her,her. I made them repeat to show that the code works.'
    text = text.replace(',',' ') #these statements remove irrelevant punctuation
    text = text.replace('.','')
    text = text.lower() #this makes all the words lowercase, so that capitalization won't affect the frequency measurement

    for repeatedword in word_list:
        counter = 0 #counter starts at 0
        for word in text.split():
            if repeatedword.lower() == word:
                counter = counter + 1 #add 1 every time there is a match in the list
        print(repeatedword,':', counter) #prints the word from 'word_list' and its frequency

The output shows only the frequencies of the words in the list you provided, which is what you wanted, right?

The output produced when run in Python 3 is:

    Emma : 3
    Woodhouse : 2
    father : 2
    Taylor : 1
    Miss : 1
    been : 1
    she : 1
    her : 3
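The nested loop in this answer rescans the whole text once for every entry in word_list, and it only strips commas and periods. As a possible alternative (a sketch, not the answerer's code), the text can be tokenised once and counted with collections.Counter. The function name count_listed_words, the sample sentence, and the regular expression that decides what counts as a word are all assumptions made for this example.

```python
import re
from collections import Counter


def count_listed_words(text, word_list):
    """Return {word: frequency} for each word in word_list, ignoring case."""
    # Assumption: a "word" is a run of letters or apostrophes; anything else
    # (commas, periods, semicolons, digits, whitespace) separates words.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for missing keys, so absent words are reported as 0
    return {word: counts[word.lower()] for word in word_list}


word_list = ['Emma', 'Woodhouse', 'father', 'Taylor', 'Miss', 'been', 'she', 'her']
# Hypothetical sample text, standing in for the contents of the file
text = 'Emma, Emma, Woodhouse. Father and her father; she has been here.'
result = count_listed_words(text, word_list)
for word, freq in result.items():
    print(word, ':', freq)
```

Because whole tokens are compared, 'here' is not counted as 'her'. To use this with a file, the text argument could be replaced by the result of f.read() inside a with open(...) block.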
