Finding words that appear only once in a file

Posted 2024-10-01 15:37:04


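I am trying to retrieve only the unique words in a file (words that occur exactly once). This is what I have so far, but is there a better way to do this in Python in terms of big-O complexity? Right now I believe it is n squared.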

def retHapax():
    file = open("myfile.txt")
    myMap = {}
    uniqueMap = {}
    for i in file:
        myList = i.split(' ')
        for j in myList:
            j = j.rstrip()
            if j in myMap:
                # pop() with a default avoids a KeyError on the third and later occurrences
                uniqueMap.pop(j, None)
            else:
                myMap[j] = 1
                uniqueMap[j] = 1
    file.close()
    print(uniqueMap)

3 answers

If you want to find all unique words and treat foo and foo. as the same word, you need to strip the punctuation:

from collections import Counter
from string import punctuation

with open("myfile.txt") as f:
    word_counts = Counter(word.strip(punctuation) for line in f for word in line.split())

print([word for word, count in word_counts.items() if count == 1])

If you also want to ignore case, use line.lower() as well. If you need the unique words to be exact, splitting lines on whitespace alone is not enough.
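To illustrate the point about case and tokenization, here is a minimal sketch that lowercases each line and extracts words with a regular expression instead of splitting on whitespace. The function name hapax_words, the sample lines, and the pattern [a-z0-9']+ are assumptions made for this example, not part of the answer above; what counts as a "word" depends on your data.

```python
import re
from collections import Counter


def hapax_words(lines):
    """Return the words that occur exactly once, ignoring case and punctuation."""
    # Assumed definition of a word: a run of letters, digits or apostrophes.
    word_pattern = re.compile(r"[a-z0-9']+")
    counts = Counter(
        word
        for line in lines
        for word in word_pattern.findall(line.lower())
    )
    return sorted(word for word, count in counts.items() if count == 1)


# Hypothetical input; "Foo", "foo." and "bar," are normalized before counting.
sample = ["Foo bar, foo.", "Baz bar qux!"]
print(hapax_words(sample))  # ['baz', 'qux']
```

Because an open file is iterated line by line, the same function should also accept a file object, for example hapax_words(open("myfile.txt")).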

Try this to find the unique words in a file, using Counter:

from collections import Counter

with open("myfile.txt") as input_file:
    word_counts = Counter(word for line in input_file for word in line.split())

# List of unique words (words that appear exactly once)
unique_words = [word for word, count in word_counts.items() if count == 1]
print(unique_words)

I would use the collections.Counter approach, but if you only want to use sets, you can do it like this:

with open('myfile.txt') as input_file:
    all_words = set()
    dupes = set() 
    for word in (word for line in input_file for word in line.split()):
        if word in all_words:
            dupes.add(word)
        all_words.add(word)

    unique = all_words - dupes

Given the input:

^{pr2}$

The output is:

{'five', 'one', 'six'}
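For readers who want to try the set-based idea without a file, here is a small self-contained sketch. The function name unique_words and the sample lines are made up for this example and are not the input used in the answer. Each set membership test takes constant time on average, so the whole pass should be linear in the number of words rather than quadratic.

```python
def unique_words(lines):
    """Return the set of whitespace-separated words that occur exactly once."""
    seen = set()   # every word encountered so far
    dupes = set()  # words encountered more than once
    for line in lines:
        for word in line.split():
            if word in seen:
                dupes.add(word)
            else:
                seen.add(word)
    return seen - dupes


# Hypothetical input: "red" and "green" repeat, "blue" and "yellow" do not.
sample_lines = ["red green blue", "green red yellow"]
print(sorted(unique_words(sample_lines)))  # ['blue', 'yellow']
```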
