Inefficient function loads on short text but fails to load when given a very long text to analyze

2 Answers

User

#1 · Edited 2024-09-29 21:36:33

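list.count is O(n) in the length of the list. Calling an O(n) operation inside a loop is especially inefficient: the overall cost is at least O(m*n), where m is the number of unique words.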
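The question's original code is not shown here, so the following is only a hypothetical reconstruction of the slow pattern being described. The function name `top_words_slow` and its parameters are made up for illustration. It calls `list.count` once per unique word, and each call rescans the whole list.

```python
def top_words_slow(text, k=5):
    """Hypothetical sketch of the slow approach: one list.count call per unique word."""
    words = text.split()
    # Each words.count(w) scans all n words, and it runs m times
    # (once per unique word), so the total work is O(m * n).
    counts = [(w, words.count(w)) for w in set(words)]
    # Sort by count, highest first, and keep the top k entries.
    counts.sort(key=lambda pair: pair[1], reverse=True)
    return counts[:k]


result = top_words_slow("a b a c a b", k=2)
print(result)
```

On a short string this returns immediately, but on a long text with many distinct words the repeated scans dominate the running time.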

Instead, you can use collections.Counter for an O(n) solution:

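from collections import Counter

words = 'this is a test string of words containing repeated words within the string'

# Count every word in a single pass over the list
c = Counter(words.split())

# The five most frequent words, highest count first
res = c.most_common(5)
print(res)
# [('string', 2), ('words', 2), ('this', 1), ('is', 1), ('a', 1)]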

User

#2 · Edited 2024-09-29 21:36:33

To count how often each word occurs in a file, use Counter:

from collections import Counter

# The with block closes the file once it has been read
with open("file.txt", "r") as f:
    words = Counter(f.read().split())

This gives you a dictionary-like result (Counter is a dict subclass), with the words as keys and their counts as values.
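As a small illustration of that dictionary-like behaviour, here is a sketch that uses an in-memory sample string instead of file.txt. The sample text and variable names are assumptions made for the example.

```python
from collections import Counter

# Sample text standing in for the contents of a file (illustrative only)
counts = Counter("the cat and the hat".split())

the_count = counts["the"]   # regular dict-style lookup
missing = counts["dog"]     # a missing key yields 0 instead of raising KeyError
as_dict = dict(counts)      # convert to a plain dict if one is needed

print(the_count, missing, as_dict)
```

Looking up a missing word does not add it to the Counter, so probing for words leaves the counts unchanged.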

If you don't want to import anything, I suggest:

with open("file.txt", "r") as f:
    text = f.read()

count = {}
for eachword in text.split():
    # First occurrence starts at 1; later occurrences increment
    if eachword not in count:
        count[eachword] = 1
    else:
        count[eachword] += 1

As Nearo suggested, you can avoid the if/else like this:

with open("file.txt", "r") as f:
    text = f.read()

count = {}
for eachword in text.split():
    # get() returns 0 for a word that has not been seen yet
    count[eachword] = count.get(eachword, 0) + 1
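Since the question is about very long texts, one possible variation is to count line by line rather than reading the whole file with f.read(). This is only a sketch under that assumption, not something either answer proposes. The helper name `count_words_by_line` is hypothetical; it accepts any iterable of lines, which includes an open file object.

```python
from collections import Counter


def count_words_by_line(lines):
    """Hypothetical helper: accumulate word counts one line at a time."""
    counts = Counter()
    for line in lines:
        # update() adds the counts for this line's words to the running totals
        counts.update(line.split())
    return counts


# A list of strings stands in for the lines of a file here
sample_lines = ["a b a\n", "b c\n"]
totals = count_words_by_line(sample_lines)
print(totals.most_common(2))
```

To use it on a file, open the file in a with block and pass the file object itself as `lines`. Because split() treats newlines as whitespace, the counts match what f.read().split() would give, while only one line is held in memory at a time.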
