Counting the words in each string of a given list

Posted 2024-06-01 12:07:34

I have a given list of strings:

strings = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems", "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]

I want to count how often each word occurs in each sentence, so that my output is:

{'the': 2, 'method': 1, 'of': 1, 'lagrange': 1, 'multipliers': 1, 'is': 1, 'economists': 1, 'workhorse': 1, 'for': 1, 'solving': 1, 'optimization': 1, 'problems': 1}

{'the': 1, 'technique': 1, 'is': 1, 'a': 1, 'centerpiece': 1, 'of': 1, 'economic': 1, 'theory': 1, 'but': 1, 'unfortunately': 1, 'its': 1, 'usually': 1, 'taught': 1, 'poorly': 1}

My code is as follows:

from collections import Counter

dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
           "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]


for index,row in enumerate(dataset):
    word_frequency = dict(Counter(row.split(" ")))
    
print(word_frequency)

With this, the output I get is:

{'the': 1, 'technique': 1, 'is': 1, 'a': 1, 'centerpiece': 1, 'of': 1, 'economic': 1, 'theory': 1, 'but': 1, 'unfortunately': 1, 'its': 1, 'usually': 1, 'taught': 1, 'poorly': 1}

Clearly, this only counts the words of the second sentence and ignores the first.

Can someone help me understand the mistake in my code?


3 answers

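Print inside the loop instead of only assigning a variable inside it. As written, the variable is overwritten on every iteration, and the single print after the loop only sees its last value: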

>>> dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
...            "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]
>>> from collections import Counter
>>> for sentence in dataset:
...     print(dict(Counter(sentence.split())))
...
{'the': 2, 'method': 1, 'of': 1, 'lagrange': 1, 'multipliers': 1, 'is': 1, 'economists': 1, 'workhorse': 1, 'for': 1, 'solving': 1, 'optimization': 1, 'problems': 1}
{'the': 1, 'technique': 1, 'is': 1, 'a': 1, 'centerpiece': 1, 'of': 1, 'economic': 1, 'theory': 1, 'but': 1, 'unfortunately': 1, 'its': 1, 'usually': 1, 'taught': 1, 'poorly': 1}
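Note that the session above calls `split()` with no argument, whereas the question uses `split(" ")`. For the two sentences in the question both give the same result, but they differ when the text contains consecutive spaces. A minimal sketch, assuming a made-up input string with doubled spaces (the `text` value is not from the question):

```python
from collections import Counter

# Hypothetical input with doubled spaces, for illustration only
text = "the  method of  the"

# split(" ") splits on every single space, so consecutive spaces yield empty strings
by_single_space = text.split(" ")

# split() with no argument splits on runs of whitespace and never yields empty strings
by_whitespace = text.split()

print(dict(Counter(by_single_space)))
print(dict(Counter(by_whitespace)))
```

With `split(" ")` the empty string is counted as if it were a word, so `split()` is usually the safer choice for free-form text.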

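word_frequency is reassigned for every string in the dataset list, so when the loop finishes it holds the counter for the last string only. That is why you see the counts for the last string and nothing else. You can either call print(word_frequency) inside the for loop, or create a list, append word_frequency to it on each iteration, and print the list once the loop has finished: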

from collections import Counter

dataset = ["the method of lagrange multipliers is the economists workhorse for solving optimization problems",
           "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly"]

results = []
for row in dataset:
    # Count the words of this sentence and keep the result
    word_frequency = dict(Counter(row.split(" ")))
    results.append(word_frequency)

print(results)
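The same idea can also be written as a list comprehension. This is only a sketch of one possible variant, not the original answer's code; the names `counts` and `sentence_counts` are my own. It keeps the `Counter` objects instead of converting them to `dict` right away, which leaves methods such as `most_common` available:

```python
from collections import Counter

dataset = [
    "the method of lagrange multipliers is the economists workhorse for solving optimization problems",
    "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly",
]

# One Counter per sentence, collected in a list
counts = [Counter(sentence.split()) for sentence in dataset]

# Print each sentence's counts as a plain dict, one per line
for sentence_counts in counts:
    print(dict(sentence_counts))

# Most frequent word of the first sentence
print(counts[0].most_common(1))  # [('the', 2)]
```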

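Just move the print into the for loop. The computed word_frequency variable is overwritten on every iteration, so printing after the loop only shows the last sentence.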
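Applied to the code from the question, that change could look like the sketch below. The only differences from the question are the indented `print` and the dropped `enumerate`, since the index is never used:

```python
from collections import Counter

dataset = [
    "the method of lagrange multipliers is the economists workhorse for solving optimization problems",
    "the technique is a centerpiece of economic theory but unfortunately its usually taught poorly",
]

for row in dataset:
    # Build the counts for the current sentence only
    word_frequency = dict(Counter(row.split(" ")))
    # Print now, before the next iteration rebinds word_frequency
    print(word_frequency)
```

After the loop, `word_frequency` still refers to the counts of the last sentence only, which is exactly what the original code printed.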
