How to count the words in multiple sentences in Python

User

Reply 1 · edited 2024-10-01 17:27:32

To get a list in which each item corresponds to one sentence:

def count_words_per_sentence(filename):
    """
    :type filename: str
    :rtype: list[int]
    """
    with open(filename) as f:
        sentences = f.read().split('.')
    return [len(sentence.split()) for sentence in sentences]

To test how many words two sentences have in common, you should use set operations.
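The set-based comparison could be sketched as follows. The name `count_words_in_common_per_sentence` matches the call used in the CSV snippet in this reply, but its body here is an assumption, as are the hypothetical helper `split_sentences`, the lowercasing of words, and the pairing of sentences by position with `zip`.

```python
def split_sentences(filename):
    """Read a file and split its text into sentences on '.' (hypothetical helper)."""
    with open(filename) as f:
        return f.read().split('.')


def count_words_in_common_per_sentence(filename_one, filename_two):
    """Return, for each sentence pair, how many distinct words both share."""
    sentences_one = split_sentences(filename_one)
    sentences_two = split_sentences(filename_two)
    return [
        # set intersection (&) keeps only the words present in both sentences
        len(set(a.lower().split()) & set(b.lower().split()))
        for a, b in zip(sentences_one, sentences_two)
    ]
```

Because sets drop duplicates, a word repeated within a sentence is counted once. `zip` stops at the shorter file, so extra sentences in the longer file are ignored.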

For file I/O, take a look at the csv module and its writer function. Build your rows as a list of lists (check out zip), then feed that to the csv writer.

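import csv
import itertools

# filename_one and filename_two are the paths of the two input files
word_counts_1 = count_words_per_sentence(filename_one)
word_counts_2 = count_words_per_sentence(filename_two)
in_common = count_words_in_common_per_sentence(filename_one, filename_two)
rows = zip(itertools.count(1), word_counts_1, word_counts_2, in_common)
header = [["index", "file_one", "file_two", "in_common"]]
# zip returns an iterator in Python 3, so convert it before concatenating
table = header + list(rows)

# https://docs.python.org/2/library/csv.html
# newline='' (Python 3) stops the csv module from writing blank rows on Windows
with open("my_output_file.csv", 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(table)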

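User

Reply 2 · edited 2024-10-01 17:27:32

After searching for a while for a simpler solution, I stumbled upon some code that gives me part of the result I want: the number of words in each sentence. The result is a list of numbers, and the code looks like this: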

    wordcounts = []
    with open('father_goriot.txt') as f:
       text = f.read()
       sentences = text.split('.')
       for sentence in sentences:
           words = sentence.split(' ')
           wordcounts.append(len(words))

But the count is incorrect, because it counts something extra as well. For the first sentence I get 40 words instead of 38. How can I fix this?
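One likely cause, offered as an assumption since the input file is not shown: `sentence.split(' ')` splits on every single space, so the space that follows a period and any doubled spaces produce empty strings that are counted as words. Calling `split()` with no argument splits on runs of whitespace and drops the empty strings. The function name `count_words` and the sample sentence below are made up for illustration.

```python
def count_words(sentence):
    """Count words, treating any run of whitespace as a single separator."""
    return len(sentence.split())


# Hypothetical sample: leading space, a doubled space, and a line break
sentence = " It was  a dark\nnight"

# split(' ') keeps empty strings and does not split at the newline
print(sentence.split(' '))       # ['', 'It', 'was', '', 'a', 'dark\nnight']
print(len(sentence.split(' ')))  # 6
print(count_words(sentence))     # 5
```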

User

Reply 3 · edited 2024-10-01 17:27:32

You need to iterate over the file and read it line by line, like this:

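# the with block closes the file automatically when the loop ends
with open('file.txt', 'r') as file:
    for line in file:
        # do something with the line, for example count its words
        print(len(line.split()))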
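Reading line by line can be combined with word counting as in the sketch below. The function name `count_words_per_line` is hypothetical. Note that it counts per line, not per sentence, so a sentence that spans several lines is not rejoined.

```python
def count_words_per_line(filename):
    """Return the number of whitespace-separated words on each line."""
    counts = []
    with open(filename) as f:
        for line in f:  # iterating a file object yields one line at a time
            counts.append(len(line.split()))
    return counts
```

Blank lines yield a count of 0, which may need filtering depending on the input.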
