How to find the words that differ between two text files

Published 2024-09-27 04:23:56

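I have two text documents that contain mostly the same words, with a few exceptions. How can I find the words in document2 that do not appear in document1 and print them? For example: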

File 1: "Hello, hello"

File 2: "Hi, how are you, John"

Expected output: "Hi, today John"

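Edit: I want to print only the words that appear in document2 but cannot be found in document1. I do not want to print the words the two files have in common.

I wrote the code below. I believe it finds the matches between the two text files, which is not what I want it to do: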

# Raw strings keep the backslashes in the Windows paths literal
doc1 = open(r"K:\System Files\Desktop\document1.txt", "r")
doc2 = open(r"K:\System Files\Desktop\document2.txt", "r")

list1 = []
list2 = []

for i in doc1:  # Removes the newline after each word
    i = i.rstrip("\n")
    list1.append(i)
for i in doc2:
    i = i.rstrip("\n")
    list2.append(i)

for i in list1:
    for j in list2:
        if i == j:
            print(i)

1 answer

Answer 1 · Posted 2024-09-27 04:23:56

If you are not concerned about the order of the words, you can use sets to do this:

import re

def get_words(filename):
    with open(filename, 'r') as f_input:
        return set(w.lower() for w in re.findall(r'(\w+)', f_input.read()))

words1 = get_words('document1.txt')
words2 = get_words('document2.txt')

print(words2 - words1)

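This prints the set of words that appear in document2.txt but not in document1.txt.

Using the - operator between two sets gives their set difference.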
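A set has no order, so the result may print in any order. If the order of appearance in the second file matters, one possible variation is sketched below. It is not part of the answer above: the function names words_in_order and only_in_second are hypothetical, and the two sample strings are made-up stand-ins for the contents of the files.

```python
import re


def words_in_order(text):
    """Return the lowercased words of text in order of appearance."""
    return [w.lower() for w in re.findall(r"\w+", text)]


def only_in_second(text1, text2):
    """Return the words of text2 that never occur in text1.

    Words keep the order of their first appearance in text2,
    and each word is reported only once.
    """
    seen = set(words_in_order(text1))
    result = []
    for word in words_in_order(text2):
        if word not in seen:
            result.append(word)
            seen.add(word)  # avoid reporting the same word twice
    return result


# Made-up sample texts standing in for the two files
sample1 = "Hello, how are you today"
sample2 = "Hi, how are you today John"

print(only_in_second(sample1, sample2))  # ['hi', 'john']
```

To run it on real files, read each file with open(filename).read() and pass the two strings to only_in_second. The membership test against a set keeps each lookup fast, while the list preserves the order that a plain set difference discards.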
