Comparing two files in Python

Posted 2024-10-04 07:24:09

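I have two text files containing data like the following.

I would like this to be done in Hadoop. Can anyone point me in the right direction?

textfile1 --> 1 goerge hyder
              2 ganesh singapore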

textfile2 --> 1 goergy hydel
              2 ganest singapore

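The files have to be compared column by column and character by character, and after the comparison a report like this should be produced: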

column_name source destiny mismatch
      xxx    george georgy y
             ganesh ganest h
             hyder  hydel  r

Please help.


3 answers

As Seer.The mentioned above, you can use difflib:

import difflib

def read_rows(path):
    # Split each line into whitespace-separated columns
    with open(path, 'r') as f:
        return [line.split() for line in f]

# Read the files
list1 = read_rows('textfile1.txt')
list2 = read_rows('textfile2.txt')

# Get the output
for row1, row2 in zip(list1, list2):
    # Skip the first column, which holds the row number
    for word1, word2 in zip(row1[1:], row2[1:]):
        # ndiff prefixes characters found only in word1 with "- "
        output_list = [li[-1]
                       for li in difflib.ndiff(word1, word2)
                       if li.startswith("- ")]
        if not output_list:
            output_list = ["no difference"]
        print("{} {} {}".format(word1, word2, output_list[0]))

The output should look like this:

goerge goergy e
hyder hydel r
ganesh ganest h
singapore singapore no difference
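
To get closer to the report layout asked for in the question (source, destination, mismatched characters), the same comparison can be wrapped in a function. This is only a sketch: the name `mismatch_report` is hypothetical, and it assumes that the first token on each line is a row number and that both files have the same number of lines and columns.

```python
from itertools import zip_longest

def mismatch_report(lines1, lines2):
    """Return (source, destination, mismatched source chars) for each differing word."""
    rows = []
    for line1, line2 in zip(lines1, lines2):
        # Assumption: the first token on each line is a row number, so skip it.
        words1 = line1.split()[1:]
        words2 = line2.split()[1:]
        for src, dst in zip(words1, words2):
            if src == dst:
                continue
            # Compare position by position; pad the shorter word with "".
            diff = "".join(a for a, b in zip_longest(src, dst, fillvalue="") if a != b)
            rows.append((src, dst, diff))
    return rows

# Sample data taken from the question
file1 = ["1 goerge hyder", "2 ganesh singapore"]
file2 = ["1 goergy hydel", "2 ganest singapore"]
for src, dst, diff in mismatch_report(file1, file2):
    print(src, dst, diff)
```

Identical words are left out of the report here. If they should be listed as well, the `continue` branch could append a row with an empty mismatch column instead.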

with open("textfile1.txt", "r") as f1, open("textfile2.txt", "r") as f2:
    # split() with no argument splits on newlines as well as spaces
    words1 = f1.read().split()
    words2 = f2.read().split()

# Assumes f1 and f2 have the same number of words
for w1, w2 in zip(words1, words2):
    if w1 != w2:
        # zip stops at the shorter word, so unequal lengths cannot raise IndexError
        for c1, c2 in zip(w1, w2):
            if c1 != c2:
                # Prints the differing character from the second file
                print(w1, w2, c2)

# Open both files in read mode ('a' is append mode and cannot be read from)
with open('textfile1.txt', 'r') as f:
    text1 = f.read().rstrip()
with open('textfile2.txt', 'r') as f:
    text2 = f.read().rstrip()

# Open the report in write mode so that write() works
with open('report.txt', 'w') as report:
    if text1 == text2:
        print("It Is the Same Thing")
        report.write('It is The Same Thing with the text 1 and 2')
    else:
        print("It Is Not The Same Thing")
        report.write('It is Not The Same Thing With the text 1 and 2')
    report.write('\n')
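
If a plain same/different answer is too coarse, difflib can also report which lines differ. The following is a minimal sketch using `difflib.unified_diff`; the helper name `line_diff` and the sample lines are made up for illustration, and in practice the two lists would come from reading the two files.

```python
import difflib

def line_diff(lines1, lines2, name1="textfile1.txt", name2="textfile2.txt"):
    """Return a unified diff of two lists of lines as a list of strings."""
    # lineterm="" because the input lines carry no trailing newline
    return list(difflib.unified_diff(lines1, lines2,
                                     fromfile=name1, tofile=name2,
                                     lineterm=""))

# Illustrative data only; the second line differs by one character
old = ["1 goerge hyder", "2 ganesh singapore"]
new = ["1 goerge hyder", "2 ganest singapore"]
for row in line_diff(old, new):
    print(row)
```

Lines starting with `-` exist only in the first file and lines starting with `+` only in the second, so the result can be written to report.txt as it is.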