How to get the complete line when comparing two files with Python or sh

###Total.txt
column a       column b                    column c
interaction1   mitochondria_205000_225000  mitochondria_195000_215000
interaction2   mitochondria_345000_365000  mitochondria_335000_355000
interaction3   mitochondria_345000_365000  mitochondria_5000_25000
interaction4   chloroplast_115000_128207   chloroplast_35000_55000
interaction5   chloroplast_115000_128207   chloroplast_15000_35000
interaction15  2_10515000_10535000         2_10505000_10525000

###Unique.txt
column a                    column b
mitochondria_205000_225000  mitochondria_195000_215000
mitochondria_345000_365000  mitochondria_335000_355000
mitochondria_345000_365000  mitochondria_5000_25000
chloroplast_115000_128207   chloroplast_35000_55000
chloroplast_115000_128207   chloroplast_15000_35000
mitochondria_185000_205000  mitochondria_25000_45000
2_16595000_16615000         2_16585000_16605000
4_2785000_2805000           4_2775000_2795000
4_11395000_11415000         4_11385000_11405000
4_2875000_2895000           4_2865000_2885000
4_13745000_13765000         4_13735000_13755000

2 Answers

User

Reply 1 · Edited 2024-09-29 19:31:19

Here is my Python script:

file = open('total.txt')
file2 = open('unique.txt')
all_content = file.readlines()
all_content2 = file2.readlines()
store_id_lines = []
ff = open('match.dat', 'w')

for i in range(len(all_content)):
    line = all_content[i].split('\t')
    seq = line[1] + '\t' + line[2]
    for j in range(len(all_content2)):
        if all_content2[j] == seq:
            ff.write(seq)
            break

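But it does not give the expected output (the column a value for each line that satisfies the if condition). What I want is: if a line of unique.txt equals columns b and c of the i-th line of total.txt, then write the i-th line of total.txt to a new file.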

User

Reply 2 · Edited 2024-09-29 19:31:19

This should work.

total = "C:\\...total.txt"  # set the path to your file!
unique = "C:\\...unique.txt"
newfile = "C:\\...match.csv"

a = []
b = []

with open(total, "r") as rcursor1:  # read the document
    for trow in rcursor1:  # read each row
        row1 = trow.rstrip("\n").split("\t")  # drop the newline, then split by your separator
        a.append(row1[1:])  # we are only interested in everything from column b onwards


with open(unique, "r") as rcursor2:
    for urow in rcursor2:
        row2 = urow.rstrip("\n").split("\t")
        b.append(row2)


print("This is a", a)
print(len(a))
print("This is b", b)
print(len(b))

a1 = set(map(tuple, a))  # lists are not hashable, but set members must be hashable
b1 = set(map(tuple, b))  # that is why the lists are converted to tuples, which are hashable

matches = a1.intersection(b1)  # find the rows that occur in both files
print("Our matches, unsorted!", matches)

with open(newfile, "w") as wcursor:  # write to file (text mode)
    for i in matches:
        d = ",".join(i)  # one comma-separated line per match
        print(d)
        wcursor.write(d + "\n")
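
Note that both scripts above write only the two matched columns, not the complete line from total.txt that the question asks for. A minimal sketch of one way to keep the whole line, assuming both files are tab-separated and that columns b and c of total.txt correspond to columns a and b of unique.txt. The function names `load_pairs` and `write_matching_lines` are made up for illustration and do not come from either post.

```python
def load_pairs(unique_path):
    """Return the set of (first column, second column) pairs from unique.txt."""
    pairs = set()
    with open(unique_path) as handle:
        for line in handle:
            # Assumption: fields are separated by tabs.
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) >= 2:
                pairs.add((fields[0].strip(), fields[1].strip()))
    return pairs


def write_matching_lines(total_path, unique_path, out_path):
    """Copy each line of total.txt whose 2nd and 3rd columns form a pair in unique.txt.

    Returns the number of lines written.
    """
    pairs = load_pairs(unique_path)
    written = 0
    with open(total_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.rstrip("\r\n").split("\t")
            # Compare on columns b and c, but write the complete line,
            # so the identifier in column a is kept.
            if len(fields) >= 3 and (fields[1].strip(), fields[2].strip()) in pairs:
                dst.write("\t".join(fields) + "\n")
                written += 1
    return written
```

Calling `write_matching_lines("total.txt", "unique.txt", "match.dat")` on the sample data from the question should, under those assumptions, copy the lines for interaction1 through interaction5 and skip interaction15. Stripping the line ending before comparing also avoids a mismatch when the last line of a file has no trailing newline, and the set lookup replaces the nested loop.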
