Python: comparing files by the delimited characters in each line

EgrG_000095700 /product="ubiquitin carboxyl terminal hydrolase 5"
EgrG_000095800 /product="DNA polymerase epsilon subunit 3"
EgrG_000095850 /product="crossover junction endonuclease EME1"
EgrG_000095900 /product="lysine specific histone demethylase 1A"
EgrG_000096000 /product="charged multivesicular body protein 6"
EgrG_000096100 /product="NADH ubiquinone oxidoreductase subunit 10"

f1 = open('egranulosus_v3_2014_05_27.tsv').readlines()
f2 = open('eg_es_final_ids').readlines()
fr = open('res.tsv','w')
for line in f1:
    if line[0:14] == f2[0:14]:
        fr.write('%s'%(line))
fr.close()
print "Done!"

3 answers

User

#1 · edited 2024-10-02 20:42:24

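f2 is the list of lines from file 2. Where do you iterate over that list, the way you do over the lines of file 1 (f1)? That seems to be the problem: f2[0:14] is a slice holding the first 14 lines of the list, not the first 14 characters of a line, so comparing it with the string line[0:14] is never true.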
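To make the point concrete, here is a minimal sketch of what iterating over f2 could look like. The two lists are hypothetical in-memory stand-ins for what readlines() would return, and the ID-file format (an ID followed by ".1") is an assumption based on the other answers in this thread, not something the question shows.

```python
# Hypothetical stand-ins for the readlines() output of the two files.
f1 = [
    'EgrG_000095700 /product="ubiquitin carboxyl terminal hydrolase 5"\n',
    'EgrG_000095800 /product="DNA polymerase epsilon subunit 3"\n',
    'EgrG_000095850 /product="crossover junction endonuclease EME1"\n',
]
f2 = ['EgrG_000095800.1\n', 'EgrG_000095700.1\n']  # assumed ID format

# The original test compares a string with a list slice, which is never equal.
assert all(line[0:14] != f2[0:14] for line in f1)

# Iterate over f2 as well and compare the first 14 characters of each pair.
matches = []
for line in f1:
    for id_line in f2:
        if line[0:14] == id_line[0:14]:
            matches.append(line)
            break  # one match is enough for this line
```

This nested loop does one comparison per pair of lines, so it is probably fine for small files; for large ones, a dict or set lookup is likely the better choice.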

User

#2 · edited 2024-10-02 20:42:24

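with open('egranulosus_v3_2014_05_27.txt', 'r') as infile:
    # Map each ID (the first whitespace-delimited field) to its full line.
    line_storage = {}
    for line in infile:
        data = line.split()
        if not data:
            continue  # skip blank lines
        key = data[0]
        value = line.rstrip('\n')
        line_storage[key] = value

with open('eg_es_final_ids.txt', 'r') as infile, open('my_output.txt', 'w') as outfile:
    for line in infile:
        # Keep only the part of the ID before the first dot.
        lookup_key = line.strip().split('.')[0]
        match = line_storage.get(lookup_key)
        if match is not None:  # skip IDs that are missing from the first file
            outfile.write(match + '\n')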

User

#3 · edited 2024-10-02 20:42:24

I would store the IDs from f2 in a set and then check each line of f1 against it.

id_set = set()
with open('eg_es_final_ids') as f2:
    for line in f2:
        # Strip the newline first, then drop the ".1" suffix after the dot.
        id_set.add(line.strip().split('.')[0])

with open('egranulosus_v3_2014_05_27.tsv') as f1:
    with open('res.tsv', 'w') as fr:
        for line in f1:
            # The ID is the first 14 characters of each line.
            if line[:14] in id_set:
                fr.write(line)
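The fixed-width slice line[:14] only works while every ID is exactly 14 characters long. A variant that takes the first whitespace-delimited field instead might look like the sketch below. The function name filter_by_ids and the sample ID lines are hypothetical, and the sketch assumes that the IDs themselves contain no dot, so everything after the first dot can be treated as the suffix.

```python
def filter_by_ids(records, id_lines):
    """Return the records whose first field appears among the IDs."""
    # Assumption: anything after the first dot is a suffix such as ".1".
    wanted = {l.strip().split('.')[0] for l in id_lines if l.strip()}
    kept = []
    for record in records:
        fields = record.split(None, 1)  # first field, then the rest
        if fields and fields[0] in wanted:
            kept.append(record)
    return kept


# Sample records taken from the question; the ID lines are made up.
records = [
    'EgrG_000095900 /product="lysine specific histone demethylase 1A"\n',
    'EgrG_000096000 /product="charged multivesicular body protein 6"\n',
    'EgrG_000096100 /product="NADH ubiquinone oxidoreductase subunit 10"\n',
]
ids = ['EgrG_000095900.1\n', 'EgrG_000096100.1\n']

result = filter_by_ids(records, ids)
for record in result:
    print(record, end='')
```

Because the function accepts any iterable of lines, open file objects can be passed in directly in place of the lists.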
