Python:比较2个CSV,如果某个单元格匹配,就把数据拉过来

2024-10-02 12:32:28 发布

您现在位置:Python中文网/ 问答频道 /正文

文件A:

id,desc,name
12345,blah blah blah,jsmith
6789,yada yada yada,ckast
54321,yum yum yum,jpetersen

文件B:

key,id
AB-873,6789
CF-395,54321
HG-713,12345

我想做的是查看文件A中的每一行,看看id列是否与文件B中的id列匹配,以及它是否通过“name”单元格复制到文件B中。因此,文件B最终将如下所示:

AB-873,6789,ckast
CF-395,54321,jpetersen
HG-713,12345,jsmith

我知道“csv”Python模块可以读取单独的行,但是我不知道从哪里开始。谢谢!你知道吗


Tags: 文件csvkeynameidabhgdesc
2条回答

使用csv,可以执行以下操作:

import csv

with open(fn1) as fa, open(fn2) as fb:
    r1, r2=map(csv.reader, (fa, fb))
    a_header, b_header=map(next, (r1, r2))
    data_a, data_b=map(lambda header: {k:list() for k in header}, 
                           (a_header, b_header))
    for line in r1:
        for k, v in zip(a_header, line):
            data_a[k].append(v)
    for line in r2:
        for k, v in zip(b_header, line):
            data_b[k].append(v)

b_header+=['name']        
data_b['name']=[]        
for e in data_b['id']:
    try:
        v=data_a['name'][data_a['id'].index(e)]
    except ValueError:
        v=None    
    data_b['name'].append(v)    

with open(fn3, 'w') as fout:
    writer=csv.writer(fout)  
    writer.writerow([e for e in b_header])  
    idx=0
    while True:
        try:
            writer.writerow([data_b[key][idx] for key in b_header])
            idx+=1
        except IndexError:
            break

如果您想要一个简单的代码,此代码适用于您:

a_lines = open('FileA', 'r').readlines()[1:]
b_lines = open('FileB', 'r').readlines()[1:]
file_result = open('result', 'w')

# Read content of FileA to a table (list of lists)
a_table = []
for l in a_lines:
    a_table.append([w.strip() for w in l.split(',')])

# Read content of FileB in a dictionary.
# The 'id' field as dictionary key for simple look-up.
b_dict = {}
for l in b_lines:
    words = l.split(',')
    b_dict[words[1].strip()] = words[0].strip()

# Do the actual work and save result.
for row in a_table:
    if row[0] in b_dict:
        file_result.write(b_dict[row[0]] + ',' + row[0] + ',')
        file_result.write(row[2] + '\n')

我用你的样品测试了一下。你知道吗

相关问题 更多 >

    热门问题