What is the most efficient way in Python to merge CSV rows that share a single duplicated field?

wp.xyz03.def02.01195.1,wp03.xyz03-c01_lc08_m00,
wp.xyz03.def02.01195.1,wp02.xyz03,
wp.xyz03.def02.01195.1,,wp01.def02
wp.xyz03.def02.01195.1,,wp02.def02-c02_lc14_m00
wp.atl21.lmn01.01193.2,wp03.atl21-c06_lc14_m00,
wp.atl21.lmn01.01193.2,wp02.atl21,
wp.atl21.lmn01.01193.2,,wp03.lmn01
tp.ghi03.ghi05.02194.65,,tp05.ghi05:1
tp.ghi03.ghi05.02194.65,tp05.ghi03:2,
tp.ghi03.ghi05.02194.65,tp05.ghi03-c06_lc11_m00,

import csv

reader = csv.reader(open('parse_lur_luraz_clean_temp.csv', 'r'), delimiter=',')
final = ['-','-','-']
parselur = ['-']
lur_a = ""
lur_z = ""
for row in reader:
    if row[0] != parselur[0]:
        # new key: reset the merged row
        final = ['-','-','-']
        if row[1] != '':
            lur_a = row[1]
        if row[2] != '':
            lur_z = row[2]
        parselur[0] = row[0]
    elif row[0] == parselur[0]:
        if row[1] == '':
            lur_a = row[1]
        elif row[1] != '':
            lur_a = row[1]
        if row[2] == '':
            lur_z = row[2]
        elif row[2] != '':
            lur_z = row[2]
        parselur[0] = row[0]
    if parselur[0] != '' and parselur[0] not in final:
        final[0] = parselur[0]
    if lur_a != '':
        # keep a stored value that already contains '_lc'
        if final[1] == '-' or '_lc' not in final[1]:
            final[1] = lur_a
            lur_a = ''
    if lur_z != '':
        if final[2] == '-' or '_lc' not in final[2]:
            final[2] = lur_z
            lur_z = ''
    if len(final) == 3 and '-' not in final:
        # all three fields are filled: append the merged row to the output
        fd = open('final_alu_nsn_temp.csv','a')
        writer = csv.writer(fd)
        writer.writerow((final))
        fd.close()
        final = ['-','-','-']
    else:
        parselur[0] = row[0]

2 Answers

User

Answer 1 · edited 2024-05-19 00:20:49

If I understand what you are trying to do, here is some pseudocode:

Read each line:
    Split by comma
    Add each section to a large list
Next

Until list is empty:
    Foreach value in the list:
        Write value to file, then write a comma
        Search a list, and remove duplicate values

Is that roughly it? I can write you a Python program if that is what you want.
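One way the pseudocode above might translate into Python 3 is sketched below. It is only a reading of the question, not the answerer's program: it assumes the goal is one output row per key (the first field), and that in each column a value containing `_lc` should win over one that does not, which is what the question's own code appears to do. The function name `merge_rows` and the `marker` parameter are made up for this illustration.

```python
import csv
import io


def merge_rows(rows, marker="_lc"):
    """Merge rows that share the same first field into one row per key.

    Hypothetical helper: in each column, a stored value that already
    contains `marker` is kept; otherwise a later non-empty value replaces it.
    """
    merged = {}  # key -> list of column values (dicts keep insertion order)
    for row in rows:
        if not row:
            continue  # skip blank lines
        key, values = row[0], row[1:]
        slots = merged.setdefault(key, [])
        # grow the stored row if this row has more columns than seen so far
        while len(slots) < len(values):
            slots.append("")
        for i, value in enumerate(values):
            value = value.strip()
            if value and marker not in slots[i]:
                slots[i] = value
    return [[key] + slots for key, slots in merged.items()]


if __name__ == "__main__":
    # three rows from the question's sample data
    sample = io.StringIO(
        "tp.ghi03.ghi05.02194.65,,tp05.ghi05:1\n"
        "tp.ghi03.ghi05.02194.65,tp05.ghi03:2,\n"
        "tp.ghi03.ghi05.02194.65,tp05.ghi03-c06_lc11_m00,\n"
    )
    for merged_row in merge_rows(csv.reader(sample)):
        print(",".join(merged_row))
```

Because the rows are collected in a dictionary, this version does not need the input to be sorted by key, at the cost of holding every merged row in memory until the end.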

Edit:

I wrote a program that, as far as I can tell, turns the sample input you gave into the sample output.

Feel free to ask if you have any questions.

User

Answer 2 · edited 2024-05-19 00:20:49

Now is a great time to learn about itertools.groupby:

import csv
from itertools import groupby

# assuming Python 2
with open("source.csv", "rb") as fp_in, open("final.csv", "wb") as fp_out:
    reader = csv.reader(fp_in)
    writer = csv.writer(fp_out)
    grouped = groupby(reader, lambda x: x[0])
    for key, group in grouped:
        rows = list(group)
        rows = [rows[0], rows[-1]]
        columns = zip(*(r[1:] for r in rows))
        use_values = [max(c) for c in columns]
        new_row = [key] + use_values
        writer.writerow(new_row)
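
For Python 3, the same idea might look like the sketch below. The file modes change because the `csv` module there expects text-mode files opened with `newline=""`. The function names `merge_first_last` and `merge_file` are invented for this example. Two assumptions carry over from the code above: `groupby` only groups consecutive rows, so the input must already be ordered by key, and `max` on a pair of strings returns the non-empty one when the other is empty.

```python
import csv
from itertools import groupby


def merge_first_last(rows):
    """Yield one merged row per run of consecutive rows sharing row[0].

    Hypothetical helper: only the first and last row of each run are
    combined, mirroring the Python 2 code above.
    """
    rows = (r for r in rows if r)  # drop blank lines, which csv yields as []
    for key, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        first, last = group[0], group[-1]
        # pair up the remaining columns and keep the larger string of each pair
        columns = zip(first[1:], last[1:])
        yield [key] + [max(pair) for pair in columns]


def merge_file(src_path, dst_path):
    """Read src_path, merge its rows, and write the result to dst_path."""
    with open(src_path, newline="") as fp_in, \
            open(dst_path, "w", newline="") as fp_out:
        csv.writer(fp_out).writerows(merge_first_last(csv.reader(fp_in)))
```

Calling `merge_file("source.csv", "final.csv")` would then play the role of the `with` block above. If the input is not grouped by key, sorting the rows on the first field before passing them to `merge_first_last` is one option, though that may change which rows count as first and last.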

This produces one merged row per key.
