Split a CSV file by column value

Posted 2024-10-05 10:00:47

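I have more than 200 files that I want to split by the value in the clName column, keeping the header in every resulting file. I also want to save the resulting files under the original file name, as filename.txt.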

ID  Plate   Well      ctr        clID     clName
21    5      C03        1       50012       COL
21    5      C03        1       50012       COL
21    5      C03        1       50012       COL 
21    5      C04        1       50012       IA 
21    5      C04        1       50012       IA 
21    5      C05        1       50012       ABC 


import csv
from itertools import groupby

for key, rows in groupby(csv.reader(open("file.csv")),
                         lambda row: row[7]):
    with open("%s.txt" % key, "w") as output:
        for row in rows:
            output.write(",".join(row) + "\n")

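The problem I have is that the column is not always called clName; it can be called clName, cll_n, or c_Name. Sometimes it is column 7, sometimes column 5 or 9.

What I know how to do is split a file by column value, but that does not keep the header, and I have to inspect every file to find out whether the column is 5, 7, 9, and so on.

Is there a way to check the column names against a list and, when one of those names is found, split the file by that column's value?

Sample data: https://drive.google.com/file/d/0Bzv1SNKM1p4uell3UVlQb0U3ZGM/view?usp=sharing

Thank you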


1 Answer
User
#1 · Posted 2024-10-05 10:00:47

Use csv.DictReader and csv.DictWriter instead. Here is an outline that should point you in the right direction.

import csv
from itertools import groupby

special_col = ['cll_n', 'clName']


# defined before it is called below
def call_existing_code(rdr, c):
    # set up an output file using csv.DictWriter; you can
    # replace the original column with the new column, and
    # control the order of fields via fieldnames

    with open('output.csv', 'w', newline='') as fh:
        wtr = csv.DictWriter(fh, fieldnames=rdr.fieldnames)
        wtr.writeheader()

        # groupby yields (key, rows) pairs and only merges adjacent
        # rows, so rows with the same value must already be together
        for key, rows in groupby(rdr, lambda r: r[c]):
            for row in rows:

                # [process the row as needed here]

                wtr.writerow(row)


with open('myfile.csv', 'r', newline='') as fh:
    rdr = csv.DictReader(fh)

    # now we need to figure out which column is used
    for c in special_col:
        if c in rdr.fieldnames:
            break  # found the column name
    else:
        raise IOError('No special column in file')

    # now execute your existing code, but group by the
    # column using lambda row: row[c] instead of row 7
    call_existing_code(rdr, c)
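
Putting the outline together, here is a minimal sketch of the whole task rather than a definitive implementation. Several things are assumptions: the files are comma-delimited (the sample shown in the question looks whitespace-aligned, so the delimiter may need changing), the candidate header names are the three mentioned in the question, and the output naming scheme `<original name>-<value>.txt` is hypothetical. The helper names `find_column` and `split_by_column` are made up for illustration. Rows are collected in a dict instead of using groupby, so the input does not need to be sorted.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Assumed list of header names the grouping column may use.
CANDIDATE_COLUMNS = ['clName', 'cll_n', 'c_Name']


def find_column(fieldnames, candidates=CANDIDATE_COLUMNS):
    """Return the first candidate name present in the header."""
    for name in candidates:
        if name in (fieldnames or []):
            return name
    raise ValueError('none of %r found in header %r' % (candidates, fieldnames))


def split_by_column(path, out_dir, candidates=CANDIDATE_COLUMNS):
    """Write one file per distinct value of the grouping column.

    Each output is named <original stem>-<value>.txt (an assumed
    naming scheme) and starts with the original header.
    Returns the written paths, sorted.
    """
    path, out_dir = Path(path), Path(out_dir)
    with path.open(newline='') as fh:
        rdr = csv.DictReader(fh)
        fieldnames = rdr.fieldnames
        col = find_column(fieldnames, candidates)
        groups = defaultdict(list)
        for row in rdr:
            # strip stray whitespace so 'COL ' and 'COL' share a file
            groups[(row[col] or '').strip()].append(row)

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for value, rows in groups.items():
        target = out_dir / ('%s-%s.txt' % (path.stem, value))
        with target.open('w', newline='') as out:
            wtr = csv.DictWriter(out, fieldnames=fieldnames)
            wtr.writeheader()  # keep the header in every output file
            wtr.writerows(rows)
        written.append(target)
    return sorted(written)
```

To process all 200 files, one could loop over something like `Path('data').glob('*.csv')` and call `split_by_column(p, 'out')` for each path; the directory names here are placeholders.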
