Generating multiple files from a single file based on a column value

Posted 2024-09-30 22:18:43

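I've just started learning Python and am trying to use it for a manual task that I currently do with Excel's filter feature.

Every month I receive a file. I load that CSV into Excel, apply a filter on the carrier field, create a new file for each value, and share it with the corresponding carrier.

Below is some sample data from my CSV. I show only 2 carriers here, but I have more than 13 values.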

carrier,type,count
DTH,a,123
DTH,b,3123
DTH,c,41341
DTH,d,13411
BLUEDART,a,12123
BLUEDART,b,31231
BLUEDART,c,411
BLUEDART,d,11

Expected output

DTH.csv

carrier,type,count
DTH,a,123
DTH,b,3123
DTH,c,41341
DTH,d,13411

BLUEDART.csv

carrier,type,count
BLUEDART,a,12123
BLUEDART,b,31231
BLUEDART,c,411
BLUEDART,d,11

Any help or guidance would be greatly appreciated.

2 answers

Using only the Python standard library:

import csv

def write_output(header_row, carrier_name, c_rows):
    # Write the header plus all rows for one carrier to <carrier_name>.csv
    print("writing output for " + carrier_name)
    with open("c:\\tmp\\" + carrier_name + ".csv", "w", newline="") as outfile:
        outwriter = csv.writer(outfile, delimiter=",")
        outwriter.writerow(header_row)
        outwriter.writerows(c_rows)

with open("c:\\tmp\\carrier.csv", newline="") as csvfile:
    creader = csv.reader(csvfile, delimiter=",")

    # The first row is the header; None if the file is empty
    header_row = next(creader, None)
    # Maps each carrier name (first column) to its list of rows
    groups = {}

    for row in creader:
        groups.setdefault(row[0], []).append(row)

    for gr in groups:
        write_output(header_row, gr, groups[gr])
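
As a variation on the same idea, here is a minimal sketch that looks up the grouping column by its header name instead of by position, so it keeps working if the column order changes. The function name `split_csv_by_column` and its parameters are my own and purely illustrative. It also assumes every carrier value is usable as a file name.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_csv_by_column(src_path, out_dir, column="carrier"):
    """Write one CSV per distinct value of `column`; return the paths written.

    Hypothetical helper: assumes each value of `column` is a valid file name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the rows in memory, keyed by the value in the chosen column
    groups = defaultdict(list)
    with open(src_path, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        for row in reader:
            groups[row[column]].append(row)

    # Write each group to its own file, repeating the original header
    written = []
    for value, rows in groups.items():
        out_path = out_dir / f"{value}.csv"
        with open(out_path, "w", newline="") as out:
            writer = csv.DictWriter(out, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written.append(out_path)
    return written
```

Calling `split_csv_by_column("c:\\tmp\\carrier.csv", "c:\\tmp\\out")` on the sample data should produce DTH.csv and BLUEDART.csv in the output folder.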

This is very easy with pandas:

import pandas as pd

carriers_csv_path = r"C:\Users\Bluetab\PycharmProjects\utils\csvGeneratorStack\csvCarriers.csv"
carrier_df = pd.read_csv(carriers_csv_path)
# Iterating over the groupby object yields (carrier name, sub-DataFrame) pairs
for carrier, group in carrier_df.groupby("carrier"):
    group.to_csv("./" + carrier + ".csv", sep=",", index=False)

Hope this helps.

Thomas
