Python CSV wri

Posted 2024-06-26 15:02:43


I have a CSV that looks like this:

HA-MASTER,CategoryID
38231-S04-A00,14
39790-S10-A03,14
38231-S04-A00,15
39790-S10-A03,15
38231-S04-A00,16
39790-S10-A03,16
38231-S04-A00,17
39790-S10-A03,17
38231-S04-A00,18
39790-S10-A03,18
38231-S04-A00,19
39795-ST7-000,75
57019-SN7-000,75
38251-SV4-911,75
57119-SN7-003,75
57017-SV4-A02,75
39795-ST7-000,76
57019-SN7-000,76
38251-SV4-911,76
57119-SN7-003,76
57017-SV4-A02,76

What I want to do is reformat this data so that there is only one row per CategoryID, for example:

^{pr2}$

I haven't found a way to do this programmatically in Excel, and I have more than 100,000 rows. Is there a way to do this with Python's CSV reading and writing?


3 answers

I would read the whole file and build a dictionary whose keys are the category IDs and whose values are lists of the other data.

data = {}
with open("test.csv", "r") as f:
    for line in f:
        temp = line.rstrip().split(',')
        # Data rows have a three-part, dash-separated first field; this check skips the header.
        if len(temp[0].split('-')) == 3:
            if temp[1] in data:
                data[temp[1]].append(temp[0])
            else:
                data[temp[1]] = [temp[0]]

with open("output.csv", "w") as f:
    # items() works on both Python 2 and Python 3 (iteritems() exists only in Python 2).
    for cat_id, masters in data.items():
        f.write("{},{}\n".format(cat_id, ','.join(masters)))
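
The same idea can be sketched with the `csv` module and `collections.defaultdict`, which removes the explicit membership check. This is only a minimal Python 3 sketch, not the answer author's code: the function name `group_by_category` and its parameters are assumptions made for illustration, and it assumes the first row of the input is a header.

```python
import csv
from collections import defaultdict


def group_by_category(in_path, out_path):
    """Write one row per CategoryID: the ID followed by its HA-MASTER values."""
    groups = defaultdict(list)
    with open(in_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (assumed to be present)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            ha_master, cat_id = row[0], row[1]
            groups[cat_id].append(ha_master)

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        # dicts keep insertion order on Python 3.7+, so categories are
        # written in the order they first appear in the input.
        for cat_id, masters in groups.items():
            writer.writerow([cat_id] + masters)
    return groups
```

Calling `group_by_category("test.csv", "output.csv")` would read the sample file from the question and write the grouped rows; letting `csv.writer` build each row also takes care of quoting if a value ever contains a comma.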

Yes, there is a way:

import csv

myDict = {}

def addRowToDict(row):
    # group rows by CategoryID (second column)
    key = row[1]
    if key in myDict:
        # append values if entry already exists
        myDict[key].append(row[0])
    else:
        # create entry; the key comes first so it becomes the first output column
        myDict[key] = [row[1], row[0]]

inFile='C:/Users/xxx/Desktop/pythons/test.csv'
outFile='C:/Users/xxx/Desktop/pythons/testOut.csv'

with open(inFile, 'r') as f:
    reader = csv.reader(f)
    ignore=True
    for row in reader:
        if ignore:
            #ignore first row
            ignore=False
        else:
            #add entry to dict
            addRowToDict(row)


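# Python 3: newline='' stops the csv module from writing blank lines between rows on Windows
with open(outFile, 'w', newline='') as f:
    writer = csv.writer(f)
    # write everything to file (values() replaces the Python 2 only itervalues())
    writer.writerows(myDict.values())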

Just edit the input and output file paths.

This is straightforward with a dictionary of lists (a Python 2.7 solution):

#!/usr/bin/env python
import fileinput

categories={}
for line in fileinput.input():
    # Skip the first line in the file (assuming it is a header).
    if fileinput.isfirstline():
        continue

    # Split the input line into two fields.   
    ha_master, cat_id = line.strip().split(',')

    # If the given category id is NOT already in the dictionary,
    # add a new empty list.
    if cat_id not in categories:
        categories[cat_id] = []

    # Append a new value to the category.
    categories[cat_id].append(ha_master)

# Iterate over all category IDs and lists.  Use ','.join() to
# output a comma-separated line from a Python list.
for k, v in categories.items():
    print('%s,%s' % (k, ','.join(v)))
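
Note that a plain dictionary in Python 2.7 does not guarantee any iteration order, so the categories may come out in arbitrary order (from Python 3.7 on, dictionaries keep insertion order). If the output should be ordered by category, one option is to sort the keys numerically before printing. A minimal sketch, assuming every category ID is an integer string; `format_sorted` is a hypothetical helper name, not part of the answer above:

```python
def format_sorted(categories):
    """Return one 'id,value,value,...' line per category, ordered by numeric ID."""
    lines = []
    # key=int sorts "8" before "75", which a plain string sort would not do.
    for cat_id in sorted(categories, key=int):
        lines.append("%s,%s" % (cat_id, ",".join(categories[cat_id])))
    return lines


# Small sample in the same shape as the dictionary built above.
categories = {
    "75": ["39795-ST7-000", "57019-SN7-000"],
    "14": ["38231-S04-A00", "39790-S10-A03"],
}
for line in format_sorted(categories):
    print(line)
```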
