Extracting various pieces of information

Posted 2024-10-02 18:16:07


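Overview

Before writing to another CSV file, I want to extract various pieces of information, such as the name, date, and address, from a two-column CSV file.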

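Requirements

  1. Take the name from the first row, since the name is always the first row.
  2. Extract the date with a regex (does Python have regex?); the format is ##/##/####.
  3. Extract the address using the constant keyword "road".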
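
Python does include regular expressions, in the standard-library `re` module. The three rules above could be sketched roughly as follows. This assumes a date is one or two digits, a slash, one or two digits, a slash, and four digits, because the sample value `2/06/2016` would not match a strict `##/##/####`. The helper name `classify` and the pattern are illustrative assumptions, not part of the original post.

```python
import re

# Assumed date shape: 1-2 digits / 1-2 digits / 4 digits, e.g. 2/06/2016
DATE_RE = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")


def classify(value, is_first):
    """Return 'name', 'date', 'address', or 'unknown' for one DATA cell.

    `is_first` says whether this is the first row seen for its ID.
    """
    value = value.strip()  # cells in the sample carry stray whitespace
    if is_first:
        return "name"  # rule 1: the first row is always the name
    if DATE_RE.fullmatch(value):
        return "date"  # rule 2: the whole cell looks like a date
    if "road" in value.lower():
        return "address"  # rule 3: the keyword "road" marks an address
    return "unknown"
```

With the sample data, `classify("2/06/2016        ", False)` would be treated as a date and `classify("new issac road", False)` as an address. Matching "road" case-insensitively is a design choice made here, not something the question specifies.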

Sample format of the dummy source CSV data, as viewed in Excel


ID,DATA
88888,DADDY
88888,2/06/2016
88888,new issac road
99999,MUMMY
99999,samsung road
99999,12/02/2016

Expected CSV result

ID,Name,Address,DATE
88888,DADDY,new issac road,2/06/2016
99999,MUMMY,samsung road,12/02/2016

What I have so far:

import csv
from collections import defaultdict

columns = defaultdict(list)  # each value in each column is appended to a list

with open('dummy_data.csv', newline='') as f:
    reader = csv.DictReader(f)  # read rows into a dictionary format
    for row in reader:  # read a row as {column1: value1, column2: value2, ...}
        for k, v in row.items():  # go over each column name and value
            columns[k].append(v)  # append the value to the list for column k

# Note: the sample above names this column ID, so 'receipt_id' yields an
# empty list for that file; it is presumably the header of the real data.
uniqueidstatement = columns['receipt_id']

print(uniqueidstatement)

with open("wtf.csv", 'w', newline='') as resultFile:
    wr = csv.writer(resultFile, dialect='excel')
    wr.writerow(uniqueidstatement)

1 answer

Answer 1 · posted 2024-10-02 18:16:07

You can group the rows by ID and, with some simple logic, work out from each group which entry is the date and which is the address.

import csv
from itertools import groupby
from operator import itemgetter

with open("test.csv", newline="") as f, open("out.csv", "w", newline="") as out:
    reader = csv.reader(f)
    next(reader)  # skip the ID,DATA header row
    writer = csv.writer(out)
    writer.writerow(["ID", "NAME", "ADDRESS", "DATE"])
    # groupby only merges consecutive rows, so rows sharing an ID must be adjacent
    groups = groupby(reader, key=itemgetter(0))
    for k, v in groups:
        id_, name = next(v)  # the first row of each group is the name
        # the next two rows are the date and the address, in either order
        add_date_1, add_date_2 = next(v)[1], next(v)[1]
        date, add = (add_date_1, add_date_2) if "road" in add_date_2 else (add_date_2, add_date_1)
        writer.writerow([id_, name, add, date])
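
If the rows for one ID might not be adjacent, or a group might lack a date or an address, a dictionary keyed by ID is a more forgiving alternative to `groupby`. The following is only a sketch under assumptions: `collect_records` is a hypothetical helper, the date pattern accepts one- or two-digit day and month (as in the sample `2/06/2016`), and the sample data is read from an in-memory string rather than a file so the example is self-contained.

```python
import csv
import io
import re

DATE_RE = re.compile(r"\d{1,2}/\d{1,2}/\d{4}")  # assumed date shape


def collect_records(rows):
    """Build {id: {'name', 'address', 'date'}} from (ID, DATA) rows."""
    records = {}  # dicts keep insertion order, so IDs come out as first seen
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        id_, value = row[0].strip(), row[1].strip()
        if id_ not in records:
            # the first row seen for an ID is taken as the name
            records[id_] = {"name": value, "address": "", "date": ""}
        elif DATE_RE.fullmatch(value):
            records[id_]["date"] = value
        elif "road" in value.lower():
            records[id_]["address"] = value
    return records


sample = """ID,DATA
88888,DADDY
88888,2/06/2016
88888,new issac road
99999,MUMMY
99999,samsung road
99999,12/02/2016
"""

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row
records = collect_records(reader)

out = io.StringIO()  # stands in for open("out.csv", "w", newline="")
writer = csv.writer(out)
writer.writerow(["ID", "Name", "Address", "DATE"])
for id_, rec in records.items():
    writer.writerow([id_, rec["name"], rec["address"], rec["date"]])
print(out.getvalue())
```

Compared with the `groupby` version, this does not require exactly three rows per ID; a missing date or address is simply left as an empty field, and values that match neither rule are ignored.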
