Inserting a blank into an output file pointer (ofp)

Posted 2024-10-06 12:31:15

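The purpose of this script is to take an incoming CSV file, read it with DictReader, take the keys that were read, and check whether they match the values pre-specified in the fieldMap dictionary. If they match, those keys are appended to my hdrlist. The header list is then written to the output file, called ofp.

The problem I am having is that when none of the keys match the pre-specified values in fieldMap for a given field, I need to insert a blank ('') in its place.

I have tried appending an empty value to hdrlist in an else statement, and adding a blank key:value pair to the fieldMap dictionary: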

if row.has_key(ft_test):
    hdrlist.append(ft_test)
else:
    hdrlist.append('')


'':[''] #blank key:value pair

but my

if hdrlen != len(hdrlist)-1:
    print "Cannot find a key for %s in file %s" % (ft,fn)

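error-handling check prints more messages than I think it should, and I don't know why.

If anyone could shed light on how to insert a blank into my ofp.write(fmtstring), it would be greatly appreciated.

Also, if anyone could tell me why I am getting more print statements than I expect with the else statement above, that would be greatly appreciated too.

My entire script is below. If any other information is needed to help with this code, I will gladly provide it.

Here is an example of an input file that will produce multiple print statements.

input_file.csv = {'cust_no':1,'streetaddr':'2103 Union Ave','address2':'','city':'Chicago'}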

#!/usr/bin/env python
import sys, csv, glob

fieldMap = {'zipcode':['Zip5', 'zip9','zipcode','ZIP','zip_code','zip','ZIPCODE'],
        'firstname':['firstname','FIRSTNAME'],
        'lastname':['lastname','LASTNAME'],
        'cust_no':['cust_no','CUST_NO'],
        'user_name':['user_name','USER_NAME'],
        'status':['status','STATUS'],
        'cancel_date':['cancel_date','CANCEL_DATE'],
        'reject_date':['REJECT_DATE','reject_date'],
        'streetaddr':['streetaddr','STREETADDR','ADDRESS','address'],
        'streetno':['streetno','STREETNO'],
        'streetnm':['streetnm','STREETNM'],
        'suffix':['suffix','SUFFIX'], #suffix of street name: dr, ave, st
        'city':['city','CITY'],
        'state':['state','STATE'],
        'phone_home':['phone_home','PHONE_HOME'],
        'email':['email','EMAIL'],
        '':['']
        }


def readFile(fn,ofp):
    count = 0
    CSVreader = csv.DictReader(open(fn,'rb'), dialect='excel', delimiter=',')
    for row in CSVreader:
        count+= 1
        if count == 1:
            hdrlist = []
            for ft in fieldMap.keys():
                hdrlen = len(hdrlist)
                for ft_test in fieldMap[ft]:
                    if row.has_key(ft_test):
                        hdrlist.append(ft_test)
                if hdrlen != len(hdrlist)-1:
                    print "Cannot find a key for %s in file %s" % (ft,fn)


        if len(hdrlist) != 16:
            print "Another error. Not all header's have been assigned new values."
        if count < 5:
            x=len(hdrlist)
            fmtstring = "%s\t" * len(hdrlist) % tuple(row[x] for x in hdrlist)
            ofp.write(fmtstring)
            break

if __name__ == '__main__':

    filenames = glob.glob(sys.argv[1])
    ofp = sys.stdout
    ofp.write("zipcode\tfirstname\tlastname\tcust_no\tuser_name\tstatus\t"
              "cancel_date\treject_date\tstreetaddr\tstreetno\tstreetnm\t"
              "suffix\tcity\tstate\tphone_home\temail")

    for filename in filenames:
        readFile(filename,ofp)

Sample data:

cust_no,status,streetaddr,address2,city,state,zipcode,billaddr,servaddr,title,latitude,longitude,custsize,telemarket,dirmail,nocredhold,email,phone_home,phone_work,phone_fax,phone_page,phone_cell,phone_othr,taxrate1,taxrate2,taxrate3,taxtot,company,firstname,lastname,user_name,dpbc,container,seq,paytype_am,paytype_di,paytype_mc,paytype_vi
0,0,'123 fake st.',,'chicago','il',60185,'123 billaddr st.','123 servaddr st.','mr.',43.123,54.234 ,2000,'TRUE','TRUE','TRUE','email@email.com',(666)555-6666,,,,,,,,,,,'bob','smith','bob smith',,,,'TRUE','TRUE','TRUE','TRUE'
0,0,'123 fake st.','','chicago','il',60185,'123 billaddr st.','123 servaddr st.','mr.',43.123,54.234 ,2000,'TRUE','TRUE','TRUE','email@email.com',(666)555-6666,'','','','','','','','','','','bob','smith','bob smith','','','','TRUE','TRUE','TRUE','TRUE'

1 answer
User
Answer #1 · Posted 2024-10-06 12:31:15

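If all you need is an hdrlist of the recognized field names in the CSV file being processed, you can build it right after creating the DictReader by comparing the values in its fieldnames attribute with the contents of fieldMap. This works because a DictReader created without a fieldnames argument takes its field names from the file's header row automatically.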
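
As a minimal sketch of that behaviour (written for Python 3 and using an in-memory `io.StringIO` in place of a real file, both of which are assumptions for illustration and not part of the original script), the header row becomes available through `fieldnames` and is not returned as a data row:

```python
import csv
import io

# Hypothetical two-line CSV standing in for a real input file.
sample = io.StringIO(
    "cust_no,streetaddr,address2,city\n"
    "1,2103 Union Ave,,Chicago\n"
)

# No fieldnames argument is given, so the first row supplies the field names.
reader = csv.DictReader(sample)
header = reader.fieldnames   # the header row is consumed here
rows = list(reader)          # only the data rows remain

print(header)
print(rows[0]["city"])
```

The empty `address2` column comes back as an empty string, so a column that exists but holds no data is still distinguishable from a column that is missing from the file.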

I also changed your fieldMap dictionary to an OrderedDict so that it preserves the order of its keys.
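
A plain dict in the Python 2 versions this script targets does not guarantee any iteration order, so the output columns could come out in a different order from a hard-coded header line. The small sketch below uses a shortened, hypothetical three-entry map (and Python 3 syntax, an assumption for illustration) to show that an OrderedDict iterates in insertion order, which is what makes joining its keys produce a predictable header:

```python
from collections import OrderedDict

# Shortened, hypothetical version of fieldMap with only three entries.
field_map = OrderedDict([
    ("zipcode", ["zipcode", "ZIP"]),
    ("firstname", ["firstname", "FIRSTNAME"]),
    ("cust_no", ["cust_no", "CUST_NO"]),
])

# Iterating an OrderedDict yields its keys in insertion order,
# so the header columns always appear in the order they were declared.
header_line = "\t".join(field_map)
print(repr(header_line))
```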

import glob
from collections import OrderedDict
import csv
import sys

fieldMap = OrderedDict([
    ('zipcode', ['zipcode', 'ZIPCODE', 'Zip5', 'zip9', 'ZIP', 'zip_code', 'zip']),
    ('firstname', ['firstname', 'FIRSTNAME']),
    ('lastname', ['lastname', 'LASTNAME']),
    ('cust_no', ['cust_no', 'CUST_NO']),
    ('user_name', ['user_name', 'USER_NAME']),
    ('status', ['status', 'STATUS']),
    ('cancel_date', ['cancel_date', 'CANCEL_DATE']),
    ('reject_date', ['reject_date', 'REJECT_DATE']),
    ('streetaddr', ['streetaddr', 'STREETADDR', 'ADDRESS', 'address']),
    ('streetno', ['streetno', 'STREETNO']),
    ('streetnm', ['streetnm', 'STREETNM']),
    ('suffix', ['suffix', 'SUFFIX']),  # suffix of street name: dr, ave, st
    ('city', ['city', 'CITY']),
    ('state', ['state', 'STATE']),
    ('phone_home', ['phone_home', 'PHONE_HOME']),
    ('email', ['email', 'EMAIL']),
])

def readFile(fn,ofp):
    with open(fn, 'rb') as csvfile:
        # the following reads the header line into csvReader.fieldnames
        csvReader = csv.DictReader(csvfile, dialect='excel', delimiter=',')
        # create a list of recognized fieldnames in the csv file
        hdrlist = []
        for ft in fieldMap:
            for ft_test in fieldMap[ft]:
                if ft_test in csvReader.fieldnames:
                    hdrlist.append(ft_test)
                    break
            else:
                hdrlist.append(None)  # placeholder (could  also be '')
        hdrlen = len(hdrlist)
        ofp.write('hdrlist: {}\n'.format(hdrlist))
        # hdrlist always has one entry per fieldMap key, so look for placeholders
        if None in hdrlist:
            print "Note that not all field names were present in file."

        ofp.write("\t".join(fieldMap) + '\n')
        for row in csvReader:
            fmtstring = "%s\t" * hdrlen % tuple(
                row[field] if field else 'NA' for field in hdrlist)
            ofp.write(fmtstring+'\n')

if __name__ == '__main__':
#    sys.argv = [sys.argv[0], 'ofp_input.csv']  # hardcode for testing
    if len(sys.argv) != 2:
        print "Error:  Filename argument missing!"
        sys.exit(-1)
    filenames = glob.glob(sys.argv[1])
    ofp = sys.stdout
    for filename in filenames:
        readFile(filename, ofp)
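
The placeholder lookup at the heart of readFile can also be tried in isolation. The sketch below is written for Python 3 and uses a shortened map plus two hypothetical helper names, `build_hdrlist` and `format_row`, none of which appear in the original script. It relies on the `for ... else` construct, whose `else` branch runs only when the loop finishes without hitting `break`, to append `None` for every field that has no matching column:

```python
from collections import OrderedDict

# Shortened, hypothetical map: canonical name -> accepted header spellings.
field_map = OrderedDict([
    ("zipcode", ["zipcode", "ZIP"]),
    ("cust_no", ["cust_no", "CUST_NO"]),
    ("city", ["city", "CITY"]),
])

def build_hdrlist(fieldnames, mapping):
    """Return one entry per mapping key: the matching column name or None."""
    hdrlist = []
    for canonical in mapping:
        for candidate in mapping[canonical]:
            if candidate in fieldnames:
                hdrlist.append(candidate)
                break
        else:
            # The inner loop ended without break: no spelling matched.
            hdrlist.append(None)
    return hdrlist

def format_row(row, hdrlist, placeholder="NA"):
    """Join the row's values with tabs, using placeholder for missing fields."""
    return "\t".join(row[name] if name else placeholder for name in hdrlist)

fieldnames = ["cust_no", "streetaddr", "address2", "city"]
row = {"cust_no": "1", "streetaddr": "2103 Union Ave",
       "address2": "", "city": "Chicago"}

hdrlist = build_hdrlist(fieldnames, field_map)
missing = [key for key, name in zip(field_map, hdrlist) if name is None]
line = format_row(row, hdrlist)
print(hdrlist, missing, repr(line))
```

Because the list built this way always has one entry per key in the map, a check such as `None in hdrlist` detects missing fields, whereas comparing its length with the map's length cannot. Applied to the full 16-field map, the question's sample header (`cust_no`, `streetaddr`, `address2`, `city`) matches only `cust_no`, `streetaddr`, and `city`, so a per-field "Cannot find a key" message would be expected for each of the other 13 fields (plus the blank `''` entry, if it is kept in the map).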