Dropping rows that contain a run of blank fields from a CSV file

Posted 2024-09-29 19:28:35


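I am trying to write a Python program to clean survey data from a CSV file. I want to drop rows that contain a run of blank fields, such as the first and third rows in the example below.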

"1","a","b","c",,,,,
"2","a","b","c","d","e","f",,"h"
"3","a","b","c",,,,,
"4","a","z","u","d","i","f","x","h"
"5","d","c","c",,"c","f","g","z"

Here is my unsuccessful code:

import csv

fname = raw_input("Enter input file name: ")
if len(fname) < 1 : fname = "survey.csv"

foutput = raw_input("Enter output file name: ")
if len(foutput) < 1 : foutput = "output_"+fname


input = open(fname, 'rb')
output = open(foutput, 'wb')


searchFor = 5*['']

writer = csv.writer(output)

for row in csv.reader(input):
    if searchFor not in row :
        writer.writerow(row)

input.close()
output.close()

2 Answers

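Use collections.Counter to check whether one list is contained in another, counting duplicates, as shown below. Note that this counts blank fields anywhere in the row, not only consecutive ones. If you instead want to strip the empty elements from a row, pass None, bool, or len to filter to discard the blanks: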

import csv
from itertools import repeat
from collections import Counter
input = open(fname, 'rb')
output = open(foutput, 'wb')

writer = csv.writer(output)
# Helper: True if every element of list1 occurs in list2 at least as many times
def counterSubset(list1, list2):
    c1, c2 = Counter(list1), Counter(list2)
    for k, n in c1.items():
        if n > c2[k]:
            return False
    return True

for row in csv.reader(input):
    # 5 is the number of blank fields that disqualifies a row; change it as needed
    if not counterSubset(list(repeat('', 5)), row):
        # to strip empty elements instead, write filter(None, row),
        # filter(bool, row) or filter(len, row)
        writer.writerow(row)
input.close()
output.close()

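Output (these rows do not match the sample in the question, so it appears to come from different test data):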

1,a,b,c,,
2,a,b,c,d,e,f,g,h
4,a,,z,u,d,i,f,x,h
5,d,c,c,d,c,f,g,z
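
As a rough illustration of the points above, here is a minimal Python 3 sketch. The helper name has_at_least is hypothetical and not part of the original answer. It shows why the test in the question never matches (the in operator looks for a single element equal to the whole list), how a Counter lookup counts the blanks, and that filter returns an iterator in Python 3, so it needs list() around it.

```python
from collections import Counter


def has_at_least(row, value, n):
    """Hypothetical helper: True if `value` occurs at least `n` times anywhere in `row`."""
    return Counter(row)[value] >= n


# First row of the sample in the question, as csv.reader would return it.
row = ["1", "a", "b", "c", "", "", "", "", ""]

# The question's test: `in` checks whether one element of `row` equals the
# whole list 5 * [""], which is never the case for a row of strings.
list_is_member = 5 * [""] in row

# Counting the blanks works regardless of where they sit in the row.
enough_blanks = has_at_least(row, "", 5)

# filter() returns an iterator in Python 3, so wrap it in list().
non_blank = list(filter(None, row))

print(list_is_member, enough_blanks, non_blank)
```

Since list_is_member is always False for such rows, the condition "searchFor not in row" in the question is always True and every row gets written.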

How about:

# csv.reader returns empty fields as empty strings by default
blank_item = ""

for row in csv.reader(input):
    # collect the blank fields in this row
    blanks = [x for x in row if x == blank_item]
    if len(blanks) < 5:
        writer.writerow(row)

This counts the number of blank fields in a row and lets you drop rows based on that count.
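
Counting blanks this way ignores where they sit in the row. If the requirement is specifically a run of consecutive blank fields, one possible approach is to group adjacent fields with itertools.groupby. The following is a minimal Python 3 sketch under that assumption. The names longest_blank_run and drop_blank_runs are hypothetical, and io.StringIO stands in for the real input and output files so the example is self-contained.

```python
import csv
import io
from itertools import groupby


def longest_blank_run(row):
    """Length of the longest run of consecutive empty-string fields in `row`."""
    runs = (
        sum(1 for _ in group)
        for is_blank, group in groupby(row, key=lambda field: field == "")
        if is_blank
    )
    return max(runs, default=0)


def drop_blank_runs(src, dst, min_run=5):
    """Copy CSV rows from `src` to `dst`, skipping rows with `min_run` or more consecutive blanks."""
    writer = csv.writer(dst)
    for row in csv.reader(src):
        if longest_blank_run(row) < min_run:
            writer.writerow(row)


# Sample data from the question.
sample = (
    '"1","a","b","c",,,,,\n'
    '"2","a","b","c","d","e","f",,"h"\n'
    '"3","a","b","c",,,,,\n'
    '"4","a","z","u","d","i","f","x","h"\n'
    '"5","d","c","c",,"c","f","g","z"\n'
)

src = io.StringIO(sample, newline="")
dst = io.StringIO(newline="")
drop_blank_runs(src, dst)

result = dst.getvalue().splitlines()
print(result)
```

With real files in Python 3, open both with newline="" (for example open(fname, newline="") and open(foutput, "w", newline="")) rather than the 'rb' and 'wb' modes used with the Python 2 csv module.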
