Read many CSV files and write them as UTF-8 with Python

Published 2024-09-30 14:23:51


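I'm using Python code to read many CSV files and set their encoding to UTF-8. When I read a file I can read all of its lines, but when I write it, only one line gets written. Please help me check my code, shown below: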

def convert_files(files, ascii, to="utf-8"):
    for name in files:
        #print ("Convert {0} from {1} to {2}").format(name, ascii, to)
        with open(name) as f:
            print(name)
            count = 0
            lineno = 0
            #this point I want to write the below text into my each new file at the first line
            #file_source.write('id;nom;prenom;nom_pere;nom_mere;prenom_pere;prenom_mere;civilite (1=homme 2=f);date_naissance;arrondissement;adresse;ville;code_postal;pays;telephone;email;civilite_demandeur (1=homme 2=f);nom_demandeur;prenom_demandeur;qualite_demandeur;type_acte;nombre_actes\n')
            for line in f.readlines():
                lineno +=1
                if lineno == 1 :
                    continue
                file_source = open(name, mode='w', encoding='utf-8', errors='ignore')
                #pass
                #print (line)
                # start write data to to new file with encode

                file_source.write(line)
                #file_source.close

#print unicode(line, "cp866").encode("utf-8")   
csv_files = find_csv_filenames('./csv', ".csv")
convert_files(csv_files, "cp866")  

3 Answers

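If you only need to change a file's character encoding, it doesn't matter whether the files are CSV files, unless the conversion could change which characters are interpreted as delimiters, quote characters, and so on: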

def convert(filename, from_encoding, to_encoding):
    with open(filename, newline='', encoding=from_encoding) as file:
        data = file.read().encode(to_encoding)
    with open(filename, 'wb') as outfile:
        outfile.write(data)

for path in csv_files:
    convert(path, "cp866", "utf-8")

Add the errors argument to change how encoding/decoding errors are handled.
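To make the effect of the errors argument concrete, here is a small sketch that decodes a byte string containing a byte that is never valid in UTF-8. The sample bytes are made up for illustration; the same values ("strict", "ignore", "replace") can be passed as open(filename, encoding=..., errors=...).

```python
# Hypothetical sample data: the byte 0xFF cannot appear in valid UTF-8.
raw = b"abc\xffdef"

# errors="strict" is the default and raises on the bad byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="ignore" silently drops the bad byte.
ignored = raw.decode("utf-8", errors="ignore")

# errors="replace" substitutes U+FFFD (the replacement character).
replaced = raw.decode("utf-8", errors="replace")

print(strict_failed, ignored, replaced)
```

Note that "ignore" loses data without any warning, so "strict" or "replace" is usually the safer choice while checking whether the source encoding was guessed correctly.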

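If the files may be large, the data can be converted incrementally, reading and writing one chunk at a time instead of loading the whole file into memory.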
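One possible way to convert incrementally is sketched below. It is not the answer's original code: the function name convert_incrementally and the chunk_size parameter are assumptions made for this example. It copies the text chunk by chunk into a temporary file in the same directory, then swaps that file into place.

```python
import os
import shutil
import tempfile


def convert_incrementally(filename, from_encoding, to_encoding,
                          chunk_size=64 * 1024):
    """Re-encode filename in place, copying one chunk of text at a time."""
    directory = os.path.dirname(os.path.abspath(filename))
    # newline='' on both sides keeps the original line endings untouched.
    with open(filename, newline='', encoding=from_encoding) as src, \
            tempfile.NamedTemporaryFile('w', newline='',
                                        encoding=to_encoding,
                                        dir=directory,
                                        delete=False) as dst:
        # copyfileobj reads and writes chunk_size characters per step.
        shutil.copyfileobj(src, dst, chunk_size)
        tmp_name = dst.name
    # Replace the original only after the copy has finished and both
    # files are closed.
    os.replace(tmp_name, filename)
```

Writing to a separate temporary file means the original is left untouched if decoding fails partway through.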

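The output file is reopened in 'w' mode on every iteration of the loop, which truncates it each time, so only the last line survives: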

for line in f.readlines():
    lineno +=1
    if lineno == 1 :
        continue
    #move the following line outside of the for block
    file_source = open(name, mode='w', encoding='utf-8', errors='ignore')
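A minimal sketch of the corrected loop follows. Because the question reads and writes the same path, the lines have to be read before the file is opened with mode='w', since that open truncates it. The function name rewrite_with_header and the header parameter are hypothetical, standing in for the header line the question wants at the top of each file.

```python
def rewrite_with_header(name, from_encoding, header, to="utf-8"):
    # Read everything first: opening the same path with mode='w'
    # would empty the file before it could be read.
    with open(name, encoding=from_encoding, newline='') as f:
        lines = f.readlines()

    # Open the output once, outside the loop, so all lines go to
    # the same file object.
    with open(name, mode='w', encoding=to, newline='') as out:
        out.write(header)
        # Skip the original first line, as the question's code does.
        for line in lines[1:]:
            out.write(line)
```

For example, rewrite_with_header(name, "cp866", "id;nom;prenom\n") would replace the first line of the file with the given header and write the remaining lines as UTF-8.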

You can do it like this:

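def convert_files(files, ascii, to="utf-8"):
    for name in files:
        # Open in binary read/write mode so the raw bytes can be re-encoded in place.
        with open(name, 'rb+') as f:
            # Decode from the source encoding, then encode to the target encoding.
            data = f.read().decode(ascii).encode(to)
            f.seek(0)
            f.write(data)
            # Drop any leftover bytes if the new content is shorter than the old.
            f.truncate()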
