Pandas read CSV to XML, ValueError: Invalid tag name 'foo bar'

Posted 2024-09-30 18:15:13


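Hi, Stack Overflow friends,

I want to convert a CSV file to XML using Python, and I heard that pandas can handle this task very simply.

Well, it turned out not to be that easy.

This is what my code looks like: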

import pandas as pd
import chardet
from pandas.core.frame import DataFrame

csvFile = '172431-82056.csv'
xmlFile = 'mySecondData.xml'  

def check_encoding(filename):
    """
    input: filename = "filename.csv"
    output: Dictionary = {'encoding': 'UTF-16', 'confidence': 1.0, 'language': ''}
    """
    result= {}
    with open(filename, 'rb') as rawdata:
        result = chardet.detect(rawdata.read(10000))
    return result

def import_csv(filename):
    """
    input: filename = "filename.csv"
    output: pandas DataFrame with the parsed CSV contents
    """
    encoding = check_encoding(filename)['encoding']
    csv_data = pd.read_csv(filename, engine ='python', encoding=encoding, sep = None)
    #print(csv_data)
    return csv_data

#print(import_csv(csvFile))

def convert_to_xml(input_file, output_file):
    csv_data = import_csv(input_file)
    csv_data.to_xml(path_or_buffer=output_file, index = True, root_name='products',row_name='item', elem_cols=['post_title','regular_price'], prefix = 'g:', pretty_print=True)

convert_to_xml(csvFile, xmlFile)

This is what my output looks like:

Traceback (most recent call last):
  File "c:\Users\PavelH\Documents\Git\CSV Converter\csv_converter.py", line 53, in <module>
    convert_to_xml(csvFile, xmlFile)
  File "c:\Users\PavelH\Documents\Git\CSV Converter\csv_converter.py", line 51, in convert_to_xml
    df.to_xml(path_or_buffer=output_file, index = True, root_name='products',row_name='item', prefix = 'g:', pretty_print=True)
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 2986, in to_xml
    return xml_formatter.write_output()
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\formats\xml.py", line 265, in write_output
    xml_doc = self.build_tree()
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\formats\xml.py", line 485, in build_tree
    self.build_elems()
  File "C:\Users\PavelH\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\formats\xml.py", line 575, in build_elems
    SubElement(self.elem_row, elem_name).text = val
  File "src\lxml\etree.pyx", line 3136, in lxml.etree.SubElement
  File "src\lxml\apihelpers.pxi", line 179, in lxml.etree._makeSubElement
  File "src\lxml\apihelpers.pxi", line 1734, in lxml.etree._tagValidOrRaise
ValueError: Invalid tag name 'foo bar'

Are tag names that contain spaces invalid?


2 Answers

I think your pandas is out of date; to_xml was introduced in version 1.3.0. You can check your version with:

# in python shell
import pandas
print(pandas.__version__)
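
The comparison against 1.3.0 can also be done in code rather than by eye. The following is only a minimal sketch: the helper name has_to_xml_support is hypothetical (not part of pandas), and it assumes the version string starts with a numeric "major.minor" pair, which holds for regular pandas releases.

```python
def has_to_xml_support(version: str, minimum=(1, 3)) -> bool:
    """Return True if a version string such as '1.2.4' is at least `minimum`.

    Hypothetical helper: only the leading major and minor numbers are compared.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum


try:
    import pandas

    # Report the installed version and whether it is 1.3 or newer.
    print(pandas.__version__, has_to_xml_support(pandas.__version__))
except ImportError:
    print("pandas is not installed")
```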

If it is older than 1.3.0, you should upgrade pandas:

# in bash shell
pip install --upgrade pandas

The command pip install --upgrade pandas solved the problem.
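
On the literal question about spaces: XML element names cannot contain whitespace, which is what lxml reports in the traceback for 'foo bar'. If upgrading is not an option, one possible workaround is to rename the columns before calling to_xml. The sketch below is an assumption-laden example, not the method used in this thread: the helper name to_xml_tag is hypothetical, and it deliberately restricts names to ASCII letters, digits, '_', '.' and '-', which is stricter than the XML specification requires.

```python
import re


def to_xml_tag(name) -> str:
    """Turn an arbitrary column label into a string usable as an XML element name.

    Hypothetical helper; the ASCII-only character set is a simplifying assumption.
    """
    # Replace every run of characters outside [A-Za-z0-9_.-] with one underscore.
    tag = re.sub(r"[^A-Za-z0-9_.-]+", "_", str(name).strip())
    # An element name must start with a letter or underscore, so prefix otherwise.
    if not re.match(r"[A-Za-z_]", tag):
        tag = "_" + tag
    return tag


print(to_xml_tag("foo bar"))

# Usage with a DataFrame (requires pandas and an XML backend such as lxml):
# csv_data = csv_data.rename(columns=to_xml_tag)
# csv_data.to_xml("mySecondData.xml", index=False)
```

Two different labels can map to the same tag (for example 'foo bar' and 'foo_bar'), so it is worth checking the renamed columns for duplicates before writing the file.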
