python xml.etree.ElementTree tostring() / fromstring() round trip fails for Unicode code points

Posted 2024-09-27 20:19:24


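I'm using xml.etree.ElementTree in Python 2.7 and have a problem round-tripping strings. If the tree contains non-ASCII Unicode characters, calling ET.fromstring() on the output of ET.tostring() fails.

Why doesn't this work? Given that ElementTree wants byte streams and does its own decoding, why does it default to an ASCII parser? Is this determined by something I'm overlooking, such as the encoding of the Python file or the locale?

ASCII characters only: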

import xml.etree.ElementTree as ET

t1 = ET.Element('test')
t1.text = u'hello world'
t1_roundtrip = ET.fromstring(ET.tostring(t1, encoding='utf8', method='xml'))
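# ET.dump() prints to stdout and returns None, so compare the serialized forms instead
assert ET.tostring(t1) == ET.tostring(t1_roundtrip)

Fails with a Unicode code point: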

import xml.etree.ElementTree as ET

t2 = ET.Element('test')
t2.text = u'\u2603'
t2_roundtrip = ET.fromstring(ET.tostring(t2, encoding='utf8', method='xml'))

>>> t2_roundtrip = ET.fromstring(ET.tostring(t2, encoding='utf8', method='xml'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/rh/python27/root/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1300, in XML
    parser.feed(text)
  File "/opt/rh/python27/root/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1642, in feed
    self._raiseerror(v)
  File "/opt/rh/python27/root/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1506, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 2, column 6
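
One thing worth checking, offered as an assumption rather than a confirmed diagnosis: the position in the traceback (line 2, column 6) is consistent with line 1 being an XML declaration that carries the encoding name exactly as it was passed in, and line 2 being `<test>` followed by the snowman. If the parser does not treat the unhyphenated name `utf8` as UTF-8, the first non-ASCII byte would be rejected at exactly that point. The sketch below repeats the same round trip with the hyphenated spelling `utf-8`; the helper name `roundtrip` is hypothetical and only there for illustration.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET


def roundtrip(elem, encoding='utf-8'):
    """Serialize elem to bytes, then parse those bytes into a new element.

    Hypothetical helper; the default uses the hyphenated name 'utf-8'
    on the assumption that the spelling 'utf8' is what trips the parser.
    """
    data = ET.tostring(elem, encoding=encoding, method='xml')
    return data, ET.fromstring(data)


t2 = ET.Element('test')
t2.text = u'\u2603'

data, t2_roundtrip = roundtrip(t2)

# Inspect the serialized bytes to see whether a declaration was written
# and how the snowman was encoded.
print(repr(data))

# The parsed text should be the same code point that went in.
print(t2_roundtrip.text == t2.text)
```

Printing the serialized bytes before parsing them is the useful part here: it shows whether an XML declaration was emitted and which encoding name it carries, which narrows the problem down without relying on the file encoding or the locale.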
