How to read in special characters in Python

<pub> <ID>75</ID> <title>Use of Lexicon Density in Evaluating Word Recognizers</title> <year>2000</year> <booktitle>Multiple Classifier Systems</booktitle> <pages>310-319</pages> <authors> <author>Petr Slavík</author> <author>Venu Govindaraju</author> </authors> </pub>

2 answers

User

Answer 1 · edited 2024-06-28 15:41:14

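If you are using Python 3.x, just import html; you can then decode the extracted data first.

html.unescape(s)

Convert all named and numeric character references (e.g. &gt;, &#62;, &#x3e;) in the string s to the corresponding Unicode characters.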

>>> import html
>>> print(html.unescape("Petr Slav&iacute;k"))
Petr Slavík
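To make the three reference forms concrete, here is a minimal sketch that runs html.unescape over a named, a decimal, and a hexadecimal reference for the same character. The variable names are only illustrative.

```python
import html

# Named, decimal, and hexadecimal references for the same character.
refs = ["&gt;", "&#62;", "&#x3e;"]

decoded = [html.unescape(ref) for ref in refs]
for ref, char in zip(refs, decoded):
    print(ref, "->", char)
```

All three forms should decode to the same ">" character.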

It seems minidom cannot parse the HTML character entity and return a Document object. You have to read the file, decode the entities, and then pass the result to the module as a string.

xml.dom.minidom.parseString(string[, parser])

Return a Document that represents the string.
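The answer's original code is not available here, so the following is only a sketch of the approach it describes: decode the HTML entities first, then hand the string to parseString. The sample record and the helper decode_html_entities are hypothetical. The helper deliberately leaves the five entities that XML itself defines (such as &amp; and &lt;) untouched, on the assumption that decoding them could break the markup.

```python
import html
import re
from xml.dom import minidom

# Hypothetical sample that mirrors the record in the question.
raw = (
    "<pub><ID>75</ID>"
    "<authors><author>Petr Slav&iacute;k</author>"
    "<author>Venu Govindaraju</author></authors></pub>"
)

# Entities defined by XML itself; these must stay escaped.
XML_PREDEFINED = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"}


def decode_html_entities(text):
    """Replace HTML named entities with characters, keeping XML's own."""
    def repl(match):
        ref = match.group(0)
        return ref if ref in XML_PREDEFINED else html.unescape(ref)

    return re.sub(r"&[A-Za-z][A-Za-z0-9]*;", repl, text)


doc = minidom.parseString(decode_html_entities(raw))
authors = [node.firstChild.data for node in doc.getElementsByTagName("author")]
print(authors)
```

When reading from a file instead, the same helper could be applied to the text returned by the file's read() call before it is parsed.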


User

Answer 2 · edited 2024-06-28 15:41:14

UTF-8 supports most of these characters, so this should work. Add the following at the end of the code in your example:

.encode('UTF-8')
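As a small illustration of what that call does, the sketch below encodes the author's name to UTF-8 bytes and decodes it back. It assumes the value is already a decoded Python 3 str; the variable names are only illustrative.

```python
name = "Petr Slavík"

# Encoding turns the str into bytes; the accented letter becomes two bytes.
encoded = name.encode("UTF-8")
print(encoded)

# Decoding with the same codec restores the original text.
restored = encoded.decode("UTF-8")
print(restored)
```

In Python 3 the result of encode is a bytes object, so it suits writing to a binary file or a socket rather than further string handling.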
