I have a file, example.log, which contains:
<POOR_IN200901UV xmlns="urn:hl7-org:v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ITSVersion="XML_1.0" xsi:schemaLocation="urn:hl7-org:v3 ../../Schemas/POOR_IN200901UV20.xsd">\n\t<!-- \xe6\xb6\x88\xe6\x81\xafID -->\n\t<id extension="BS002"/>
I want to read the file, convert the str to UTF-8, and write the result to a new file. My current code is:
import re

with open("example_decoded.log", 'w') as f:
    for line in open("example.log", 'r', encoding='utf-8'):
        # Only process lines that contain the message root element
        m = re.search("<POOR_IN200901UV", line)
        if m:
            # Keep everything from the root element on, minus the last two characters
            line = line[m.start():-2]
            line_bytes = bytes(line, encoding='raw_unicode_escape')
            line_decoded = line_bytes.decode('utf-8')
            print(line_decoded)
            f.write(line_decoded)
        else:
            pass
But example_decoded.log still contains the escape sequences unchanged:
<POOR_IN200901UV xmlns="urn:hl7-org:v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ITSVersion="XML_1.0" xsi:schemaLocation="urn:hl7-org:v3 ../../Schemas/POOR_IN200901UV20.xsd">\n\t<!-- \xe6\xb6\x88\xe6\x81\xafID -->\n\t<id extension="BS002"
See: Read hex characters and convert them to utf-8 using python 3
The solution is given there, although I don't understand why it calls encode('latin-1') first. Can someone explain?
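The answer's exact code is not reproduced on this page, so what follows is only a sketch of the kind of conversion chain it seems to refer to. The function name `decode_escaped_utf8` and the sample string are assumptions made for illustration. The reason for `latin-1` is that it maps code points 0-255 one-to-one onto byte values 0-255: the first call turns the str into bytes without altering the backslash sequences, and the second turns each `U+00NN` produced by `unicode_escape` back into the raw byte `0xNN`, which can then be decoded as UTF-8.

```python
def decode_escaped_utf8(text: str) -> str:
    """Turn literal escapes such as \\xe6\\xb6\\x88 into the UTF-8 text they spell."""
    # Step 1: latin-1 keeps every character (0-255) as the byte with the same value,
    # so the backslash sequences survive untouched. This raises UnicodeEncodeError
    # if the input already contains characters above U+00FF.
    raw = text.encode("latin-1")
    # Step 2: unicode_escape interprets \n, \t and \xNN; each \xNN becomes U+00NN.
    unescaped = raw.decode("unicode_escape")
    # Step 3: latin-1 again, so that U+00NN becomes the single byte 0xNN.
    utf8_bytes = unescaped.encode("latin-1")
    # Step 4: the bytes are now genuine UTF-8 and can be decoded normally.
    return utf8_bytes.decode("utf-8")


# Hypothetical sample: a fragment of the log line, with the escapes as literal text.
sample = r'<!-- \xe6\xb6\x88\xe6\x81\xafID -->\n\t<id extension="BS002"/>'
print(decode_escaped_utf8(sample))
```

In the question's loop, this would replace the `bytes(line, encoding='raw_unicode_escape')` and `decode('utf-8')` steps. Opening the output file with `encoding='utf-8'` avoids relying on the platform default when writing the decoded text.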
See the links below for how to specify your own endianness and type in place of ">f":
https://docs.python.org/3/library/struct.html
https://docs.python.org/3/library/codecs.html
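The code this answer originally commented on is not shown here, so the following is only a small illustration of what a format string such as ">f" means in the standard `struct` module. The values 1.0 and b"\x01\x02" are chosen purely as examples.

```python
import struct

# The first character selects byte order: ">" is big-endian, "<" is little-endian.
# The second selects the type: "f" is a 4-byte float.
big = struct.pack(">f", 1.0)
little = struct.pack("<f", 1.0)
print(big.hex())     # 3f800000
print(little.hex())  # 0000803f

# Swapping the type code changes how the bytes are interpreted:
# "H" is an unsigned 16-bit integer.
(value,) = struct.unpack(">H", b"\x01\x02")
print(value)  # 258
```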