How to remove some characters from a string? replace() has no effect

# -*- coding: utf-8 -*-
from prestapyt import PrestaShopWebService
from xml.etree import ElementTree

prestashop = PrestaShopWebService('http://localhost/prestashop/api', 'key')
prestashop.debug = True
name = ElementTree.tostring(prestashop.search('products', options={'display': '[name]', 'filter[id]': '[2]'}), encoding='cp852', method='text')
print name
print name.replace('ł', 'l')

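2 Answers

User

Answer 1 · edited 2024-09-30 16:40:56

You are mixing encodings in byte strings. Below is a short working example that reproduces the problem. I assume you are running in a Windows console that defaults to the cp852 encoding: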

#!python2
# coding: utf-8
from xml.etree import ElementTree as et
name_element = et.Element('data')
name_element.text = u'Naturalne mydło odświeżające'
name = et.tostring(name_element,encoding='cp852', method='text')
print name
print name.replace('ł', 'l')

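Output (no replacement made):

Naturalne mydło odświeżające
Naturalne mydło odświeżające

The reason is that the name string is encoded in cp852, while the byte string constant 'ł' is encoded in the source file's encoding, UTF-8. The two byte sequences differ, so replace() never finds a match: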

print repr(name)
print repr('ł')

Output:

'Naturalne myd\x88o od\x98wie\xbeaj\xa5ce'
'\xc5\x82'
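
The same mismatch can be reproduced without ElementTree. The following is a minimal Python 3 sketch, not part of the original answer: it encodes the sample text by hand and shows that a byte-level replace() only matches when the search bytes use the same encoding as the data. The variable names are illustrative.

```python
# Python 3 sketch: one character, two different byte representations.
text = 'Naturalne mydło odświeżające'

cp852_bytes = text.encode('cp852')    # stands in for the cp852 output of tostring
utf8_needle = 'ł'.encode('utf-8')     # what the literal 'ł' becomes in a UTF-8 source file
cp852_needle = 'ł'.encode('cp852')    # the byte actually present in the data

# Searching cp852 data for UTF-8 bytes finds nothing, so the data is unchanged.
unchanged = cp852_bytes.replace(utf8_needle, b'l')

# Searching with the matching encoding works.
replaced = cp852_bytes.replace(cp852_needle, b'l')

print(utf8_needle, cp852_needle)      # b'\xc5\x82' b'\x88'
print(unchanged == cp852_bytes)       # True
```

Matching the encodings by hand like this works, but it is fragile, which is why decoding to Unicode first is usually the safer route.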

The best solution is to work with Unicode strings:

#!python2
# coding: utf-8
from xml.etree import ElementTree as et
name_element = et.Element('data')
name_element.text = u'Naturalne mydło odświeżające'
name = et.tostring(name_element,encoding='cp852', method='text').decode('cp852')
print name
print name.replace(u'ł', u'l')
print repr(name)
print repr(u'ł')

Output (replacement made):

Naturalne mydło odświeżające
Naturalne mydlo odświeżające
u'Naturalne myd\u0142o od\u015bwie\u017caj\u0105ce'
u'\u0142'

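Note that in Python 3, et.tostring accepts encoding='unicode', and string constants are Unicode by default. The repr() of a string is also more readable there, while ascii() gives the old escaped form. You will also find that Python 3.6 can print Polish even on a console that does not use a Polish code page, so you may not need to replace these characters at all.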

#!python3
# coding: utf-8
from xml.etree import ElementTree as et
name_element = et.Element('data')
name_element.text = 'Naturalne mydło odświeżające'
name = et.tostring(name_element,encoding='unicode', method='text')
print(name)
print(name.replace('ł','l'))
print(repr(name),repr('ł'))
print(ascii(name),ascii('ł'))

Output:

Naturalne mydło odświeżające
Naturalne mydlo odświeżające
'Naturalne mydło odświeżające' 'ł'
'Naturalne myd\u0142o od\u015bwie\u017caj\u0105ce' '\u0142'
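
If the cp852 bytes are kept under Python 3 (for example by passing encoding='cp852' as in the question), the mix-up shows up as an error instead of a silent no-op. The following is a minimal sketch under that assumption, reusing the illustrative `data` element from the examples above:

```python
from xml.etree import ElementTree as et

name_element = et.Element('data')
name_element.text = 'Naturalne mydło odświeżające'

# With a named codec (anything other than 'unicode'), tostring returns bytes.
raw = et.tostring(name_element, encoding='cp852', method='text')

try:
    raw.replace('ł', 'l')     # str arguments passed to a bytes method
    mixed_ok = True
except TypeError:
    mixed_ok = False          # Python 3 refuses to mix bytes and str

# Decode once at the boundary, then work with str only.
fixed = raw.decode('cp852').replace('ł', 'l')

print(type(raw).__name__, mixed_ok)
```

Decoding right after the call that produces bytes keeps the rest of the program free of encoding concerns.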

User

Answer 2 · edited 2024-09-30 16:40:56

If I understand your question correctly, you can use unidecode:

>>> from unidecode import unidecode
>>> unidecode("Naturalne mydło odświeżające")
'Naturalne mydlo odswiezajace'

Because unidecode expects a Unicode string, you may need to decode the cp852-encoded byte string first, with name.decode('cp852').
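
If installing a third-party package is not an option, a hand-written translation table is one possible substitute. This is a rough Python 3 sketch rather than a replacement for unidecode: the names `POLISH_TO_ASCII` and `strip_polish` are hypothetical, and the table covers only Polish letters. A table is used because ł has no Unicode decomposition, so accent stripping via unicodedata.normalize alone would leave it untouched.

```python
# Hypothetical helper: map Polish letters to their closest ASCII letters.
POLISH_TO_ASCII = str.maketrans(
    'ąćęłńóśźżĄĆĘŁŃÓŚŹŻ',
    'acelnoszzACELNOSZZ',
)

def strip_polish(raw_bytes, encoding='cp852'):
    """Decode the bytes first, then transliterate the resulting str."""
    return raw_bytes.decode(encoding).translate(POLISH_TO_ASCII)

# Stand-in for the cp852 bytes returned by tostring in the question.
name = 'Naturalne mydło odświeżające'.encode('cp852')
result = strip_polish(name)
print(result)
```

Characters outside the table pass through unchanged, so text in other languages would need additional entries or the full unidecode package.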
