Regular expressions in Python: re.split and patterns

Published 2024-10-02 10:34:49


I have a string like this:

string ='ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'

I want to extract =E2=82=AC and =20

But when I use

pattern ='(=\w\w)+'
a=re.split(pattern,string)

it returns

['ArcelorMittal invests ', '=AC', '87m in new process that cuts emissions', '=20', '']

2 Answers

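Based on your comment, I suggest using quopri.decodestring on the original string. There is no need to extract these sequences and decode them separately.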

>>> import quopri
>>> s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'
>>> quopri.decodestring(s)
'ArcelorMittal invests \xe2\x82\xac87m in new process that cuts emissions '
>>> print quopri.decodestring(s)
ArcelorMittal invests €87m in new process that cuts emissions
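
The session above uses Python 2 syntax. A rough Python 3 equivalent might look like the sketch below. It assumes the payload is UTF-8 (=E2=82=AC is the UTF-8 byte sequence for the euro sign); the variable names raw and text are only illustrative.

```python
import quopri

s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'

# In Python 3, decodestring returns bytes rather than text.
raw = quopri.decodestring(s.encode('ascii'))

# Assumption: the decoded bytes are UTF-8, so decode them to get a str.
text = raw.decode('utf-8')
print(text)
```

If the original message declares a different charset, that charset should be passed to decode instead of 'utf-8'.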

You can use re.findall

>>> s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'
>>> re.findall(r'(?:=\w{2})+', s)
['=E2=82=AC', '=20']
>>> 
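
As a possible explanation of the output in the question: when a capturing group is repeated with +, only its last repetition is kept, and re.split includes captured groups in its result. That would be why '=AC' appears instead of '=E2=82=AC'. The sketch below compares three variants; the names last_only, whole_run, and no_seps are only illustrative.

```python
import re

s = 'ArcelorMittal invests =E2=82=AC87m in new process that cuts emissions=20'

# Repeated capturing group: split() keeps only the last repetition of each run.
last_only = re.split(r'(=\w\w)+', s)

# Capture the whole run by wrapping a non-capturing group: full runs are kept.
whole_run = re.split(r'((?:=\w\w)+)', s)

# No capturing group at all: the separators are dropped from the result.
no_seps = re.split(r'(?:=\w\w)+', s)

print(last_only)
print(whole_run)
print(no_seps)
```

The trailing empty string in each list comes from the match at the very end of the input.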

If you want to remove these characters, use re.sub.

>>> re.sub(r'(?:=\w{2})+', '', s)
'ArcelorMittal invests 87m in new process that cuts emissions'
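
One caveat worth considering: \w matches any word character, so the pattern also matches an ordinary "=" followed by two letters. Quoted-printable escapes are an equals sign followed by two hexadecimal digits, so a stricter character class may be safer. The input string below is a made-up example, not taken from the question.

```python
import re

# Hypothetical input containing an ordinary "=" that is not an escape sequence.
s = 'key=value costs =E2=82=AC5'

# Loose pattern: also picks up "=va" from "key=value".
loose = re.findall(r'(?:=\w{2})+', s)

# Stricter pattern: only "=" followed by two uppercase hexadecimal digits.
strict = re.findall(r'(?:=[0-9A-F]{2})+', s)

print(loose)
print(strict)
```

Even the stricter pattern can match plain text such as "x=10", so decoding the whole string with quopri is likely the more robust route when the input is known to be quoted-printable.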
