Regex to match words with spaces in between

Published 2024-10-01 07:13:46


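I have this regex pattern ([^\s|:]+):\s*([^\s|:]+), which works well for name:jones|location:london|age:23. How can I extend the pattern to cover keys and values that contain spaces, or words combined with numbers? For example: full name:jones hardy|city and dialling code :london 0044|age:23 years

Expected result:

>>> ("full name", "jones hardy") ("city and dialling code", "london 0044") ("age", "23 years")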

3 Answers
>>> import re
>>> s = "full name:jones hardy|city and dialling code :london 0044|age:23 years"
>>> r = r"([^|:]+?)\s*:\s*([^|:]+)"
>>> re.findall(r, s)
[('full name', 'jones hardy'), ('city and dialling code', 'london 0044'), ('age', '23 years')]

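So the space at the end of 'city and dialling code ' is dropped: the lazy first group stops as soon as \s*: can match.

But a space in front of the mandatory '|' is not dropped, because the greedy second group [^|:]+ also matches spaces. If the input contains 'jones hardy |', the value comes back as 'jones hardy ', with the space still at the end.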
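The trailing-space case can be reproduced with a short script. This is a minimal sketch: the input string is an assumption (the question's string with a space added before the first `|`), not the example from the original answer.

```python
import re

# Hypothetical input: the question's string with a space added before the first '|'.
s = "full name:jones hardy |city and dialling code :london 0044|age:23 years"
r = r"([^|:]+?)\s*:\s*([^|:]+)"

pairs = re.findall(r, s)
print(pairs)

# The key is trimmed by the lazy group, but the value keeps its trailing space.
first_key, first_value = pairs[0]
print(repr(first_key), repr(first_value))
```

The key 'city and dialling code' comes back trimmed, while the value 'jones hardy ' still ends with a space.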

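Edit

r"\s*([\w\s]+?)\s*:\s*([\w\s]+?)\s*(?:\||$)" removes all whitespace at the start and end of each captured string (note that [\w\s] restricts keys and values to word characters and whitespace):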

>>> s
'  full name: jones hardy | city and dialling code :london 0044|age:23 years'
>>> r=r"\s*([\w\s]+?)\s*:\s*([\w\s]+?)\s*(?:\||$)"
>>> re.findall(r, s)
[('full name', 'jones hardy'), ('city and dialling code', 'london 0044'), ('age', '23 years')]
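To make the pieces of that pattern easier to follow, here is the same regex written with `re.VERBOSE` so that each part carries a comment. The layout and the name `pattern` are my own; the regex itself should be equivalent to the one above.

```python
import re

# Same pattern as above, spread out with re.VERBOSE so each part can be commented.
pattern = re.compile(r"""
    \s*             # skip whitespace before the key
    ([\w\s]+?)      # key: word characters and whitespace, as few as possible
    \s* : \s*       # the colon, with optional whitespace on either side
    ([\w\s]+?)      # value: same character set, as few as possible
    \s*             # skip whitespace after the value
    (?: \| | $ )    # a pipe or the end of the string closes the pair
""", re.VERBOSE)

s = "  full name: jones hardy | city and dialling code :london 0044|age:23 years"
pairs = pattern.findall(s)
print(pairs)
```

Because both groups are lazy and are followed by `\s*`, the surrounding whitespace ends up outside the captured text.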

This looks like a case for re.split.

>>> import re
>>> s = "full name:jones hardy|city and dialling " \
...     "code :london 0044|age:23 years"
>>> # raw strings avoid invalid-escape warnings for \s and \|
>>> [tuple(re.split(r'\s*:\s*', t))
...  for t in re.split(r'\s*\|\s*', s)]
[('full name', 'jones hardy'),
 ('city and dialling code', 'london 0044'),
 ('age', '23 years')]
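If a lookup table is more convenient than a list of tuples, the split approach can feed `dict` directly. This is a sketch under two assumptions of mine: every field contains at least one colon, and only the first colon in a field separates key from value (hence `maxsplit=1`). The name `record` is illustrative.

```python
import re

s = "full name:jones hardy|city and dialling code :london 0044|age:23 years"

# Split into fields on '|', then split each field on its first ':' only.
# A field without any ':' would make dict() raise ValueError.
record = dict(
    re.split(r"\s*:\s*", field.strip(), maxsplit=1)
    for field in re.split(r"\s*\|\s*", s)
)
print(record)
```

With `maxsplit=1`, a value that itself contains a colon stays in one piece instead of producing a three-element list.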

Simplify the regex so that it captures everything except the delimiters, which in this case are the colon : and the pipe |

>>> import re
>>> r = r"([^:|]+)\s*:\s*([^:|]+)"
>>> st = "full name:jones hardy|city and dialling code :london 0044"
>>> re.findall(r, st)
[('full name', 'jones hardy'), ('city and dialling code ', 'london 0044')]
>>> st = "name:jones|location:london|age:23"
>>> re.findall(r, st)
[('name', 'jones'), ('location', 'london'), ('age', '23')]
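Note that the greedy groups keep the space in 'city and dialling code '. One way to tidy this up, sketched here as an optional post-processing step rather than part of the answer above, is to call `str.strip` on each captured string.

```python
import re

r = r"([^:|]+)\s*:\s*([^:|]+)"
st = "full name:jones hardy|city and dialling code :london 0044"

# Strip leftover whitespace from both the key and the value of each pair.
pairs = [(key.strip(), value.strip()) for key, value in re.findall(r, st)]
print(pairs)
```

This keeps the regex simple and leaves the whitespace handling to plain string methods.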
