Including the half-space (\u200c) in a regular expression

Posted 2024-10-01 05:01:45


With a regex, for example in Python, I define:

WORD = re.compile(r'\w+')

and then use it:

WORD.findall('This is a test')

I get:

['This', 'is', 'a', 'test']

Now I want the half-space character, \u200c (the zero-width non-joiner), to be treated as an ordinary alphanumeric character, so that if I have:

w = 'This\u200cis a test'

then running WORD.findall(w) should give me:

['This\u200cis', 'a', 'test']

How can I do that?


1 answer

Anonymous user

#1 · Posted 2024-10-01 05:01:45

Use a character class that includes \u200c in addition to \w (Python 3.x+):

>>> import re
>>> re.findall(r'[\u200c\w]+', 'This\u200cis a test')
['This\u200cis', 'a', 'test']
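The same character class can be compiled once and reused, as with the WORD pattern from the question. This is a minimal Python 3 sketch; reusing the name WORD for the widened pattern is an assumption for illustration, not something the answer itself shows.

```python
import re

# Assumed redefinition of the question's WORD pattern:
# \w plus U+200C (zero-width non-joiner), so the half-space no longer splits a word.
WORD = re.compile(r'[\w\u200c]+')

w = 'This\u200cis a test'
words = WORD.findall(w)
print(words)  # ['This\u200cis', 'a', 'test']
```

Because \u200c sits inside the brackets, it only extends what counts as a word character; spaces and punctuation still split words as before.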

In Python 2.x, you need to use unicode strings:

>>> re.findall(ur'[\u200c\w]+', u'This\u200cis a test')
[u'This\u200cis', u'a', u'test']