Python is not interpreting my regex correctly

Posted 2024-09-28 22:19:53

I have this code:

import re

regex = re.compile("(.+?)\1+")
results = regex.findall("FFFFFFF")
print(results)

The expected result is:

['F']

According to regexpal, the regular expression does what it should (it finds the shortest repeating substring). But when I try the same regex in Python, the result is []. Why is that?


3 Answers

Use a raw string:

regex = re.compile(r"(.+?)\1+")

Or escape the backslash in an ordinary (non-raw) string:

regex = re.compile("(.+?)\\1+")
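As a quick check, the raw-string spelling and the escaped-backslash spelling should produce the same pattern text. The sketch below assumes Python 3; the variable names `raw` and `escaped` are only illustrative.

```python
import re

# Two spellings of the same pattern: a raw string, and a normal
# string in which the backslash itself is escaped.
raw = r"(.+?)\1+"
escaped = "(.+?)\\1+"

# Both literals hold the same eight characters.
print(raw == escaped)  # True

# Either one finds the shortest repeating unit.
print(re.findall(raw, "FFFFFFF"))      # ['F']
print(re.findall(escaped, "FFFFFFF"))  # ['F']
```

Note that combining both fixes, an `r` prefix plus a doubled backslash, would make the regex look for a literal backslash followed by the digit 1, which is not what is wanted here.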

Try:

regex = re.compile(r"(.+?)\1+")

Why didn't it work? You can see the difference by printing both strings:

print(r"(.+?)\1+")
print("(.+?)\1+")
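Printing the non-raw string shows an invisible control character, so `repr()` may make the difference easier to see. A minimal sketch, assuming Python 3 (the names `plain` and `raw` are illustrative):

```python
import re

plain = "(.+?)\1+"   # "\1" is an octal escape: the single character U+0001
raw = r"(.+?)\1+"    # backslash and "1" are kept as two characters

print(repr(plain))  # '(.+?)\x01+'
print(repr(raw))    # '(.+?)\\1+'
print(len(plain), len(raw))  # 7 8

# The non-raw pattern therefore looks for a run of U+0001 characters,
# not for a repeat of group 1, so it finds nothing in "FFFFFFF".
print(re.findall(plain, "FFFFFFF"))     # []
print(re.findall(plain, "F\x01\x01"))   # ['F']
```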

See also: What does preceding a string literal with "r" mean?

Use a raw string:

>>> re.findall("(.+?)\1+", "FFFFFFF")
[]
>>> re.findall(r"(.+?)\1+", "FFFFFFF")
['F']
>>> 

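In raw string literals, i.e. string literals prefixed with 'r', backslashes are treated as literal characters. Otherwise, a backslash starts an escape sequence.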

Quoting the `re` documentation (Regular expression operations):

Regular expressions use the backslash character ('\') to indicate special forms or to allow special characters to be used without invoking their special meaning. ...

The solution is to use Python’s raw string notation for regular expression patterns; backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline. Usually patterns will be expressed in Python code using this raw string notation.
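The claim in the quoted passage about `r"\n"` versus `"\n"` can be checked directly. This is just a small illustration, assuming Python 3; the names `raw_newline` and `real_newline` are made up for the example.

```python
raw_newline = r"\n"   # two characters: a backslash and the letter n
real_newline = "\n"   # one character: a newline

print(len(raw_newline), len(real_newline))  # 2 1
print(list(raw_newline))                    # ['\\', 'n']
```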
