regex: strings with optional parts

Test if a column field is larger than a given value
    This function can also be called as an operator using the '>' syntax

    Arguments:
        - DbColumn self
        - string or float value: the value to compare to
            in case of string: lexicographic comparison
            in case of float: numeric comparison
    Returns:
        DbWhere object

2 answers

User

#1 · edited 2024-05-06 01:29:57

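If you want to match the text after the optional Arguments: and Returns: sections, and you don't want to use (?P<name>...) to name your capture groups, you can also use (?:...), which is the non-capturing version of regular parentheses.

The regex would look like this: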

m = re.search('^(.*?)(?:Arguments:(.*?))?(?:Returns:(.*?))?$', doc, re.DOTALL)
#                     ^^                  ^^

According to the Python 3 documentation:

(?:...)
A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
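
As a rough illustration of what that means for group numbering, here is a minimal sketch. The sample `doc` string is a shortened, made-up stand-in for the docstring in the question, and the variable names are only for illustration.

```python
import re

# Hypothetical sample docstring, modeled on the one in the question.
doc = (
    "Test if a column field is larger than a given value\n"
    "\n"
    "    Arguments:\n"
    "        - DbColumn self\n"
    "    Returns:\n"
    "        DbWhere object"
)

pattern = r'^(.*?)(?:Arguments:(.*?))?(?:Returns:(.*?))?$'
m = re.search(pattern, doc, re.DOTALL)

# Only the three plain (...) groups are numbered; the (?:...) wrappers are not.
# A section that is absent would leave its group as None.
description, arguments, returns = (g.strip() if g else None for g in m.groups())

print(description)
print(arguments)
print(returns)
```

Because the wrappers around `Arguments:` and `Returns:` do not capture, `m.groups()` holds exactly three entries, and the section labels themselves are not part of the captured text.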

User

#2 · edited 2024-05-06 01:29:57

Try:

re.search('^(.*?)(Arguments:.*?)?(Returns:.*)?$', s, re.DOTALL)

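Just make the second and third groups optional by appending a ? to each, and make the quantifiers of the first two groups non-greedy by (again) appending a ? to them (yes, confusing).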

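Also, if you use the non-greedy modifier on the first group of the pattern, it will match the shortest possible substring, which for .* is the empty string. You can overcome this by adding the end-of-line character ($) at the end of the pattern. Without re.MULTILINE, $ only matches at the end of the string (or just before a trailing newline), so the first group is forced to match as few characters as possible while still letting the whole pattern reach the end: the whole string when there are no Arguments and Returns sections, and everything before those sections when they are present.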
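
A small sketch of that difference, using a made-up input string `s` (not the one from the question):

```python
import re

s = "Some description\nReturns:\n    a value"  # hypothetical input

# Without the trailing $, the lazy first group is satisfied by the empty
# string and both optional groups are simply skipped at position 0.
no_anchor = re.search(r'^(.*?)(Arguments:.*?)?(Returns:.*)?', s, re.DOTALL)
print(repr(no_anchor.group(1)))   # ''

# With $, the match has to reach the end of the string, so the first group
# grows until the optional sections (or the end) can follow it.
with_anchor = re.search(r'^(.*?)(Arguments:.*?)?(Returns:.*)?$', s, re.DOTALL)
print(repr(with_anchor.group(1)))
print(repr(with_anchor.group(3)))
```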

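Edit: OK, if you just want to capture the text after the Arguments: and Returns: tokens, you'll have to tuck in a few more groups. We won't be using all of the groups, so naming them with the (?P<name>...) notation (another question mark, argh!) starts to make sense: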

>>> m = re.search('^(?P<description>.*?)(Arguments:(?P<arguments>.*?))?(Returns:(?P<returns>.*))?$', s, re.DOTALL)
>>> m.groupdict()['description']
"Test if a column field is larger than a given value\n    This function can also be called as an operator using the '>' syntax\n\n    "
>>> m.groupdict()['arguments']
'\n        - DbColumn self\n        - string or float value: the value to compare to\n            in case of string: lexicographic comparison\n            in case of float: numeric comparison\n    '
>>> m.groupdict()['returns']
'\n        DbWhere object'
>>>
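
If you need this in more than one place, the named-group pattern above could be wrapped in a small helper. This is only a sketch: `split_docstring` and `DOC_RE` are hypothetical names, and stripping the surrounding whitespace is an assumption about what you want back.

```python
import re

# Same pattern as in the session above, compiled once and written as a
# raw string split over several lines for readability.
DOC_RE = re.compile(
    r'^(?P<description>.*?)'
    r'(Arguments:(?P<arguments>.*?))?'
    r'(Returns:(?P<returns>.*))?$',
    re.DOTALL,
)

def split_docstring(doc):
    """Hypothetical helper: split doc into its sections, whitespace-trimmed."""
    # With re.DOTALL the lazy first group can always extend to the end,
    # so search() finds a match for any input string.
    m = DOC_RE.search(doc)
    # Sections that are absent come back as None from groupdict().
    return {name: text.strip() if text is not None else None
            for name, text in m.groupdict().items()}

parts = split_docstring("Compare a column.\n\n    Returns:\n        DbWhere object")
print(parts)
```

Returning None for a missing section, rather than an empty string, keeps "section absent" distinguishable from "section present but empty".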
