Regex: match any number of parentheses preceding an arbitrary word

Posted 2024-10-04 09:31:31


I have a set of strings of the following forms, where X stands for an arbitrary word:

This is a string ((X.address)) test
This is a string ((X address)) test
This is a string (X address) test
This is a string (X.address) test

Once X.address or X address is found (including the preceding parentheses), I want to remove everything from that point to the end of the string:

This is a string
This is a string
This is a string
This is a string

This is my starting point:

import re

regex = r"\(X.address"
s = "This is a string ((X.address)) test"
re.split(regex, s)[0]

>> 'This is a string ('

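It works, but I need to generalize it so that it searches for an arbitrary word instead of X and accounts for one or more parentheses preceding the word.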


3 answers

You can use

re.sub(r'\s*\(+[^()]*\baddress.*', '', s, flags=re.S)

Details

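  • \s* - zero or more whitespace characters
  • \(+ - one or more ( characters
  • [^()]* - zero or more characters other than ( and )
  • \b - a word boundary (address cannot be preceded by another letter, digit, or underscore)
  • address - the literal word address
  • .* - any zero or more characters up to the end of the string (the re.S flag lets . match newlines as well)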
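
As a rough illustration of the breakdown above, the same pattern can be written with re.VERBOSE so that each piece carries its own comment. The names TRAILING_ADDRESS and strip_address_suffix are hypothetical and are not part of the original answer; this is a sketch, not the only way to package the pattern.

```python
import re

# Same pattern as in the answer, spelled out piece by piece.
# TRAILING_ADDRESS and strip_address_suffix are illustrative names only.
TRAILING_ADDRESS = re.compile(
    r"""
    \s*         # zero or more whitespace characters
    \(+         # one or more opening parentheses
    [^()]*      # zero or more characters other than ( and )
    \baddress   # the word "address", preceded by a word boundary
    .*          # everything up to the end of the string
    """,
    re.VERBOSE | re.DOTALL,
)


def strip_address_suffix(text):
    """Drop the parenthesised address part and everything after it."""
    return TRAILING_ADDRESS.sub("", text, count=1)
```

Strings that contain no parenthesised address are returned unchanged, because the pattern needs at least one opening parenthesis to match.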

See the Python demo:

import re
strs = [ 'This is a string ((X.address)) test', 'This is a string ((X address)) test', 'This is a string (X address) test', 'This is a string (X.address) test', 'This is a string ((X and Y and Z address)) test' ]
for s in strs:
    print(s, '=>', re.sub(r'\s*\(+[^()]*\baddress.*', '', s, flags=re.S))

Output:

This is a string ((X.address)) test => This is a string
This is a string ((X address)) test => This is a string
This is a string (X address) test => This is a string
This is a string (X.address) test => This is a string
This is a string ((X and Y and Z address)) test => This is a string

You can match the part you want to keep with .+(?=\s\(+X(?:\.|\s)address)

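Explanation:

.+ - matches one or more characters

(?=...) - positive lookahead

\s - a whitespace character

\(+ - matches one or more ( characters

X - matches X literally

(?:...) - non-capturing group

\.|\s - matches a dot . or a whitespace character

address - matches address literally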
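
A minimal sketch of how the lookahead approach might be generalized to an arbitrary word, assuming "word" means a run of word characters (\w+) in place of the literal X. It also swaps in a lazy .+? and \s* so the match stops at the first parenthesised address and excludes the whitespace before it. The names PREFIX and text_before_address are hypothetical.

```python
import re

# Assumption: "any word" is \w+; the original answer matches a literal X.
# The lookahead checks for "(word.address" or "(word address" without consuming it.
PREFIX = re.compile(r".+?(?=\s*\(+\w+[.\s]address)", re.DOTALL)


def text_before_address(text):
    """Return the text preceding the parenthesised address, or the input if none is found."""
    m = PREFIX.match(text)
    return m.group(0) if m else text
```

Because the lookahead allows only a single word before address, a string such as ((X and Y and Z address)) is left as-is by this variant.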


Use

import re

regex = r"(This is a string)\s+\(+.+\)"
s = "This is a string ((X.address)) test"
re.split(regex, s)[1]
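
To see why index 1 is the right element here, it may help to look at the whole list that re.split returns. When the pattern contains a capturing group, the captured text is included in the result between the pieces on either side of the match. The variable name parts is illustrative only.

```python
import re

regex = r"(This is a string)\s+\(+.+\)"
s = "This is a string ((X.address)) test"

# With a capturing group, re.split returns:
# [text before the match, captured group, text after the match]
parts = re.split(regex, s)
print(parts)  # ['', 'This is a string', ' test']
```

Note that this pattern hard-codes the prefix "This is a string", so it only applies when the leading text is known in advance.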
