Removing parentheses from extracted regex matches

Posted 2024-09-28 20:54:39

Here is my input and the result I get from the regex:

import regex  # third-party module (pip install regex), not the standard library re

temp2 = '(LEFT-WALL)(who)(is.v)(the)(di(rect)or.n)(of)(Inceptio)(RIGHT-WALL)'
print(regex.findall(r'\([^\)\(]*+(?:(?R)[^\)\(]*)*+\)', temp2))

Result:

  ['(LEFT-WALL)', '(who)', '(is.v)', '(the)', '(di(rect)or.n)', '(of)', '(Inceptio)', '(RIGHT-WALL)']

I want a result like this instead:

 ['LEFT-WALL', 'who', 'is.v', 'the', 'di(rect)or.n', 'of', 'Inceptio', 'RIGHT-WALL']

How should the regex be changed to get this?


3 Answers

I don't think the sample string you provided needs a regex at all:

temp2 = '(LEFT-WALL)(who)(is.v)(the)(di(rect)or.n)(of)(Inceptio)(RIGHT-WALL)'
if temp2.startswith("(") and temp2.endswith(")"):
    # Drop the outermost parentheses, then split on the ")(" between groups
    print(temp2[1:-1].split(")("))

Output of the sample program:

['LEFT-WALL', 'who', 'is.v', 'the', 'di(rect)or.n', 'of', 'Inceptio', 'RIGHT-WALL'] 
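
The split-based approach assumes that every group is directly adjacent to the next, so that ")(" only ever appears between top-level groups. If that assumption might not hold, a depth-counting scan is one possible alternative. The sketch below is not from the original answers, and the function name top_level_groups is hypothetical.

```python
def top_level_groups(text):
    """Return the contents of each top-level (...) group in text."""
    groups = []
    depth = 0      # current nesting depth
    start = None   # index where the current top-level group's content begins
    for index, char in enumerate(text):
        if char == "(":
            if depth == 0:
                start = index + 1  # content begins just after the outer "("
            depth += 1
        elif char == ")":
            if depth == 0:
                raise ValueError("unbalanced ')' at index %d" % index)
            depth -= 1
            if depth == 0:
                groups.append(text[start:index])
    if depth != 0:
        raise ValueError("unbalanced '(' in input")
    return groups


temp2 = '(LEFT-WALL)(who)(is.v)(the)(di(rect)or.n)(of)(Inceptio)(RIGHT-WALL)'
print(top_level_groups(temp2))
```

Characters outside any group (for example a space between two groups) are simply ignored by this sketch, and unbalanced input raises ValueError rather than returning a partial result.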

As an alternative that avoids regex, you can do the job with the str.split() and str.strip() methods:

>>> [i.strip('()') for i in temp2.split(')(')]
['LEFT-WALL', 'who', 'is.v', 'the', 'di(rect)or.n', 'of', 'Inceptio', 'RIGHT-WALL']
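
One thing to watch for with strip('()'): it removes every leading and trailing parenthesis, not just one. The made-up input below (not from the question) ends with a nested group and compares stripping with slicing off exactly one character at each end.

```python
nested_at_end = '(a)(b(c))'

# strip('()') removes all leading/trailing parentheses of each piece,
# including the inner ")" that belongs to the token itself.
stripped = [piece.strip('()') for piece in nested_at_end.split(')(')]

# Slicing off exactly one character at each end keeps the inner ")".
sliced = nested_at_end[1:-1].split(')(')

print(stripped)  # ['a', 'b(c']
print(sliced)    # ['a', 'b(c)']
```

For the string in the question both variants give the same list, because no token there begins or ends with a parenthesis of its own.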

Or, if you prefer regex, you can use look-arounds:

>>> import re
>>> re.findall(r'(?<=\()(.*?)(?=\)\(|\)$)', temp2)
['LEFT-WALL', 'who', 'is.v', 'the', 'di(rect)or.n', 'of', 'Inceptio', 'RIGHT-WALL']

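Note that the logic is simple: you only need to match the text between an opening parenthesis "(" and a closing parenthesis that is either immediately followed by an opening parenthesis, ")(", or sits at the end of the string.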

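You need to match the string between "(" and ")(", or between ")(" and ")". This avoids matching strings like '(rect)' inside '(di(rect)or.n)'. You can do this with lookaround assertions, because they do not consume any of the string being searched.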
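
To illustrate why the closing delimiter matters, the sketch below compares a naive lazy pattern, which stops at the first ")", with a look-around pattern that only stops at ")(" or at a final ")". The shortened sample string and the variable names are illustrative assumptions.

```python
import re

sample = '(the)(di(rect)or.n)(of)'

# Naive: lazily match up to the first ")" after each "(".
naive = re.findall(r'\((.*?)\)', sample)

# Look-around: start after "(", stop only before ")(" or a ")" at the end.
lookaround = re.findall(r'(?<=\()(.*?)(?=\)\(|\)$)', sample)

print(naive)       # ['the', 'di(rect', 'of']
print(lookaround)  # ['the', 'di(rect)or.n', 'of']
```

Because the look-arounds only test the surrounding text without consuming it, the ")(" between two groups is still available as context for the next match.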

Lookahead assertion

(?=...) Matches if ... matches next, but doesn’t consume any of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it’s followed by 'Asimov'.
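
A quick check of that documentation example using the standard re module; the 'Isaac Newton' counterexample is added here purely for illustration.

```python
import re

# The lookahead requires 'Asimov' to follow, but does not include it in the match.
match = re.search(r'Isaac (?=Asimov)', 'Isaac Asimov')
matched_text = match.group(0)  # 'Isaac ' (with the trailing space)

# No match when something else follows.
no_match = re.search(r'Isaac (?=Asimov)', 'Isaac Newton')

print(repr(matched_text))  # 'Isaac '
print(no_match)            # None
```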

Positive lookbehind assertion

(?<=...) Matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion. (?<=abc)def will find a match in abcdef, since the lookbehind will back up 3 characters and check if the contained pattern matches.
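
The same kind of check for the lookbehind example, again with the standard re module; the 'xyzdef' counterexample is an illustrative addition.

```python
import re

# The lookbehind requires 'abc' immediately before, but the match itself is only 'def'.
match = re.search(r'(?<=abc)def', 'abcdef')
matched_text = match.group(0)
matched_span = match.span()

# No match when 'def' is preceded by something else.
no_match = re.search(r'(?<=abc)def', 'xyzdef')

print(matched_text, matched_span)  # def (3, 6)
print(no_match)                    # None
```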

In the code below, I use the re.VERBOSE flag to make the pattern more readable:

import re

temp2 = '(LEFT-WALL)(who)(is.v)(the)(di(rect)or.n)(of)(Inceptio)(RIGHT-WALL)'

pattern = re.compile(r"""
    (?<=  \(  )   .+?  (?=  \)\(  )   # Matches string after a '(' and before a ')('
    |                                 # or...
    (?<=  \)\(  )   .+?  (?=  \)  )   # Matches string after a ')(' and before a ')'
""", re.VERBOSE)

print(pattern.findall(temp2))
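
One possible caveat: in the second alternative, the lazy .+? stops at the first ")" it finds, so a final group that itself contains nested parentheses would be cut short. The sketch below is a variation, not part of the original answer. It anchors the final ")" to the end of the string with "$" and compares both patterns on a made-up input; the names unanchored, anchored, and sample are hypothetical.

```python
import re

unanchored = re.compile(r"""
    (?<= \( )   .+? (?= \)\( )   # after '(' and before ')('
    |
    (?<= \)\( ) .+? (?= \) )     # after ')(' and before the first ')'
""", re.VERBOSE)

anchored = re.compile(r"""
    (?<= \( )   .+? (?= \)\( )   # after '(' and before ')('
    |
    (?<= \)\( ) .+? (?= \) $ )   # after ')(' and before the final ')'
""", re.VERBOSE)

sample = '(a)(b(c)d)'
print(unanchored.findall(sample))  # ['a', 'b(c']
print(anchored.findall(sample))    # ['a', 'b(c)d']
```

On the string from the question both patterns return the same list, since its last group, (RIGHT-WALL), has no nested parentheses.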
