Python: removing matching braces with a regular expression

Posted 2024-10-04 15:26:28


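I have a LaTeX file in which a lot of text is marked with \red{}, but there can also be braces inside \red{}, for example \red{here is \underline{underlined} text}. I want to remove the red color, and after some googling I wrote this Python script: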

import os, re, sys
#Start program in terminal with
#python RedRemover.py filename
#sys.argv[1] then has the value filename
ifn = sys.argv[1]
#Open file and read it
f = open(ifn, "r")
c = f.read() 
#The whole file content is now stored in the string c
#Remove occurrences of \red{...} in c
c=re.sub(r'\\red\{(?:[^\}|]*\|)?([^\}|]*)\}', r'\1', c)
#Write c into new file
Nf=open("RedRemoved_"+ifn,"w")
Nf.write(c)

f.close()
Nf.close()

But this turns

\red{here is \underline{underlined} text}

into

here is \underline{underlined text}

which is not what I want. I want

here is \underline{underlined} text


2 answers

I think you need to keep the curly braces. Consider this example: \red{\bf test}

import re

c = r'\red{here is \underline{underlined} text} and \red{more}'
d = c

# this may be less painful and sufficient, and even more correct:
# drop only the \red command and keep its braces as a plain group
c = re.sub(r'\\red\b', r'', c)
print("1ST:", c)

# if you want to get rid of the curlies (handles one level of nested braces):
d = re.sub(r'\\red\{([^{}]*(?:\{[^{}]*\}[^{}]*)*)\}', r'\1', d)
print("2ND:", d)

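This gives:

1ST: {here is \underline{underlined} text} and {more}
2ND: here is \underline{underlined} text and more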
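To illustrate why keeping the curly braces can matter, here is a small sketch using the \red{\bf test} example. The sample string and the variable names are made up for illustration. It relies on \bf being a switch that stays active until the enclosing group closes, so removing the braces would probably let the bold face leak into the text that follows.

```python
import re

# Hypothetical sample: \bf stays active until its enclosing group closes.
sample = r'\red{\bf test} and normal text'

# Dropping only the \red command keeps the group, so \bf stays scoped.
kept = re.sub(r'\\red\b', '', sample)

# Dropping the braces as well removes the group that limited \bf.
stripped = re.sub(r'\\red\{([^{}]*)\}', r'\1', sample)

print(kept)
print(stripped)
```

In the first result the braces still delimit the bold text; in the second, nothing ends the \bf switch any more.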

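You cannot match an arbitrary depth of nested braces with the re module, because it does not support recursion. To work around this, you can use the newer regex module: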

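import regex  # third-party package: pip install regex

c = r'\red{here is \underline{underlined} text}'

# group 1 matches one balanced {...}; group 2 is its content
c = regex.sub(r'\\red(\{((?>[^{}]+|(?1))*)\})', r'\2', c)
print(c)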

where (?1) is a recursive call to capture group 1.
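If installing a third-party module is not an option, the same result can probably be reached with the standard library alone by counting braces instead of using a recursive pattern. The following is only a rough sketch, not part of either answer: `strip_command` is a hypothetical helper name, and it assumes that the opening brace follows \red directly and that the text contains no escaped braces such as \{ or \}.

```python
def strip_command(text, command=r'\red'):
    # Hypothetical helper: removes command{...} wrappers by counting braces,
    # so any nesting depth is handled without a recursive regex.
    opener = command + '{'
    out = []
    i = 0
    while i < len(text):
        if text.startswith(opener, i):
            depth = 1
            j = i + len(opener)
            # Walk forward until the brace opened by the command is closed.
            while j < len(text) and depth:
                if text[j] == '{':
                    depth += 1
                elif text[j] == '}':
                    depth -= 1
                j += 1
            if depth == 0:
                # Recurse so wrappers nested inside the content are removed too.
                out.append(strip_command(text[i + len(opener):j - 1], command))
                i = j
                continue
        # Not a wrapper (or unbalanced braces): copy the character unchanged.
        out.append(text[i])
        i += 1
    return ''.join(out)


c = r'\red{here is \underline{underlined} text} and \red{a \red{b}}'
print(strip_command(c))
```

In the original script, this function could replace the re.sub call on the file content c. Unbalanced input is left untouched rather than guessed at.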
