Using a Python dictionary for text substitution in re.sub

import re

dict = {}
in_file = open("in.txt", "r")
outfile = open("out.txt", "w")
File1 = in_file.read()
infile1 = File1.replace("\n", " ")

for mo in re.finditer(r'<p><su>(\d+)</su>(.*?)</p>',infile1):
    dict[mo.group(1)] = mo.group(2)

subval = re.sub(r'<p><su>(\d+)</su>(.*?)</p>','',infile1)
subval = re.sub('<su>(\d+)</su>',dict[\\1], subval)
outfile.write(subval)

1 answer

User

#1 · Posted on 2024-06-28 20:35:19

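First, don't name your dictionary dict, or you will shadow the built-in dict type. Second, \\1 does not work outside of a string, which is why you get a syntax error. I think the best approach is to take advantage of str.format: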

import re

# store the substitutions
subs = {}

# read the data; the with block closes the file automatically
with open("in.txt", "r") as in_file:
    contents = in_file.read().replace("\n", " ")

# save some regexes for later
ftnt_tag = re.compile(r'<ftnt>.*</ftnt>')
var_tag = re.compile(r'<p><su>(\d+)</su>(.*?)</p>')

# pull the ftnt tag out
ftnt = ftnt_tag.findall(contents)[0]
contents = ftnt_tag.sub('', contents)

# pull the su
for match in var_tag.finditer(ftnt):
    # added s so they aren't numbers, useful for format
    subs["s" + match.group(1)] = match.group(2)

# replace <su>1</su> with {s1}
contents = re.sub(r"<su>(\d+)</su>", r"{s\1}", contents)

# now that the <su> are the keys, we can just use str.format
# (this assumes the text contains no other literal { or } characters)
with open("out.txt", "w") as out_file:
    out_file.write(contents.format(**subs))
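
As an alternative to str.format, re.sub also accepts a function as the replacement, which allows the dictionary lookup to happen once per match. The following is only a minimal sketch, not part of the original answer: the function name inline_footnotes and the sample string are hypothetical, and it assumes the footnote definitions appear as <p><su>N</su>text</p> as in the question, without the <ftnt> wrapper. One possible advantage is that literal braces in the text are left alone, whereas str.format would try to interpret them.

```python
import re

def inline_footnotes(text):
    """Replace each <su>N</su> marker with the text of footnote N."""
    footnote = re.compile(r"<p><su>(\d+)</su>(.*?)</p>")
    # Map footnote number (as a string) to footnote text.
    subs = {m.group(1): m.group(2) for m in footnote.finditer(text)}
    # Remove the footnote definitions from the body.
    body = footnote.sub("", text)
    # A callable replacement receives each match object, so the
    # dictionary lookup is done per match; unknown numbers are kept as-is.
    return re.sub(r"<su>(\d+)</su>",
                  lambda m: subs.get(m.group(1), m.group(0)),
                  body)

# Hypothetical sample input, for illustration only.
sample = ("Alpha <su>1</su> beta <su>2</su>."
          "<p><su>1</su>(first note)</p><p><su>2</su>(uses {braces})</p>")
print(inline_footnotes(sample))
```

Using subs.get with the original match as the fallback means a marker without a matching footnote stays in the output instead of raising a KeyError.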
