In the input below, I am trying to replace the numbers with '' and \n with ' ', respectively.
THE SONNETS\n\n 1\n\nFrom fairest creatures we desire increase,\nThat thereby beauty’s rose might never die,\nBut as the riper should by time decease,\nHis
she hies, 1189\nAnd yokes her silver doves; by whose swift aid\nTheir mistress mounted through the empty skies,\nIn her light chariot quickly is convey’d; 1192\n Holding their course to Paphos, where their queen\n Means to immure herself and not be seen.\n'
input_var is read from a file containing the content above:
file_name = 'sample.txt'
file = open(folder+file_name, mode='r', encoding='utf8')
input_var = file.read()
file.close()
The data in the file is
THE SONNETS
1
From fairest creatures we desire increase,
That thereby beauty’s rose might never die,
But as the riper should by time decease,
His
she hies, 1189
And yokes her silver doves; by whose swift aid
Their mistress mounted through the empty skies,
In her light chariot quickly is convey’d; 1192
Holding their course to Paphos, where their queen
Means to immure herself and not be seen.
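To identify the numbers, I used the regular expression [\s]{3,}\d{1,}\\n (there must be at least 3 spaces before the number; for a test of the regular expression, see this link).
I used the code below to replace the regular expression and \n. Both approaches come from answers I found on Stack Overflow.
Code 1-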
# Remove the numbers in sonnets and at the end of lines
pattern = {r'[\s]{3,}\d{1,}\\n' : '',
r'\\n' : ' '
}
regex = re.compile('|'.join(map(re.escape, pattern.keys( ))))
output_var = regex.sub(lambda match: pattern[match.group(0)], input_var)
Code 2-
rep = dict((re.escape(k), v) for k, v in pattern.items())
pattern_test = re.compile("|".join(rep.keys()))
output_var = pattern_test.sub(lambda m: rep[re.escape(m.group(0))], input_var)
Code 3-
for i, j in pattern.items():
    output_var = input_var.replace(i, j)
where input_var contains the text above. None of the three replaces anything.
I also tried
pattern = {r'[\s]{3,}\d{1,}\n' : '',
r'\n' : ' '
}
but it does not replace anything either, whereas
pattern = {'[\s]{3,}\d{1,}\n' : '',
'\n' : ' '
}
replaces only \n, and the output is as follows
THE SONNETS 1 From fairest creatures we desire increase, That thereby beauty’s rose might never die, But as the riper should by time decease, His
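The regular expression in the dictionary is not recognized; I think it is being treated as a literal string rather than as a regular expression. How do I specify a regular expression in a dictionary? The answers I found on Stack Overflow use plain strings rather than regular expressions, like the answers given to this question.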
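That suspicion can be checked directly. The snippet below is a minimal sketch, assuming a short made-up sample string (`text`) rather than the real file; it compares the same dictionary key compiled with and without `re.escape`. The names `regex_key`, `escaped` and `unescaped` are illustrative only.

```python
import re

# Hypothetical sample: real newline characters, as produced by file.read().
text = "she hies,     1189\nAnd yokes her silver doves"

# The dictionary key, written as a regular expression.
regex_key = r"\s{3,}\d+\n"

# re.escape() backslash-escapes every special character, so the compiled
# pattern only matches the literal characters of the key itself.
escaped = re.compile(re.escape(regex_key))

# Compiled without escaping, the key behaves as a regular expression.
unescaped = re.compile(regex_key)

print(escaped.search(text))                   # no match in the sample text
print(repr(unescaped.search(text).group(0)))  # the spaces, number and newline
```

Under these assumptions, the escaped version finds nothing in the text, which is consistent with Code 1 and Code 2 replacing nothing. `str.replace` in Code 3 is also purely literal, and because each iteration starts again from `input_var`, only the last replacement survives.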
The expected result is
THE SONNETS From fairest creatures we desire increase, That thereby beauty’s rose might never die, But as the riper should by time decease, His
she hies,And yokes her silver doves; by whose swift aid Their mistress mounted through the empty skies, In her light chariot quickly is convey’d; Holding their course to Paphos, where their queen Means to immure herself and not be seen. '
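Here is an example you can run (if you have bs4 etc.). I see you have already had help with the numbering and the regular expression, but this may help with understanding line returns and so on (I am not entirely sure what the goal is). I could not find a source on the web with numbering similar to yours, so unfortunately this is not like for like. If nothing else, it may be food for thought.
Output: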
You need to run re.sub in a loop, but make sure output_var is initialized to the value of input_var. See the Python demo online:
Output:
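The loop described in this answer can be sketched roughly as follows. This is not the original demo, only a minimal illustration under stated assumptions: the sample string is a shortened, made-up stand-in for the file contents, and the patterns use a real newline (`\n`) rather than the two characters `\\n`, since `file.read()` returns actual newline characters.

```python
import re

# Hypothetical, shortened stand-in for the contents of sample.txt.
input_var = (
    "she hies,     1189\n"
    "And yokes her silver doves;     1192\n"
    "   Holding their course\n"
)

# Keys are regular expressions and are NOT passed through re.escape().
# Dictionaries keep insertion order (Python 3.7+), so the number pattern
# runs before the plain newline pattern.
pattern = {
    r"\s{3,}\d+\n": "",   # 3+ whitespace characters, a number, a newline
    r"\n": " ",           # any remaining newline becomes a space
}

# Start from the input and feed each result into the next substitution.
output_var = input_var
for regex, replacement in pattern.items():
    output_var = re.sub(regex, replacement, output_var)

print(repr(output_var))
# 'she hies,And yokes her silver doves;   Holding their course '
```

The key difference from Code 3 in the question is that each substitution is applied to `output_var`, so the replacements accumulate instead of overwriting one another. Note that `\s` also matches newlines, so on the full text the first pattern may consume blank lines that precede a number.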