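I'm trying to do something I thought would be simple, but I'm having trouble with the regular expression.
Specifically, I want to find CAUGHT AN ERROR and everything after it on the same line, and replace it with CAUGHT AN ERROR: XXXXXX.
My understanding is that .*$ matches through to the end of the line, but my for loop does not produce the right replacement. How can I replace everything after the matched text?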
1970-01-01 10:59:02
1970-01-01 10:59:02
1970-01-01 10:59:01 CAUGHT AN ERROR: rmv: cannot remove '/media/^Red^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /)
1970-01-01 10:59:01 CAUGHT AN ERROR: rmv: cannot remove '/media/^Green^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /media/ug)
1970-01-01 10:59:02 CAUGHT AN ERROR: rmv: cannot remove '/media/^Blue^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /medi0349223^BradbuXXXXXX.jpg)
1970-01-01 10:59:02 CAUGHT AN ERROR: rmv: cannot remove '/media/^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write /media/usb0 XXXXXX.jpg)
1970-01-01 10:59:02 CAUGHT AN ERROR: rmv: cannot remove '/media/^Orange^XXXXXX.jpg': No such file or directory; FROM: exec rm [file join $drive $newFile] (in USB::Write )
1970-01-01 10:59:02
I saved the sample log above in a file and then ran the following code:
import re

with open(r'C:\Users\Downloads\LOG\sample.log', mode='r', encoding='utf8') as log_r:
    content = log_r.read()
    # Maps each search pattern to its replacement text
    dict_items = {r'CAUGHT AN ERROR: [A-Z|a-z|0-9|\.|\-|\,|\_|\{|\}|\)|\(|\/]*\+': r'CAUGHT AN ERROR: XXXXXX'}
    for k, v in dict_items.items():
        content = re.sub(k, v, content)
    print(content)
I also tried the following patterns in my dictionary, but neither worked:
r'CAUGHT AN ERROR: .\$'
r'CAUGHT AN ERROR: .*$'
Expected result:
1970-01-01 10:59:02
1970-01-01 10:59:02
1970-01-01 10:59:01 CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:01 CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02 CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02 CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02 CAUGHT AN ERROR: XXXXXX
1970-01-01 10:59:02
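r'CAUGHT AN ERROR: .*$'
is the correct regexp. However, you need to pass the re.MULTILINE flag so that $ matches at the end of each line rather than only at the end of the whole string.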
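To make the difference concrete, here is a minimal sketch that runs the same pattern with and without re.MULTILINE. The log text is a shortened, hypothetical stand-in for the sample in the question (the file paths are made up), and the variable names are only for illustration.

```python
import re

# Shortened, hypothetical log lines modeled on the sample in the question
log = (
    "1970-01-01 10:59:02\n"
    "1970-01-01 10:59:01 CAUGHT AN ERROR: rmv: cannot remove '/media/a.jpg': No such file or directory\n"
    "1970-01-01 10:59:02 CAUGHT AN ERROR: rmv: cannot remove '/media/b.jpg': No such file or directory\n"
    "1970-01-01 10:59:02\n"
)

pattern = r"CAUGHT AN ERROR: .*$"
replacement = "CAUGHT AN ERROR: XXXXXX"

# Without the flag, $ only matches at the very end of the string (or just
# before a final newline), and . never crosses a newline, so no error line
# in the middle of the text can match.
without_flag = re.sub(pattern, replacement, log)

# With re.MULTILINE, $ also matches just before every newline, so each
# error line is matched through to its own end.
with_flag = re.sub(pattern, replacement, log, flags=re.MULTILINE)

print(with_flag)
```

In the question's loop, this would mean calling re.sub(k, v, content, flags=re.MULTILINE). Because . does not match a newline by default, dropping the $ and using r'CAUGHT AN ERROR: .*' without any flag should give the same result on this input.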