Replacing multiple regular expression patterns

Posted 2024-09-26 22:08:50


I have a very long string in which I want to apply dozens of regular expression replacements, so I created a dictionary like this:

replacements = { r'\spunt(?!\s*komma)' : r".",
                 r'punt komma' : r",",
                 r'(?<!punt )komma' : r",",
                 "paragraaf" : "\n\n" }

The dictionary above is only a small selection of the patterns.

How can I apply this to a string document? Example string:

text = "a punt komma is in this case not a komma and thats it punt"

I tried something like this:

import re

def multiple_replace(replacements, text):
    # Create a regular expression from the dictionary keys.
    # re.escape makes every key a literal string, not a pattern.
    regex = re.compile("(%s)" % "|".join(map(re.escape, replacements.keys())))

    # For each match, look up the corresponding value in the dictionary
    return regex.sub(lambda mo: replacements[mo.group(0)], text)

if __name__ == "__main__":

    text = "Larry Wall is the creator of Perl"

    replacements = {
        "Larry Wall": "Guido van Rossum",
        "creator": "Benevolent Dictator for Life",
        "Perl": "Python",
    }

    print(multiple_replace(replacements, text))

But this only works for literal string replacements, not for regular expression patterns like the ones in my dictionary.


1 answer
User
#1 · Posted 2024-09-26 22:08:50

Iterate over the dictionary and perform a substitution with each key/value pair:

import re

replacements = { r'\spunt(?!\s*komma)' : r".",
                 r'punt komma' : r",",
                 r'(?<!punt )komma' : r",",
                 "paragraaf" : "\n\n" }

text = "a punt komma is in this case not a komma and thats it punt"
print(text)

for key, value in replacements.items():
    text = re.sub(key, value, text)

print(text)

This outputs:

a punt komma is in this case not a komma and thats it punt
a , is in this case not a , and thats it.
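The loop above re-scans the whole text once per pattern, so each pattern sees the output of the earlier ones and the dictionary's insertion order matters. A single-pass variant, closer to the question's multiple_replace but without re.escape, might look like the sketch below. It is not part of the original answer: the name multiple_replace_regex is hypothetical, and the sketch assumes that none of the patterns contains its own capturing group (lookarounds such as (?!...) and (?<!...) are fine), because it uses the index of the matched group to find the replacement.

```python
import re

def multiple_replace_regex(replacements, text):
    """Apply all regex replacements in one pass over the text.

    Assumption: no pattern in `replacements` contains a capturing group,
    so group i of the combined regex corresponds to the i-th pattern.
    """
    values = list(replacements.values())
    # Wrap each pattern in its own group and join them with alternation.
    combined = re.compile("|".join("(%s)" % p for p in replacements))
    # mo.lastindex is the 1-based index of the group that matched.
    return combined.sub(lambda mo: values[mo.lastindex - 1], text)

replacements = {
    r'\spunt(?!\s*komma)': ".",
    r'punt komma': ",",
    r'(?<!punt )komma': ",",
    "paragraaf": "\n\n",
}

text = "a punt komma is in this case not a komma and thats it punt"
print(multiple_replace_regex(replacements, text))
# a , is in this case not a , and thats it.
```

At each position the alternatives are tried from left to right, so earlier keys still take priority, but text produced by one replacement is never matched again by another.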

Note that you should probably put word boundaries (\b) around the keyword in each regex key, to avoid unintentionally matching substrings.
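As a rough illustration of the word-boundary advice, the sketch below compares the original keys with a variant that adds \b around the keywords. Where exactly the boundaries belong is an assumption here and depends on your real patterns; the names bounded and apply_replacements, and the second sample sentence, are made up for this example.

```python
import re

# Original keys from the answer.
plain = {
    r'\spunt(?!\s*komma)': ".",
    r'punt komma': ",",
    r'(?<!punt )komma': ",",
    "paragraaf": "\n\n",
}

# Same keys with \b added (assumed placement; adjust to your patterns).
bounded = {
    r'\spunt\b(?!\s*komma)': ".",
    r'\bpunt komma\b': ",",
    r'(?<!punt )\bkomma\b': ",",
    r'\bparagraaf\b': "\n\n",
}

def apply_replacements(text, replacements):
    # Apply each pattern in turn, in dictionary insertion order.
    for pattern, repl in replacements.items():
        text = re.sub(pattern, repl, text)
    return text

sample = "het puntje en de kommas"
print(apply_replacements(sample, plain))    # het.je en de ,s
print(apply_replacements(sample, bounded))  # het puntje en de kommas
```

On the example string from the question, both dictionaries give the same result; the difference only shows up when a keyword occurs inside a longer word.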
