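I am trying to use a regular expression to pick up the "order-preserving" pattern/words found in "field" and use it to find a match in "text". I want to write a regex that also finds partial matches, as in example 1 below.
Making the words optional might be one way to do it, but then the regex starts matching anything at all.
I would like help writing a function that takes "field", generates a regex from it, and then finds that pattern in "text". Partial matches are fine too.
Both input strings can be anything, so the regex should be generic enough to handle arbitrary input.
Please ask clarifying questions if needed! Any observations or pointers on where I am going wrong would be very helpful.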
def regexp(field, text):
    import re
    # split "field" on non-word characters to get the tokens
    key = re.split(r'\W', field)
    regex = "^.*"
    for x in key:
        if len(x) > 0:
            #regex += "("+x+")?"
            regex += x
            regex += ".*"
    regex = r'{}'.format(regex)
    pattern = re.compile(regex, re.IGNORECASE)
    matches = list(re.finditer(pattern, text))
    print(matches, "\n", pattern)
    if len(matches) > 0:
        return True
    else:
        return False
Examples:
print(regexp("F1 gearbox: 0-400 m", "f1 gearbox"))  # this should match
# this is a partial match; my regex should be able to find it
print(regexp("0-100 kmph", "100-0 kmph"))  # this should not match
# the order of characters/words in my regex and in the text should agree
print(regexp("F1 gearbox: 0-400 m", "none"))  # this should not match
# if I try to use "(word)?" in my regex, everything becomes optional
# and it starts to match random words like "none", "sbhsuckjcsak", etc.
# this is obviously not expected
print(regexp("Combined* (ECE+EUDC) (l/100 km)", "combined ece eudc"))  # this should match
# because it's a partial match, and special characters are not important
# for my matching use case
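The problem with optional groups mentioned in the comments can be reproduced with a small sketch. The helper name `all_optional_pattern` is hypothetical and only illustrates the commented-out `"("+x+")?"` variant: since every group may match the empty string, the resulting pattern accepts any input, including an empty one.

```python
import re

def all_optional_pattern(field):
    # Hypothetical helper: wraps every token of `field` in an optional group,
    # mirroring the commented-out line in the function above.
    tokens = [t for t in re.split(r'\W+', field) if t]
    body = ".*".join("({})?".format(re.escape(t)) for t in tokens)
    return re.compile("^.*" + body + ".*", re.IGNORECASE)

pattern = all_optional_pattern("F1 gearbox: 0-400 m")
for candidate in ["f1 gearbox", "none", "sbhsuckjcsak", ""]:
    # Every group is optional and ".*" can match nothing,
    # so each candidate produces a match.
    print(repr(candidate), bool(pattern.search(candidate)))
```

This is why at least the tokens taken from the shorter string have to stay mandatory.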
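Your function already returns the correct values for the examples you posted. You only need to sort out the order of "text" and "field". I also made the code shorter and (at least in my opinion) easier to read: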
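A minimal sketch of what such a shorter version could look like. It assumes that "sorting out the order" means building the pattern from the partial string (`text`) and searching for it in `field`; the function name `ordered_partial_match` is hypothetical and not taken from the original answer.

```python
import re

def ordered_partial_match(field, text):
    # Assumption: the pattern is built from `text` (the partial string)
    # and searched for in `field`, i.e. the roles are swapped compared
    # with the function in the question.
    tokens = [re.escape(t) for t in re.split(r'\W+', text) if t]
    if not tokens:
        return False  # nothing to look for
    # Tokens must appear in the same order, with anything in between.
    pattern = re.compile(".*".join(tokens), re.IGNORECASE)
    return pattern.search(field) is not None

print(ordered_partial_match("F1 gearbox: 0-400 m", "f1 gearbox"))
print(ordered_partial_match("0-100 kmph", "100-0 kmph"))
print(ordered_partial_match("F1 gearbox: 0-400 m", "none"))
print(ordered_partial_match("Combined* (ECE+EUDC) (l/100 km)", "combined ece eudc"))
```

Note that the tokens are matched as substrings, so a short token such as `0` would also match inside `400`. If whole-word matching is wanted, wrapping each token in `\b` is one way to tighten the pattern.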