How can I match exact strings from a list against a larger string, taking spaces into account?

Posted 2024-10-03 06:21:19


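I have a large list of strings, and I want to check whether each one occurs in a larger string. The list contains single-word strings as well as multi-word strings. To do this, I wrote the following code: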

example_list = ['pain', 'chestpain', 'headache', 'sickness', 'morning sickness']
example_text = "The patient has kneepain as wel as a headache"

emptylist = []
for i in example_text:
    res = [ele for ele in example_list if(ele in i)]
    emptylist.append(res)

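The problem here is that 'pain' is also added to emptylist, which it should not be: I only want an item from the example list to be added when it matches the text exactly. I also tried using sets: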

word_set = set(example_list)
phrase_set = set(example_text.split())
word_set.intersection(phrase_set)

However, this splits 'morning sickness' into 'morning' and 'sickness'. Does anyone know the right way to solve this?


3 answers

Using PyParsing:

import pyparsing as pp

example_list = ['pain', 'chestpain', 'headache', 'sickness', 'morning sickness']
example_text = "The patient has kneepain as wel as a headache morning sickness"

list_of_matches = []

for word in example_list:
  rule = pp.OneOrMore(pp.Keyword(word))
  for t, s, e in rule.scanString(example_text):
    if t:
      list_of_matches.append(t[0])

print(list_of_matches)

This produces:

['headache', 'sickness', 'morning sickness']

You should be able to use a regular expression with word boundaries:

>>> import re
>>> [word for word in example_list if re.search(r'\b{}\b'.format(word), example_text)]
['headache']

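This will not match 'pain' inside 'kneepain', because 'pain' does not start at a word boundary there. It will, however, correctly match substrings that contain spaces.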
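One caveat with formatting each list entry straight into the pattern is that any regex metacharacters in an entry would be read as pattern syntax. The following is a minimal sketch of the same word-boundary idea with each entry passed through `re.escape` first; the helper name `find_exact_terms` is hypothetical and not part of the answer above.

```python
import re

def find_exact_terms(terms, text):
    """Return the terms that occur in text as whole words or phrases."""
    found = []
    for term in terms:
        # re.escape makes characters such as '.' or '+' match literally.
        pattern = r'\b{}\b'.format(re.escape(term))
        if re.search(pattern, text):
            found.append(term)
    return found

example_list = ['pain', 'chestpain', 'headache', 'sickness', 'morning sickness']
example_text = "The patient has kneepain as wel as a headache"
print(find_exact_terms(example_list, example_text))  # ['headache']
```

Note that `\b` only behaves as intended when an entry begins and ends with a word character (a letter, digit, or underscore).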

Other members have already provided good examples in this thread.

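My difficulty was with text in which 'pain' matches more than once. I also wanted to know where in the text each match occurs. I ended up with the following code.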

I used the following sentence:

"The patient has not only kneepain but headache and arm pain, stomach pain and sickness"

import re
from collections import defaultdict

example_list = ['pain', 'chestpain', 'headache', 'sickness', 'morning sickness']
example_text = "The patient has not only kneepain but headache and arm pain, stomach pain and sickness"

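# Map each term to the start offsets of its whole-word matches.
match_positions = defaultdict(list)
for term in example_list:
    # re.finditer returns an iterator that is always truthy, so no emptiness check is needed.
    # re.escape keeps any regex metacharacters in a term literal.
    for match in re.finditer(r'\b%s\b' % re.escape(term), example_text):
        match_positions[term].append(match.start())

print(dict(match_positions))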

This gives the following output:

{'pain': [55, 69], 'headache': [38], 'sickness': [78]}
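Searching for each term separately means that a text containing 'morning sickness' would also report 'sickness' at the same place. If that is not wanted, one possible variation is a single pass over one combined pattern with the longest terms listed first. This is only a sketch under that assumption; the function name `find_term_positions` and the sample sentence are illustrative and not from the answers.

```python
import re
from collections import defaultdict

def find_term_positions(terms, text):
    """Map each matched term to the start offsets of its whole-word matches."""
    # Longest terms first, so 'morning sickness' is tried before 'sickness'.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(r'\b(?:%s)\b' % '|'.join(re.escape(t) for t in ordered))
    positions = defaultdict(list)
    # finditer yields non-overlapping matches, scanning left to right.
    for match in pattern.finditer(text):
        positions[match.group()].append(match.start())
    return dict(positions)

text = "morning sickness, then pain and more pain"
print(find_term_positions(['pain', 'sickness', 'morning sickness'], text))
# {'morning sickness': [0], 'pain': [23, 37]}
```

Because matches do not overlap, the 'sickness' inside 'morning sickness' is consumed by the longer phrase and is not reported on its own.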
