Searching a list: match only exact words/strings

Posted 2024-05-06 00:10:12


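How can I match exact strings/words when searching a list? I tried the code below, but it does not behave correctly. Here are my sample list, my code, and the test results.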

list = ['Hi, hello', 'hi mr 12345', 'welcome sir']

My code:

for str in list:
  if s in str:      # substring test, so "123" also matches "12345"
    print(str)

Test results:

s = "hello" ~ expected output: 'Hi, hello' ~ output I get: 'Hi, hello'
s = "123" ~ expected output: *nothing* ~ output I get: 'hi mr 12345'
s = "12345" ~ expected output: 'hi mr 12345' ~ output I get: 'hi mr 12345'
s = "come" ~ expected output: *nothing* ~ output I get: 'welcome sir'
s = "welcome" ~ expected output: 'welcome sir' ~ output I get: 'welcome sir'
s = "welcome sir" ~ expected output: 'welcome sir' ~ output I get: 'welcome sir'

My list contains more than 200K strings.


3 answers
>>> l = ['Hi, hello', 'hi mr 12345', 'welcome sir']
>>> search = lambda word: list(filter(lambda x: word in x.split(), l))
>>> search('123')
[]
>>> search('12345')
['hi mr 12345']
>>> search('hello')
['Hi, hello']

If you are searching for an exact match:

for str in list:
  # matches when the query and the string share at least one whole word
  if set(s.split()) & set(str.split()):
    print(str)
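The set intersection above matches as soon as one word is shared, so a multi-word query such as "welcome madam" would still return 'welcome sir'. A minimal sketch of a stricter variant, assuming every word of the query must be present; the names `matches_all_words` and `search_all_words` are made up for illustration, and word order is not checked:

```python
def matches_all_words(query, text):
    """Return True when every whitespace-separated word of query occurs in text."""
    words = set(query.split())
    # An empty query would be a subset of anything, so reject it explicitly.
    return bool(words) and words <= set(text.split())


def search_all_words(query, items):
    """Return the items that contain all words of the query."""
    return [text for text in items if matches_all_words(query, text)]


items = ['Hi, hello', 'hi mr 12345', 'welcome sir']
print(search_all_words('welcome sir', items))
print(search_all_words('welcome madam', items))
```

Like the original snippet, this splits on whitespace only, so 'Hi,' keeps its comma and a query of 'Hi' finds nothing.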

It looks like you need to run this search more than once, so I suggest converting the list into a dictionary:

>>> l = ['Hi, hello', 'hi mr 12345', 'welcome sir']
>>> d = dict()
>>> for item in l:
...     for word in item.split():
...         d.setdefault(word, list()).append(item)
...

So now you can easily do:

>>> d.get('hi')
['hi mr 12345']
>>> d.get('come')    # nothing
>>> d.get('welcome')
['welcome sir']

p.s. You will probably need to improve item.split() so that it handles commas, periods and other separators. A regex with \w could be used for that.
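One way the suggested improvement might look, as a sketch only: `re.findall(r"\w+", ...)` replaces `split()` so punctuation no longer sticks to words. The helper name `build_index` is hypothetical, and keeping the lookup case-sensitive is an assumption carried over from the dictionary example.

```python
import re
from collections import defaultdict


def build_index(items):
    """Map each word (a run of \\w characters) to the items that contain it."""
    index = defaultdict(list)
    for item in items:
        # set() avoids listing the same item twice when a word repeats in it.
        for word in set(re.findall(r"\w+", item)):
            index[word].append(item)
    return index


items = ['Hi, hello', 'hi mr 12345', 'welcome sir']
index = build_index(items)
print(index.get('Hi', []))
print(index.get('come', []))
```

Building the index costs one pass over the list; each later lookup is a single dictionary access, which matters for a list of 200K strings that is searched repeatedly. Note that \w also matches digits and underscores.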

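p.p.s. As another commenter (库拉里安) pointed out, this will not match "welcome sir". If you want to match the whole string, that is just one more line added to the proposed solution. But if you have to match a part of the string that is bounded by spaces and punctuation, regex should be your choice.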
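A minimal sketch of the regex approach mentioned here, assuming "bounded by spaces and punctuation" means that no word character may directly precede or follow the match. The function name `search_exact` is hypothetical. Lookarounds are used instead of \b so that a query beginning or ending with punctuation still behaves sensibly.

```python
import re


def search_exact(query, items):
    """Return the items in which query occurs as a whole word or phrase."""
    if not query:
        return []
    # re.escape makes the query literal; the lookarounds forbid a word
    # character immediately before or after the matched text.
    pattern = re.compile(r"(?<!\w)" + re.escape(query) + r"(?!\w)")
    return [text for text in items if pattern.search(text)]


items = ['Hi, hello', 'hi mr 12345', 'welcome sir']
for query in ['hello', '123', '12345', 'come', 'welcome', 'welcome sir']:
    print(query, '->', search_exact(query, items))
```

This scans the whole list for every query, so for many lookups over 200K strings a prebuilt word index is likely faster for single-word queries, with the regex reserved for multi-word phrases.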
