Matching multiple words from a list against a string

3 answers

User

Answer 1 · edited 2024-10-01 01:33:30

[i for i in keywords if i in x]

Edit: this is what you want.
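A minimal sketch of how that comprehension behaves, assuming x holds the user's sentence and keywords is a list like the one used in the other answers (the sample sentence is only an illustration). Keep in mind that `in` on strings is a case-sensitive substring test, so the keyword " car accident", with its leading space, only matches when a space comes before it in the text.

```python
keywords = ["freeway", "doesn't turn on", "dropped", "got sick",
            "traffic jam", " car accident"]
x = "there is a car accident on the freeway so that why I am late for the show."

# Keep every keyword that occurs as a substring of x.
# The result follows the order of the keywords list, not the order in the text.
matched = [i for i in keywords if i in x]
print(matched)
```

If the order in which the words appear in the text matters, a regex-based approach is probably the better fit.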

User

Answer 2 · edited 2024-10-01 01:33:30

You can use sets to find which of the strings entered by the user also appear in the keywords.

Check the following code:

keywords= ["freeway", "doesn't turn on", "dropped", "got sick", "traffic jam", " car accident"]

user_strings = []

while True:
    x = input("Enter a string?")
    if x == 'exit':
        break
    user_strings.append(x)

print ("User strings = %s" %(user_strings))
print ("keywords = %s" %(keywords))

print ("Matched Words = %s" %(list(set(keywords) & set(user_strings))))

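Output (under Python 3, type the inputs without quotes; the order of the matched words may vary because sets are unordered):

Enter a string?doesn't turn on
Enter a string?freeway
Enter a string?Hello
Enter a string?World
Enter a string?exit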
User strings = ["doesn't turn on", 'freeway', 'Hello', 'World']
keywords = ['freeway', "doesn't turn on", 'dropped', 'got sick', 'traffic jam', ' car accident']
Matched Words = ['freeway', "doesn't turn on"]
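One thing to be aware of: a set intersection compares whole strings, so it only reports user strings that are exactly equal to a keyword. The sketch below illustrates this under the assumption that the user might also type a full sentence; the sentence itself is made up for the example.

```python
keywords = ["freeway", "doesn't turn on", "dropped", "got sick",
            "traffic jam", " car accident"]

# Exact match: each user string must equal a keyword in full.
user_strings = ["doesn't turn on", "freeway", "Hello", "World"]
exact = set(keywords) & set(user_strings)

# A whole sentence is not equal to any keyword, so nothing is found,
# even though the sentence contains the word "freeway".
sentence = "I was stuck on the freeway"
in_sentence = set(keywords) & {sentence}

# sorted() gives a stable order for display, since sets are unordered.
print(sorted(exact))
print(in_sentence)
```

So this approach suits input that is entered one keyword at a time, and a substring or regex search is likely needed for free-form sentences.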

User

Answer 3 · edited 2024-10-01 01:33:30

You can use the tried-and-true re library.

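import re
from collections import OrderedDict

def get_matches(s, keys, include_duplicates=False):
    # Build one alternation of the escaped keys. Flags must go to re.compile,
    # because the second argument of pattern.findall is a start position.
    pattern = re.compile('|'.join(map(re.escape, keys)), re.IGNORECASE)
    all_matches = pattern.findall(s)

    if not include_duplicates:
        # Drop duplicates while keeping first-seen order.
        all_matches = list(OrderedDict.fromkeys(all_matches))
    return all_matches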

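This is quite versatile: you don't have to worry about matches coming back out of order (thanks to dict.fromkeys), and you can optionally include duplicates in the result.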
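A detail worth knowing when matching case-insensitively: flags belong in re.compile (or in the module-level re.findall). On a compiled pattern, the second positional argument of findall is a start offset, not a flag. The sketch below uses made-up keys and text purely to show the difference.

```python
import re

keys = ["Freeway", "traffic jam"]
text = "FREEWAY closed after a Traffic Jam"

# Flags passed to re.compile apply to every later search with the pattern.
pattern = re.compile("|".join(map(re.escape, keys)), re.IGNORECASE)
matches = pattern.findall(text)

# Here re.IGNORECASE (integer value 2) is read as a start position,
# so the search begins at index 2 and stays case-sensitive.
case_sensitive = re.compile("|".join(map(re.escape, keys)))
skipped = case_sensitive.findall("Freeway", re.IGNORECASE)

print(matches)
print(skipped)
```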

Explanation

What I do with re is build a pattern from every string in keywords (keys), separated by a |; this tells re to look for any of the keys.

re.findall returns the matches in the order described in the docs:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found.

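findall does not remove duplicates, which is why the include_duplicates parameter is there for the cases where you want them. You could convert the result to a set to drop duplicates, but that would lose the ordering, so I use collections.OrderedDict and convert it back to a list.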
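To make the ordering point concrete, here is a small sketch with a hypothetical list of matches. On Python 3.7 and later, a plain dict.fromkeys preserves insertion order as well, so OrderedDict is mostly needed for older versions.

```python
from collections import OrderedDict

# Hypothetical findall result containing a repeated match.
all_matches = ["freeway", " car accident", "freeway", "dropped"]

# fromkeys keeps the first occurrence of each key, in insertion order.
unique_in_order = list(OrderedDict.fromkeys(all_matches))

# A set also removes duplicates but makes no promise about order.
unique_unordered = set(all_matches)

print(unique_in_order)
```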

In use:

text = "there is a car accident on the freeway so that why I am late for the show."
keywords= {
  "freeway",
  "doesn't turn on",
  "dropped",
  "got sick",
  "traffic jam",
  " car accident"}
matches = get_matches(text, keywords)
print(f"the list of matched words are: {', '.join(matches)}")
# the list of matched words are:  car accident, freeway

You can try it yourself at https://repl.it/repls/AbleEssentialDribbleware.

Edit

As you asked in the comments:

To explain what this line does:

pattern = re.compile('|'.join(map(re.escape, keys)))

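re.compile - builds a regular expression pattern from a string. (see the docs)
join - takes an iterable of strings and joins them into one string, with the elements separated by the string it is called on (here '|'). (see the docs)
map & re.escape - you could leave these out for your case, but in case you or anyone reading this searches for more complex keywords, this takes each keyword and escapes re's special metacharacters. (see the docs: map, re.escape)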

The line can be rewritten without map and re.escape and will still work fine, like this:

pattern = re.compile('|'.join(keys))

Just be aware that your keywords then cannot contain characters such as ( or *.
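A short sketch of what can go wrong without escaping, using two made-up keywords that contain metacharacters. Unescaped, $ is read as an end-of-string anchor and the parentheses as a group, so the pattern compiles but quietly matches nothing in this text.

```python
import re

# Hypothetical keywords containing regex metacharacters.
keys = ["$5", "late (again)"]
text = "paid $5 and was late (again)"

# Escaped: every character of each keyword is matched literally.
escaped = re.compile("|".join(map(re.escape, keys)))
print(escaped.findall(text))

# Unescaped: "$" anchors to the end of the string and "(again)" becomes
# a capture group, so neither keyword is found as written.
unescaped = re.compile("|".join(keys))
print(unescaped.findall(text))
```

Other characters, such as an unbalanced (, make the unescaped pattern fail to compile at all, so keeping re.escape is the safer default.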
