Answers to: Matching multiple words from a list against a string



1 answer

Anonymous, 1 day ago


You can use the tried-and-true <a href="https://docs.python.org/3/library/re.html" rel="nofollow noreferrer">re</a> library. <pre><code>import re
from collections import OrderedDict

def get_matches(s, keys, include_duplicates=False):
    # Escape each key so it is matched literally, then OR the keys together.
    # The flag belongs in re.compile: the second positional argument of a
    # compiled pattern's findall() is a start position, not a flags value.
    pattern = re.compile('|'.join(map(re.escape, keys)), re.IGNORECASE)
    all_matches = pattern.findall(s)
    if not include_duplicates:
        # OrderedDict.fromkeys drops duplicates but keeps first-seen order.
        all_matches = list(OrderedDict.fromkeys(all_matches))
    return all_matches
</code></pre> This is quite versatile, because you don't have to worry about matches coming back out of order (thanks to <code>OrderedDict.fromkeys</code>), and you can optionally include duplicates in the result. <hr/> <h2>Explanation</h2> What I do with <code>re</code> is build a pattern from every string in <code>keywords</code> (<code>keys</code>), separated by a <code>|</code>. This tells <code>re</code> to look for any of the keys. <a href="https://docs.python.org/3/library/re.html#re.findall" rel="nofollow noreferrer">re.findall</a> returns the matches in the order described in the docs: <blockquote> Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. </blockquote> <code>findall</code> returns every occurrence, duplicates included, which is why the <code>include_duplicates</code> parameter is there for the cases where you want them. You could convert the result to a set to remove duplicates, but that would lose the ordering, so I use <a href="https://docs.python.org/3/library/collections.html" rel="nofollow noreferrer">collections.OrderedDict</a> and convert it back to a list. <hr/> <h2>In use:</h2> <pre><code>text = "there is a car accident on the freeway so that why I am late for the show."
keywords = {"freeway", "doesn't turn on", "dropped", "got sick", "traffic jam", "car accident"}
matches = get_matches(text, keywords)
print(f"the list of matched words are: {', '.join(matches)}")
# the list of matched words are: car accident, freeway
</code></pre> You can try it yourself at <a href="https://repl.it/repls/AbleEssentialDribbleware" rel="nofollow noreferrer">https://repl.it/repls/AbleEssentialDribbleware</a>. <hr/> <h2>Edit</h2> As you requested in the comments, here is what this line does: <pre><code>pattern = re.compile('|'.join(map(re.escape, keys)), re.IGNORECASE)
</code></pre> <ul> <li><code>re.compile</code> builds a regular expression pattern object from a string; passing <code>re.IGNORECASE</code> makes the matching case-insensitive (<a href="https://docs.python.org/3/library/re.html#re.compile" rel="nofollow noreferrer">see the docs</a>).</li> <li><code>join</code> takes an iterable of strings and concatenates them into a single string, using the string it is called on (here <code>'|'</code>) as the separator (<a href="https://docs.python.org/3/library/stdtypes.html#str.join" rel="nofollow noreferrer">see the docs</a>).</li> <li><code>map</code> &amp; <code>re.escape</code>: you could leave these out for your case, but if you or anyone else reading this is searching for more complex keywords, this takes each keyword and escapes the special metacharacters of <code>re</code> (see the docs: <a href="https://docs.python.org/3/library/functions.html#map" rel="nofollow noreferrer">map</a>, <a href="https://docs.python.org/3/library/re.html#re.escape" rel="nofollow noreferrer">re.escape</a>).</li> </ul> The line can be rewritten without <code>map</code> and <code>re.escape</code> and will still work for these keywords: <pre><code>pattern = re.compile('|'.join(keys), re.IGNORECASE)
</code></pre> Just be aware that your keywords then must not contain characters such as <code>(</code> or <code>*</code>.
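To make the last caveat concrete, here is a minimal sketch of what can go wrong when keywords are joined without <code>re.escape</code>. The keywords and the sample text are made up for illustration and are not from the original question.

```python
import re

# Hypothetical keywords containing regex metacharacters ("." and "+").
keys = ["a.b", "1+1"]
text = "a.b axb 1+1 11"

# Escaped: every keyword is matched literally.
escaped = re.compile("|".join(map(re.escape, keys)))
literal_matches = escaped.findall(text)

# Unescaped: "." matches any character and "+" acts as a quantifier,
# so "axb" and "11" match while the literal "1+1" is missed.
unescaped = re.compile("|".join(keys))
loose_matches = unescaped.findall(text)

# An unbalanced "(" in a keyword does not even compile.
try:
    re.compile("|".join(["got sick", "("]))
    compiled = True
except re.error:
    compiled = False

print(literal_matches)  # ['a.b', '1+1']
print(loose_matches)    # ['a.b', 'axb', '11']
print(compiled)         # False
```

If the keywords come from user input or may contain punctuation, keeping <code>re.escape</code> in place is probably the safer default.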
