如何使用python中的字符串列表在的段落中进行精确匹配问题的回答

如何使用python中的字符串列表在的段落中进行精确匹配

回答此问题可获得 20 贡献值，回答如果被采纳可获得 50 分。

0 条评论
分类：Python问答

默认排序时间排序

1 个回答

匿名 1天前

　擅长：python、mysql、java

听起来您基本上希望匹配的开始和结束要么是段落的结尾，要么是到空格字符的转换（“单词”的结尾，尽管遗憾的是，单词的正则表达式定义排除了像<code>.</code>这样的内容，所以您不能使用基于<code>\b</code>的测试） 这里最简单的方法是用空格分割行，然后查看您的字符串是否出现在结果<code>list</code>（使用<a href="https://stackoverflow.com/q/10106901/364696">finding a sublist in a ^{<cd3>}</a>上的一些变体）： <pre><code>def list_contains_sublist(haystack, needle): firstn, *restn = needle # Extracted up front for efficiency for i, x in enumerate(haystack, 1): if x == firstn and haystack[i:i+len(restn)] == restn: return True return False para_words = paragraph.split() def checkIfProdExist(x): return list_contains_sublist(para_words, x.split()) </code></pre> 如果您也需要索引，或者需要精确的空格匹配，那么它就更复杂了（<code>.split()</code>不会保留空格的运行，因此您无法重建索引，如果您对整个字符串进行索引，并且子字符串出现两次，但只有第二次满足您的要求，那么您可能会得到错误的索引）。在这一点上，我可能会使用正则表达式： <pre><code>import re def checkIfProdExist(x): m = re.search(fr'(^|\s){re.escape(x)}(?=\s|$)', paragraph) if m: return m.end(1) # After the matched space, if any return -1 # Or omit return for implicit None, or raise an exception, or whatever </code></pre> 请注意，如前所述，这不适用于<code>filter</code>（如果段落以子字符串开头，则返回<code>0</code>，即falsy）。您可能会让它在失败时返回<code>None</code>，在成功时返回<code>tuple</code>个索引，因此它在布尔值和索引要求较高的情况下都有效，例如（演示海象使用3.8+的乐趣）： <pre><code>def checkIfProdExist(x): if m := re.search(fr'(?:^|\s)({re.escape(x)})(?=\s|$)', paragraph): return m.span(1) # We're capturing match directly to get end of match easily, so we stop capturing leading space and just use span of capture # Implicitly returns falsy None on failure </code></pre>

如何使用python中的字符串列表在的段落中进行精确匹配

1 个回答

相关Python问题