How do I find a needle in a haystack? Is there a better solution?

User

Reply #1 · Edited 2024-10-02 16:33:50

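I don't think you can get below O(n) here, because you need to iterate over the string at least once. You can still make some optimizations.

I assume you want to match "whole words". For example, searching for foo should match as follows: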

foo and foo, or foobar and not foo.
^^^     ^^^                    ^^^

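So simply splitting on spaces won't work, because punctuation stays attached to the neighboring word ('foo,' and 'foo.' are not equal to 'foo').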
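A quick way to see the problem is to split the sample sentence on spaces and inspect the tokens. This is only an illustrative sketch that reuses the example sentence from above:

```python
# Split the sample sentence on single spaces and inspect the tokens.
haystack = 'foo and foo, or foobar and not foo.'
tokens = haystack.split(' ')
print(tokens)

# Punctuation stays attached to 'foo,' and 'foo.', so only the first
# occurrence compares equal to the bare word 'foo'.
print(tokens.count('foo'))
```

Only one of the three whole-word occurrences is counted, which is why a word-boundary match is needed.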

This is where the re module comes in handy: it lets you build more interesting conditions. For example, \b in a regular expression means:

Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of Unicode alphanumeric or underscore characters, so the end of a word is indicated by whitespace or a non-alphanumeric, non-underscore Unicode character. Note that formally, \b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string. This means that r'\bfoo\b' matches 'foo', 'foo.', '(foo)', 'bar foo baz' but not 'foobar' or 'foo3'.

So r'\bfoo\b' will match only the whole word foo. Also, don't forget to use re.escape():

>>> import re
>>> re.escape('foo.bar+')
'foo\\.bar\\+'
>>> r'\b{}\b'.format(re.escape('foo.bar+'))
'\\bfoo\\.bar\\+\\b'
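To illustrate why escaping matters, here is a small sketch comparing an unescaped and an escaped needle. The sample text 'foo.bar fooxbar' is made up for this example and is not from the question:

```python
import re

# Hypothetical sample text, chosen so the two patterns disagree.
text = 'foo.bar fooxbar'

# Unescaped: '.' is a wildcard, so the pattern also matches 'fooxbar'.
unescaped = re.findall(r'\bfoo.bar\b', text)

# Escaped: '.' is treated literally, so only 'foo.bar' matches.
escaped = re.findall(r'\b{}\b'.format(re.escape('foo.bar')), text)

print(unescaped)
print(escaped)
```

Without re.escape(), any needle containing regex metacharacters could silently match more than intended.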

Now just use re.finditer() to scan the string. According to the documentation:

Return an iterator yielding match objects over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result unless they touch the beginning of another match.

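I assume the matches are generated on the fly, so they never all have to be held in memory at once (which can come in handy with large strings and many matches). Finally, just count them: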

>>> r = re.compile(r'\bfoo\b')
>>> it = r.finditer('foo and foo, or foobar and not foo.')
>>> sum(1 for _ in it)
3
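Putting the pieces together, the approach could be wrapped in a function along these lines. This is a minimal sketch: the name count_whole_word is made up for illustration, and it assumes the needle starts and ends with a word character, since \b only marks a boundary next to one:

```python
import re

def count_whole_word(needle, haystack):
    """Count whole-word occurrences of needle in haystack."""
    # Escape the needle so regex metacharacters are matched literally.
    pattern = re.compile(r'\b{}\b'.format(re.escape(needle)))
    # finditer returns an iterator, so no list of matches is built.
    return sum(1 for _ in pattern.finditer(haystack))

print(count_whole_word('foo', 'foo and foo, or foobar and not foo.'))
```

This is still a single left-to-right pass over the string, so the cost remains linear in the length of the haystack.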

User

Reply #2 · Edited 2024-10-02 16:33:50

You can use Counter:

from collections import Counter

def find_needle(n, h):
    # Count whitespace-separated tokens in h, then look up n.
    return Counter(h.split())[n]
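As a rough usage sketch (the inputs here are hypothetical and are not the answer's original demo), the function can be called like this. Note that it counts whitespace-separated tokens, so a needle followed by punctuation is not counted:

```python
from collections import Counter

def find_needle(n, h):
    # Count whitespace-separated tokens in h, then look up n.
    return Counter(h.split())[n]

# Hypothetical inputs for illustration only.
print(find_needle('foo', 'foo bar foo baz'))
# 'foo,' and 'foo.' are different tokens from 'foo'.
print(find_needle('foo', 'foo and foo, or foobar and not foo.'))
# A missing needle yields 0, because Counter returns 0 for unknown keys.
print(find_needle('qux', 'foo bar foo baz'))
```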


User

Reply #3 · Edited 2024-10-02 16:33:50

This doesn't address the complexity issue, but it does simplify the code:

def find_needle(n, h):
    # Split h on whitespace and count the tokens equal to n.
    return h.split().count(n)
