擅长:python、mysql、java
<p>一种解决方案是使用两个索引(<code>start</code>和<code>stop</code>)迭代文档。您只需跟踪<code>searchTerms</code>中的每一个在<code>start</code>和{<cd2>}之间的数量。如果不是所有的都存在,则增加<code>stop</code>直到它们都出现(或者到达文档末尾)。当全部存在时,增加<code>start</code>,直到所有<code>searchTerms</code>不再存在之前。每当所有的<code>searchTerms</code>出现时,您都要检查该候选者是否比以前的候选者更好。这应该能够在<code>O(N)</code>时间内完成(搜索词的数量有限,或者搜索词被放入一个<code>O(1)</code>查找的集合中)。比如:</p>
<pre><code>start = 0
stop = 0
counts = dict()
cand_start = None
cand_end = None
while stop < len(document):
if len(counts) < len(searchTerms):
term = document[stop]
if term in searchTerms:
if term not in counts:
counts[term] = 1
else:
counts[term] += 1
else:
if cand_start is None or stop-start < cand_stop-cand_start:
cand_start = start
cand_stop = stop
term = document[start]
if term in counts:
if counts[start] == 1:
del counts[start]
else:
counts[start] -= 1
start += 1
</code></pre>