查找字符串中每个子字符串的所有匹配项问题的回答

查找字符串中每个子字符串的所有匹配项

回答此问题可获得 20 贡献值，回答如果被采纳可获得 50 分。

0 条评论
分类：Python问答

默认排序时间排序

1 个回答

匿名 1天前

　擅长：python、mysql、java

虽然jurgenreza已经解释了为什么您的程序不起作用，但解决方案仍然相当缓慢。如果您只检查子串<code>s</code>，并且知道<code>s[:-1]</code>重复出现，那么您将得到一个更快的解决方案（通常快100倍甚至更多）： <pre><code>from collections import defaultdict def pfind(prefix, sequences): collector = defaultdict(list) for sequence in sequences: collector[sequence[0]].append(sequence) for item, matching_sequences in collector.items(): if len(matching_sequences) >= 2: new_prefix = prefix + item yield (new_prefix, len(matching_sequences)) for r in pfind(new_prefix, [sequence[1:] for sequence in matching_sequences]): yield r def find_repeated_substrings(s): s0 = s + " " return pfind("", [s0[i:] for i in range(len(s))]) </code></pre> 如果你想要口述，你可以这样称呼它： ^{pr2}$ 在我的机器上，运行2247个元素需要0.02秒，而原始的（修正的）解决方案需要12.72秒。在 （请注意，这是一个相当幼稚的实现；使用索引而不是子字符串应该更快。） 编辑：以下变量适用于其他序列类型（不仅仅是字符串）。而且，它不需要哨兵。在 <pre><code>from collections import defaultdict def pfind(s, length, ends): collector = defaultdict(list) if ends[-1] >= len(s): del ends[-1] for end in ends: if end < len(s): collector[s[end]].append(end) for key, matching_ends in collector.items(): if len(matching_ends) >= 2: end = matching_ends[0] yield (s[end - length: end + 1], len(matching_ends)) for r in pfind(s, length + 1, [end + 1 for end in matching_ends if end < len(s)]): yield r def find_repeated_substrings(s): return pfind(s, 0, list(range(len(s)))) </code></pre> 这仍然存在很长的子串将超过递归深度的问题。您可能需要捕获异常。在

查找字符串中每个子字符串的所有匹配项

1 个回答

相关Python问题