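<p>Here are several solutions. All of them walk the supersequence and the subsequence in parallel, taking linear time and constant memory.</p>
<p>Using your simple example:</p>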
<pre><code>full = iter(range(1, 16))
skip = iter([3,5,6,8,10,11])
</code></pre>
<p>Solution 0 (the one I thought of last, but should have done first):</p>
<pre><code>s = next(skip, None)
for x in full:
    if x == s:
        s = next(skip, None)
    else:
        print(x)
</code></pre>
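<p>If you would rather reuse Solution 0 than print directly, it can be wrapped in a generator. This is only a sketch: the name <code>without_skipped</code> is mine, and it assumes that <code>skip</code> is a subsequence of <code>full</code> and that <code>None</code> never occurs in <code>full</code>, since <code>None</code> marks "nothing left to skip".</p>
<pre><code>def without_skipped(full, skip):
    """Yield the items of `full` that are not listed in `skip`.

    Assumes `skip` is a subsequence of `full` (same relative order).
    """
    full, skip = iter(full), iter(skip)
    s = next(skip, None)  # None means there is nothing left to skip
    for x in full:
        if x == s:
            s = next(skip, None)  # drop x and advance to the next skip value
        else:
            yield x

result = list(without_skipped(range(1, 16), [3, 5, 6, 8, 10, 11]))
print(result)  # [1, 2, 4, 7, 9, 12, 13, 14, 15]
</code></pre>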
<p>Solution 1:</p>
<pre><code>from heapq import merge
from itertools import groupby
for x, g in groupby(merge(full, skip)):
    if len(list(g)) == 1:
        print(x)
</code></pre>
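<p>To see why Solution 1 works, it may help to look at the intermediate values on a shorter input. The sketch below materializes them with <code>list()</code> purely for display (the loop above stays lazy). The shortened <code>full</code> and <code>skip</code> and the names <code>merged</code>, <code>sizes</code> and <code>kept</code> are mine.</p>
<pre><code>from heapq import merge
from itertools import groupby

full = range(1, 8)
skip = [3, 5, 6]

# merge() interleaves two sorted inputs, so a skipped value appears twice in a row.
merged = list(merge(full, skip))
# groupby() collapses adjacent equal values; each group has size 1 or 2.
sizes = [(x, len(list(g))) for x, g in groupby(merged)]
# Values seen only once were never in skip.
kept = [x for x, n in sizes if n == 1]

print(merged)  # [1, 2, 3, 3, 4, 5, 5, 6, 6, 7]
print(kept)    # [1, 2, 4, 7]
</code></pre>
<p>Note that this one relies on both inputs being sorted in ascending order, which <code>heapq.merge</code> requires. The other solutions only need the two sequences to be in the same relative order.</p>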
<p>Solution 2:</p>
<pre><code>for s in skip:
    for x in iter(full.__next__, s):
        print(x)
for x in full:
    print(x)
</code></pre>
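<p>Solution 2 leans on the two-argument form of <code>iter</code>, which is easy to overlook. A minimal illustration on a shorter range (the names <code>before_three</code> and <code>rest</code> are mine):</p>
<pre><code>full = iter(range(1, 8))

# iter(callable, sentinel) calls callable() repeatedly and stops
# as soon as it returns a value equal to the sentinel.
before_three = list(iter(full.__next__, 3))

# The sentinel was consumed from `full` but never yielded,
# so the remaining items start right after it.
rest = list(full)

print(before_three)  # [1, 2]
print(rest)          # [4, 5, 6, 7]
</code></pre>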
<p>Solution 3:</p>
<pre><code>from functools import partial
until = partial(iter, full.__next__)
for s in skip:
    for x in until(s):
        print(x)
for x in full:
    print(x)
</code></pre>
<p>Solution 4:</p>
<pre><code>from itertools import takewhile
for s in skip:
    for x in takewhile(s.__ne__, full):
        print(x)
for x in full:
    print(x)
</code></pre>
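<p>Solution 4 works because <code>takewhile</code> consumes and discards the first item that fails the predicate. The sketch below shows that on a shorter range, using a lambda in place of <code>s.__ne__</code>. The lambda is my substitution: calling <code>__ne__</code> directly skips Python's reflected comparison, so with mixed types it can return <code>NotImplemented</code> instead of a boolean, as the last line shows.</p>
<pre><code>from itertools import takewhile

full = iter(range(1, 8))
s = 3

# takewhile() yields items while the predicate holds; the first failing
# item (the one equal to s) is consumed from `full` and dropped.
head = list(takewhile(lambda x: x != s, full))
rest = list(full)

print(head)  # [1, 2]
print(rest)  # [4, 5, 6, 7]

# Caveat for the bound-method form: int defers to float for this comparison.
mixed = (3).__ne__(3.0)
print(mixed)  # NotImplemented
</code></pre>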
<p>Output of all solutions:</p>
<pre><code>1
2
4
7
9
12
13
14
15
</code></pre>
<p>Solution 0 applied to your actual problem:</p>
<pre><code>import csv
import itertools
with open('in.txt') as tsvfile:
    tsvreader = csv.reader(tsvfile, delimiter='\t')
    skip = next(tsvreader, [None])[0]
    for i in itertools.product('ACTG', repeat=18):
        oneKmer = ''.join(i)
        if oneKmer == skip:
            skip = next(tsvreader, [None])[0]
        else:
            print(oneKmer)
</code></pre>
<p>A slight variation:</p>
<pre><code>import csv
from itertools import product
from operator import itemgetter
with open('in.txt') as tsvfile:
    tsvreader = csv.reader(tsvfile, delimiter='\t')
    skips = map(itemgetter(0), tsvreader)
    skip = next(skips, None)
    for oneKmer in map(''.join, product('ACTG', repeat=18)):
        if oneKmer == skip:
            skip = next(skips, None)
        else:
            print(oneKmer)
</code></pre>
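<p>Since <code>repeat=18</code> produces far too many k-mers to eyeball, it may be worth trying the same loop on a tiny case first. The sketch below is a scaled-down check, not your real data: the in-memory file contents, the second column and <code>repeat=2</code> are all made up for illustration. It assumes the file is tab-separated with the k-mer in the first column, listed in the same order that <code>product('ACTG', ...)</code> generates them.</p>
<pre><code>import csv
import io
from itertools import product

# Hypothetical stand-in for 'in.txt': k-mer in column 0, tab-separated.
tsvfile = io.StringIO("AC\t7\nTT\t2\nGA\t5\n")
tsvreader = csv.reader(tsvfile, delimiter='\t')
skips = (row[0] for row in tsvreader)
skip = next(skips, None)

kept = []
for kmer in map(''.join, product('ACTG', repeat=2)):
    if kmer == skip:
        skip = next(skips, None)  # this k-mer is in the file, so drop it
    else:
        kept.append(kmer)

print(len(kept))  # 13, i.e. 16 possible 2-mers minus the 3 listed in the file
</code></pre>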