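<p>Assuming the idea is to split at every occurrence of each enzyme, where an enzyme is a recognition site of several letters plus an index saying between which two letters the cut falls, you don't need a regular expression.</p>
<p>You can do this by finding each match and inserting a split marker at the correct index, then post-processing the result to perform the actual split.</p>
<p>For example:</p>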
<pre><code>def digestfragmentwithenzyme(seqs, enzymes):
    # Preprocess the enzymes once, then apply them to each sequence.
    # Each enzyme is [site, index]; the cut falls just before site[index].
    replacements = []
    for site, index in enzymes:
        replacements.append((site, site[:index] + '|' + site[index:]))
    result = []
    for seq in seqs:
        for site, marked in replacements:
            seq = seq.replace(site, marked)  # So AATTC becomes AATT|C
        result.append(seq.split('|'))  # So AATT|C becomes AATT, C
    return result


def test():
    seqs = ["AATTCCGGTCGGGGCTCGGGGG", "AAAGCAAAATCAAAAAAGCAAAAAATC"]
    enzymes = [["TC", 1], ["GC", 1]]
    print(digestfragmentwithenzyme(seqs, enzymes))
</code></pre>
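<p>If you would rather not rely on a marker character such as <code>|</code>, the same idea can be sketched by collecting the cut indices first and slicing afterwards. This is only an illustrative variant, not part of the code above: the names <code>cut_positions</code> and <code>digest</code> are hypothetical, and it assumes each enzyme is a <code>(site, index)</code> pair as in the example.</p>

```python
def cut_positions(seq, site, index):
    """Return every position in seq where a cut falls for one site (hypothetical helper)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + index)
        # Advance by one so overlapping occurrences are found as well.
        start = seq.find(site, start + 1)
    return positions


def digest(seq, enzymes):
    """Split seq at the cut positions of all (site, index) pairs."""
    cuts = set()
    for site, index in enzymes:
        cuts.update(cut_positions(seq, site, index))
    # Ignore cuts at the very ends, which would only produce empty fragments.
    bounds = [0] + sorted(c for c in cuts if 0 < c < len(seq)) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    print(digest("AATTCCGGTCGGGGCTCGGGGG", [("TC", 1), ("GC", 1)]))
    # ['AATT', 'CCGGT', 'CGGGG', 'CT', 'CGGGGG']
```

<p>On the two sample sequences this gives the same fragments as the replace-based version. The results can differ in edge cases: this sketch also cuts at overlapping sites, and it never returns empty fragments.</p>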