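<p>There are several problems with the code above. I'm going to work from the original post, since that was the later edit (so I assume it's the most current version).</p>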
<pre><code>dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgatga"
startcodon=0
n=0
while(n < 1):
    startcodon=dataset.find("atg", startcodon, len(dataset)-startcodon)
    #locate stop codons
    taacodon=dataset.find("taa", startcodon+3, len(dataset)-startcodon)
    tagcodon=dataset.find("tag", startcodon+3, len(dataset)-startcodon)
    tgacodon=dataset.find("tga", startcodon+3, len(dataset)-startcodon)
</code></pre>
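<p>This doesn't step through the sequence in groups of three. <code>find</code> scans the whole string and reports the position of the first match, wherever it falls. That's why you get the same value no matter what. Note too that the third argument to <code>find</code> is an end index, not a length, so <code>len(dataset)-startcodon</code> cuts the search window short.</p>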
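<p>To make the difference concrete, here is a small sketch on a made-up toy sequence (not your dataset; the variable names are mine). It compares what <code>find</code> reports with what you get when you only look at whole codons counted from the start codon:</p>
<pre><code># Toy sequence, invented for illustration: the first "tga" is out of frame.
seq = "atgctgaaataaccc"
stops = {"taa", "tag", "tga"}

start = seq.find("atg")

# str.find checks every position, so it reports the out-of-frame "tga".
first_anywhere = seq.find("tga", start + 3)

# Stepping in threes from the start codon only looks at whole codons.
codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
first_in_frame = next(
    (start + 3 * k for k, codon in enumerate(codons) if codon in stops),
    -1,  # same "not found" convention as str.find
)

print(first_anywhere)   # 4
print(codons)           # ['atg', 'ctg', 'aaa', 'taa', 'ccc']
print(first_in_frame)   # 9
</code></pre>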
<pre><code>if(taacodon<tagcodon):
if(taacodon<tgacodon):
stopcodon=taacodon
#print("taacodon", startcodon)
else:
stopcodon=tgacodon
#print("tGacodon", startcodon)
elif(tgacodon>tagcodon):
stopcodon=tagcodon
#print("taGcodon", startcodon)
else:
stopcodon=tgacodon
#print("tGacodon", startcodon)
</code></pre>
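<p>I assume this is meant to find the first stop codon. However, <code>find</code> returns -1 when it can't find the string. Since you never check for that, a codon that doesn't exist at all compares as smaller than any real position, so it ends up being picked as the stop codon anyway.</p>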
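<p>One way to guard against the -1 case, as a rough sketch: collect the three <code>find</code> results, throw away the misses, and only then take the minimum. The helper name <code>first_stop</code> is hypothetical, and this keeps the "anywhere in the string" behaviour of <code>find</code>, so it does not yet respect the reading frame.</p>
<pre><code>def first_stop(seq, start):
    """Lowest index of any stop codon after the start codon, or -1 if none."""
    hits = [seq.find(codon, start + 3) for codon in ("taa", "tag", "tga")]
    found = [h for h in hits if h != -1]  # drop the "not found" results
    return min(found) if found else -1


print(first_stop("atgcccaaatagggg", 0))  # 9
print(first_stop("atgcccaaaggg", 0))     # -1
</code></pre>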
<pre><code>n=0;
while(n < len(codon)):
    rcodon.append (codon[n][len(codon[n])::-1])
    #replace a with u
    rcodon[n] = re.sub('a', "u", rcodon[n])
    #replace t with a
    rcodon[n] = re.sub('t', "a", rcodon[n])
    #replace c with x
    rcodon[n] = re.sub('c', "x", rcodon[n])
    #replace g with c
    rcodon[n] = re.sub('g', "c", rcodon[n])
    #replace x with g
    rcodon[n] = re.sub('x', "g", rcodon[n])
    print("DNA sequence: ", codon[n] ,'\n', "RNA sequence:", rcodon[n])
    n=n+1
</code></pre>
<p>With a dict and f-strings you can clean this up considerably. I also don't quite understand why c goes to x and then x goes to g.</p>
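<p>My guess is that the x is a temporary placeholder so that the c-to-g and g-to-c substitutions don't undo each other. If the mapping is applied in a single pass, no placeholder is needed. A minimal sketch with a made-up input (it only does the base substitution, not the string reversal from your loop):</p>
<pre><code># DNA base -> complementary RNA base, applied to every character in one pass
complement = str.maketrans({"a": "u", "t": "a", "c": "g", "g": "c"})

codon = "atgc"  # made-up input for illustration
rna = codon.translate(complement)

# Chained replacements clobber each other, which is what the placeholder avoids
chained = "cg".replace("c", "g").replace("g", "c")

print(f"DNA sequence: {codon}\nRNA sequence: {rna}")
print(chained)  # cc
</code></pre>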
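<p>Finally, your dataset doesn't contain an in-frame stop codon after the first atg, so it can't be transcribed the way you want.</p>
<p>I added a stop codon to the end of the dataset to get the output you're after. With that in place, you could do something like this:</p>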
<pre><code>dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgtaaa"
# DNA base -> complementary RNA base
rdict={'a':'u','t':'a','c':'g','g':'c'}
start_codon=dataset.find("atg")
# walk the reading frame three bases at a time, starting after the start codon
for nucleotides in range(start_codon+3,len(dataset),3):
    if dataset[nucleotides:nucleotides+3] in {'taa','tag','tga'}:
        stop_codon=nucleotides
        break  # stop at the first in-frame stop codon
DNA=[]
RNA=[]
for bases in range(start_codon,stop_codon,1):
    DNA.append(dataset[bases])
    RNA.append(rdict[dataset[bases]])
print(f"DNA Sequence: {''.join(DNA)}\nRNA Sequence: {''.join(RNA)}")
while True:
    answer=input('\nplease input sequence you would like to see or exit to quit: ')
    if answer == 'exit':
        break
    try:
        print(f'DNA Sequence: {DNA[int(answer)]}\nRNA Sequence: {RNA[int(answer)]}')
    except (ValueError, IndexError):
        # non-numeric input or a position outside the sequence
        print('Entry invalid, please input number')
</code></pre>
<p>(You could actually simplify this and make it a lot shorter with list comprehensions, but I wrote the loops out, two of them, so you can get the general idea.)</p>
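<p>For reference, a shorter version along those lines might look roughly like this. It is only a sketch: the sequence is a short made-up one rather than your dataset, and falling back to the end of the string when no in-frame stop codon exists is my own assumption.</p>
<pre><code>dataset = "ccatgtctacttaaag"  # short made-up sequence, not the question's data
rdict = {"a": "u", "t": "a", "c": "g", "g": "c"}

start_codon = dataset.find("atg")
# First in-frame stop codon, or None if the frame never reaches one
# (a None stop makes the slice below run to the end of the string).
stop_codon = next(
    (i for i in range(start_codon + 3, len(dataset) - 2, 3)
     if dataset[i:i + 3] in {"taa", "tag", "tga"}),
    None,
)

DNA = list(dataset[start_codon:stop_codon])
RNA = [rdict[base] for base in DNA]
print(f"DNA Sequence: {''.join(DNA)}\nRNA Sequence: {''.join(RNA)}")
</code></pre>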