When I try to run this code it never finishes. I think it is getting stuck somewhere, but I'm not sure where, because I'm new to Python.
import re
codon = []
rcodon = []
dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgatga"
startcodon=0
n=0
print ("DNA sequence: ", dataset)
def find_codon(codon, string, start):
    i = start + 3
    while i < len(string):
        i = string.find(codon, i) # find the next substring
        if (i - start) % 3 == 0: # check that it's a multiple of 3 after start
            return i
    return None
while(n < 1):
    startcodon=dataset.find("atg", startcodon)
    #locate stop codons
    taacodon=find_codon("taa", dataset, startcodon)
    tagcodon=find_codon("tag", dataset, startcodon)
    tgacodon=find_codon("tga", dataset, startcodon)
    stopcodon = min(taacodon, tagcodon, tgacodon)
    codon.append(dataset[startcodon:stopcodon+3])
    if(startcodon > len(dataset) or startcodon < 0):
        n = 2;
    startcodon=stopcodon
#reverse the string and swap the letters
n=0;
while(n < len(codon)):
    rcodon.append (codon[n][len(codon[n])::-1])
    #replace a with u
    rcodon[n] = re.sub('a', "u", rcodon[n])
    #replace t with a
    rcodon[n] = re.sub('t', "a", rcodon[n])
    #replace c with x
    rcodon[n] = re.sub('c', "x", rcodon[n])
    #replace g with c
    rcodon[n] = re.sub('g', "c", rcodon[n])
    #replace x with g
    rcodon[n] = re.sub('x', "g", rcodon[n])
    print("DNA sequence: ", codon[n] ,'\n', "RNA sequence:", rcodon[n])
    n=n+1
answer = 0
print("Total Sequences: ", len(codon)-3)
while (int(answer) >=0):
    #str = "Please enter an integer from 0 to " + str(len(dataset)) + " or -1 to quit: "
    answer = int(input("Please enter a sequence you would like to see or -1 to quit: "))
    if(int(answer) >= 0):
        print("DNA sequence: ", codon[int(answer)] ,'\n', "RNA sequence:", rcodon[int(answer)])
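Any suggestions would be helpful.
This is a project about transcribing DNA without a biological cell. The goal is to create a program that locates "atg" in a DNA sequence and then finds a stop sequence (tga, taa, or tag), counting in groups of three from the initial atg.
Edit: I want the program to give me the sequence between the atg and the stop codon, just as my original code does. However, my original code does not account for moving in steps of 3 from the ATG to find the correct stop sequence.
My original code: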
import re
codon = []
rcodon = []
dataset = "ggtcagaaaaagccctctccatgtctactcacgatacatccctgaaaaccactgaggaagtggcttttcagatcatcttgctttgccagtttggggttgggacttttgccaatgtatttctctttgtctataatttctctccaatctcgactggttctaaacagaggcccagacaagtgattttaagacacatggctgtggccaatgccttaactctcttcctcactatatttccaaacaacatgatga"
startcodon=0
n=0
while(n < 1):
    startcodon=dataset.find("atg", startcodon, len(dataset)-startcodon)
    #locate stop codons
    taacodon=dataset.find("taa", startcodon+3, len(dataset)-startcodon)
    tagcodon=dataset.find("tag", startcodon+3, len(dataset)-startcodon)
    tgacodon=dataset.find("tga", startcodon+3, len(dataset)-startcodon)
    if(taacodon<tagcodon):
        if(taacodon<tgacodon):
            stopcodon=taacodon
            #print("taacodon", startcodon)
        else:
            stopcodon=tgacodon
            #print("tGacodon", startcodon)
    elif(tgacodon>tagcodon):
        stopcodon=tagcodon
        #print("taGcodon", startcodon)
    else:
        stopcodon=tgacodon
        #print("tGacodon", startcodon)
    #to add sequences to an array
    codon.append(dataset[startcodon:stopcodon+3])
    if(startcodon > len(dataset) or startcodon < 0):
        n = 2;
    startcodon=stopcodon
#reverse the string and swap the letters
n=0;
while(n < len(codon)):
    rcodon.append (codon[n][len(codon[n])::-1])
    #replace a with u
    rcodon[n] = re.sub('a', "u", rcodon[n])
    #replace t with a
    rcodon[n] = re.sub('t', "a", rcodon[n])
    #replace c with x
    rcodon[n] = re.sub('c', "x", rcodon[n])
    #replace g with c
    rcodon[n] = re.sub('g', "c", rcodon[n])
    #replace x with g
    rcodon[n] = re.sub('x', "g", rcodon[n])
    print("DNA sequence: ", codon[n] ,'\n', "RNA sequence:", rcodon[n])
    n=n+1
answer = 0
print("Total Sequences: ", len(codon)-3)
while (int(answer) >= 0):
    #str = "Please enter an integer from 0 to " + str(len(dataset)) + " or -1 to quit: "
    answer = int(input("Please enter an sequence you would like to see or -1 to quit: "))
    if(int(answer) >= 0):
        print("DNA sequence: ", codon[int(answer)] ,'\n', "RNA sequence:", rcodon[int(answer)])
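There are several problems with the code above. I'll work from the original code, since that was part of the later edit (so I assume it is the most recent).
This does not jump in groups of three. It walks through the string and reports a position. That is why you get the same value no matter what.
I assume this is meant to find the first stop codon. However, if find cannot find the string, it returns -1 (and since you have no check for that, -1 will always end up as the stop codon even when no stop codon exists).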
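To illustrate the -1 point, here is a minimal sketch. The helper name first_stop and the short test strings are made up for this example, and the helper deliberately ignores the reading frame; it only shows how to discard the "not found" markers before taking the minimum.

```python
STOP_CODONS = ("taa", "tag", "tga")

def first_stop(dna, start):
    """Return the lowest index of any stop codon at or after start, or None.

    Hypothetical helper: it does not check the reading frame.
    """
    hits = [dna.find(stop, start) for stop in STOP_CODONS]
    found = [pos for pos in hits if pos != -1]  # drop the "not found" markers
    return min(found) if found else None

print("atgcccaaa".find("tag"))      # -1: str.find reports a miss as -1
print(first_stop("atgcctgaaa", 3))  # 5: "tga" starts at index 5
print(first_stop("atgcccaaa", 3))   # None: no stop codon at all
```

Without the filtering step, a plain comparison would pick -1 as the smallest value whenever any one of the three codons is missing.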
Using dicts and f-strings, you can clean things up much more efficiently. I also don't quite understand why c goes to x and then x goes to g.
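As a rough sketch of the dict and f-string suggestion (the names COMPLEMENT and transcribe are assumptions made for this example, not taken from the answer): looking every base up in one pass means no base is substituted twice, so a temporary placeholder such as x is not needed.

```python
# Each base maps straight to the letter the question's substitutions produce.
COMPLEMENT = {"a": "u", "t": "a", "c": "g", "g": "c"}

def transcribe(dna):
    """Reverse the DNA string and swap each base in a single pass."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))

dna = "atgc"
print(f"DNA sequence: {dna}\nRNA sequence: {transcribe(dna)}")
```

For the input "atgc" this gives "gcau", the same result as the chain of re.sub calls in the question.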
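Finally, your dataset does not contain a stop codon in frame with the first atg, so it cannot be transcribed the way you want.
I added a stop codon to the end of the dataset to get the output you were hoping for, which you can do like this:
(In fact, you could simplify this and make it shorter with a list comprehension, but I have written out the loops, two of them, so that you get the general idea.)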
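A minimal sketch of that idea, written with explicit loops; it is not the answerer's code. The function name find_orfs and the test strings are hypothetical, and resuming the search right after each stop codon is an assumption about the intended behaviour.

```python
STOP_CODONS = {"taa", "tag", "tga"}

def find_orfs(dna):
    """Collect each atg...stop stretch, reading in steps of three from the atg."""
    orfs = []
    pos = dna.find("atg")
    while pos != -1:
        end = None
        # Look only at codons in frame with this atg: pos+3, pos+6, ...
        for i in range(pos + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                end = i + 3
                break
        if end is None:
            # No in-frame stop for this atg, so try the next atg instead.
            pos = dna.find("atg", pos + 1)
        else:
            orfs.append(dna[pos:end])
            pos = dna.find("atg", end)  # assumption: resume after the stop codon
    return orfs

print(find_orfs("ccatgaaataggg"))  # ['atgaaatag']
print(find_orfs("atgataaccc"))     # []: the only "taa" is out of frame
```

Because every call to find starts strictly after the previous atg or stop codon, the outer loop always advances and cannot spin forever.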
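Regarding the endless loop, the problem you are facing comes from your function. Note that once you find a candidate i and it is not a multiple of 3 away from the start, you should add 3 to it, otherwise i = string.find(codon, i) will return the same i value. The correction should be:

Then a problem arises when using min with None values, with the following error:

You should set the return value to some large number that indicates nothing was found, instead of None.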
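A minimal sketch of both corrections, under two assumptions that are not in the answer: len(string) is used as the "large number" sentinel, and an explicit check for -1 is added so that a failed find also ends the loop.

```python
def find_codon(codon, string, start):
    """Return the index of the next in-frame codon after start.

    Returns len(string) when nothing is found (an assumed sentinel), so the
    result can be passed to min() together with other positions.
    """
    not_found = len(string)
    i = start + 3
    while i < len(string):
        i = string.find(codon, i)    # find the next occurrence
        if i == -1:                  # added check: no further occurrence
            return not_found
        if (i - start) % 3 == 0:     # in frame with the start codon
            return i
        i += 3                       # move on, or find would return the same i
    return not_found

seq = "atgataacctaa"  # made-up test string: "taa" at 4 (out of frame) and 9 (in frame)
stops = [find_codon(c, seq, 0) for c in ("taa", "tag", "tga")]
print(stops, min(stops))  # [9, 12, 12] 9
```

In Python 3, min raises a TypeError when an int is compared with None, which is why an integer sentinel is easier to work with here. A caller can test whether the result equals len(string) to detect that no stop codon was found.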