Retrieving a DNA sequence using a protein's gene identifier

2 answers

User

Reply #1 · edited 2024-09-21 05:54:09

You can try accessing the annotations of the SeqRecord:

from Bio import SeqIO
# handle: an open handle to the protein record in GenBank format
seq_record = SeqIO.read(handle, "gb")
nucleotide_accession = seq_record.annotations["db_source"]

In your example, nucleotide_accession is "REFSEQ: accession NM_000673.4".

Now see whether you can parse these annotations. I have only tried it on this one test case.
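One way to parse the annotation is a small regular expression. This is only a sketch: the helper name `extract_accession` and the pattern are assumptions, written against the single `db_source` string quoted above, so other records may need a different pattern.

```python
import re

# Hypothetical helper (not part of Biopython): pull the accession.version
# out of a db_source annotation such as "REFSEQ: accession NM_000673.4".
ACCESSION_PATTERN = re.compile(r"accession\s+([A-Z]{1,4}_?\d+(?:\.\d+)?)")


def extract_accession(db_source):
    """Return the accession found in db_source, or None if there is none."""
    match = ACCESSION_PATTERN.search(db_source)
    return match.group(1) if match else None


print(extract_accession("REFSEQ: accession NM_000673.4"))
```

The returned accession can then be passed as the `id` argument of `Entrez.efetch` with `db='nucleotide'` to download the DNA sequence.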

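User

Reply #2 · edited 2024-09-21 05:54:09

You can take advantage of ELink to request the UID of the protein sequence that corresponds to the UID of the nucleotide sequence: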

from Bio import Entrez
from Bio import SeqIO
Entrez.email = 'seb@free.fr'  # contact address sent with every request
term = 'NM_207618.2'  # for example, accession.version

### first step, we search for the nucleotide sequence of interest
h_search = Entrez.esearch(db='nucleotide', term=term)
record = Entrez.read(h_search)
h_search.close()
nt_uid = record['IdList'][0]  # here is the UID

### second step, we fetch that nt sequence by its UID
handle_nt = Entrez.efetch(
        db='nucleotide', id=nt_uid, rettype='fasta', retmode='text')
nt_record = SeqIO.read(handle_nt, 'fasta')
handle_nt.close()

### third and most important, we 'link' the UID of the nucleotide
# sequence to the corresponding protein from the appropriate database
h_link = Entrez.elink(
        dbfrom='nucleotide', linkname='nucleotide_protein', id=nt_uid)
results = Entrez.read(h_link)
h_link.close()

### last, we fetch the amino acid sequence
aa_uid = results[0]['LinkSetDb'][0]['Link'][0]['Id']  # here is the key...
handle_aa = Entrez.efetch(
        db='protein', id=aa_uid, rettype='fasta', retmode='text')
aa_record = SeqIO.read(handle_aa, 'fasta')
handle_aa.close()
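The chained lookup `results[0]['LinkSetDb'][0]['Link'][0]['Id']` raises an error when ELink returns no links. A more defensive way to walk the parsed result might look like the sketch below. The helper name `linked_ids` is hypothetical, and the sample data is a hand-written stand-in that only mirrors the keys used in the answer above, not real NCBI output.

```python
# Hypothetical helper: collect every linked UID for a given link name from a
# parsed ELink result, returning an empty list instead of raising when the
# record has no links.
def linked_ids(link_results, link_name):
    ids = []
    for linkset in link_results:
        for linkset_db in linkset.get("LinkSetDb", []):
            if linkset_db.get("LinkName") == link_name:
                ids.extend(link["Id"] for link in linkset_db.get("Link", []))
    return ids


# Hand-written sample with made-up UIDs, shaped like the structure indexed above.
sample = [{"LinkSetDb": [{"LinkName": "nucleotide_protein",
                          "Link": [{"Id": "111"}, {"Id": "222"}]}]}]
print(linked_ids(sample, "nucleotide_protein"))
```

With a real result, `linked_ids(results, 'nucleotide_protein')` would give the list of protein UIDs to pass to `Entrez.efetch`, and an empty list signals that the nucleotide record has no linked protein.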
