Bioinformatics: finding the genes in a given genome string

Enter a genome string: TTATGTTTTAAGGATGGGGCGTTAGTT
Traceback (most recent call last):
  File "D:\Python\Chapter 8\Bioinformatics.py", line 40, in <module>
    main()
  File "D:\Python\Chapter 8\Bioinformatics.py", line 38, in main
    print(findGene(geneinput))
  File "D:\Python\Chapter 8\Bioinformatics.py", line 25, in findGene
    final += (chr[i+i + 3] + "\n")
IndexError: string index out of range
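A note on the traceback: the failing expression `chr[i+i + 3]` reads a single character at position `2*i + 3`, which falls past the end of the 27-character input as soon as `i` reaches 12. The full source of `findGene` is not shown, so the following is only a guess at the cause. If a three-letter codon starting at `i` was intended, a slice such as `chr[i:i + 3]` would be the usual form. The variable names `genome`, `codon` and `tail` below are illustrative, not taken from the original program.

```python
genome = "TTATGTTTTAAGGATGGGGCGTTAGTT"  # input from the question, 27 characters

i = 12
try:
    genome[i + i + 3]        # single index at position 27, one past the last valid index (26)
    index_failed = False
except IndexError:
    index_failed = True

codon = genome[i:i + 3]      # slice: the three characters starting at position i
tail = genome[26:29]         # a slice that runs past the end is truncated, not an error

print(index_failed, codon, tail)
```

Slicing never raises IndexError, so a loop that reads codons with slices only has to check the length of what it got back.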

1 answer

User

#1 · Posted 2024-06-02 11:51:09

This can be done with a regular expression:

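import re

# ATG (start codon), then one or more codons captured lazily,
# then the first in-frame stop codon (TAG, TAA or TGA).
pattern = re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')

# findall returns only the captured group, i.e. the gene without start and stop codons.
print(pattern.findall('TTATGTTTTAAGGATGGGGCGTTAGTT'))
print(pattern.findall('TGTGTGTATAT'))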

Output

['TTT', 'GGGCGT']
[]
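To see where each gene sits in the input, the same pattern can be used with `finditer`, which yields match objects instead of plain strings. This is a small sketch on top of the code above; the names `genome` and `hits` are just for illustration.

```python
import re

pattern = re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')
genome = 'TTATGTTTTAAGGATGGGGCGTTAGTT'

# For every match, record where capture group 1 (the gene body) starts and ends.
hits = [(m.start(1), m.end(1), m.group(1)) for m in pattern.finditer(genome)]

for start, end, gene in hits:
    print(start, end, gene)
```

Each tuple holds the start index, the end index (exclusive) and the gene itself, so `genome[start:end]` gives the gene back.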

Explanation taken from https://regex101.com/r/yI4tN9/3

"ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)"g
    ATG matches the characters ATG literally (case sensitive)
    1st Capturing group ((?:[ACTG]{3})+?)
        (?:[ACTG]{3})+? Non-capturing group
            Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
            [ACTG]{3} match a single character present in the list below
                Quantifier: {3} Exactly 3 times
                ACTG a single character in the list ACTG literally (case sensitive)
    (?:TAG|TAA|TGA) Non-capturing group
        1st Alternative: TAG
            TAG matches the characters TAG literally (case sensitive)
        2nd Alternative: TAA
            TAA matches the characters TAA literally (case sensitive)
        3rd Alternative: TGA
            TGA matches the characters TGA literally (case sensitive)
    g modifier: global. All matches (don't return on first match)
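The lazy quantifier `+?` is what makes the match stop at the first in-frame stop codon. A quick way to check this is to compare it with a greedy `+` on a string that contains two stop codons in the same reading frame. The string `sample` below is made up for this comparison and is not from the question.

```python
import re

lazy = re.compile(r'ATG((?:[ACTG]{3})+?)(?:TAG|TAA|TGA)')
greedy = re.compile(r'ATG((?:[ACTG]{3})+)(?:TAG|TAA|TGA)')

# Made-up input: ATG AAA TAG CCC TAA, with two in-frame stop codons (TAG and TAA).
sample = 'ATGAAATAGCCCTAA'

lazy_genes = lazy.findall(sample)      # stops at the first stop codon, TAG
greedy_genes = greedy.findall(sample)  # runs on to the last stop codon, TAA

print(lazy_genes)
print(greedy_genes)
```

With the lazy version the captured gene is `AAA`; with the greedy version the first stop codon is swallowed into the capture, giving `AAATAGCCC`.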
