gff3 Python package - PyPI

Manipulate genomic features and validate the syntax and reference sequence of GFF3 files.

gff3 Python project description


Free software: NAL Public Domain License
Documentation: https://gff3-py.readthedocs.org.

Features

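Simple data structures: parses a GFF3 file into a structure composed of simple Python dict and list objects.
Validation: validates the GFF3 syntax on parse and saves the error messages in the parsed structure.
Best effort parsing: despite any detected errors, parsing continues through the whole file and makes as much sense of it as possible.
Uses the Python logging library to log error messages, with support for custom loggers.
Parses embedded or external FASTA sequences to check bounds and the number of Ns.
Checks and corrects the phase of CDS features.
The tree traversal methods ancestors and descendants return a simple list in breadth-first search order.
Transfer children and parents using the adopt and adopted methods.
Test for overlapping features using the overlap method.
Remove a feature and its associated features using the remove method.
Write the modified structure to a GFF3 file using the write method.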
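
To make the first three features concrete, here is a minimal sketch of how one GFF3 feature line could be turned into a plain dict while errors are collected instead of raised. It is not the package's implementation: the function name parse_feature_line and the keys line_raw and line_errors are assumptions made for illustration, and only a few of the checks a real validator would perform are shown.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ["seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes"]


def parse_feature_line(raw):
    """Parse one tab-separated GFF3 feature line into a plain dict (illustrative only)."""
    # 'line_raw' and 'line_errors' are hypothetical key names for this sketch.
    line = {"line_type": "feature", "line_raw": raw, "line_errors": []}
    fields = raw.rstrip("\n").split("\t")
    if len(fields) != 9:
        # Best effort: record the problem and return what we have.
        line["line_errors"].append("expected 9 columns, found %d" % len(fields))
        return line
    line.update(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based integers with start <= end.
    for key in ("start", "end"):
        try:
            line[key] = int(line[key])
        except ValueError:
            line["line_errors"].append("%s is not an integer: %r" % (key, line[key]))
    if (isinstance(line["start"], int) and isinstance(line["end"], int)
            and line["start"] > line["end"]):
        line["line_errors"].append("start is greater than end")
    # Column 9 holds ';'-separated tag=value pairs; multiple values are ','-separated.
    attributes = {}
    if line["attributes"] != ".":
        for pair in line["attributes"].split(";"):
            if not pair:
                continue
            tag, sep, value = pair.partition("=")
            if not sep:
                line["line_errors"].append("attribute has no '=': %r" % pair)
                continue
            attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    line["attributes"] = attributes
    return line


good = parse_feature_line("chr1\tdemo\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=mRNA1,mRNA2")
bad = parse_feature_line("chr1\tdemo\texon\t300\t200\t.\t+\t.\tID")
```

In this sketch the malformed line still yields a dict, with both problems (start greater than end, attribute without '=') recorded under line_errors, which is the general idea behind parsing on a best effort basis.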

Quick start

An example that simply parses a GFF3 file named annotations.gff and validates it using an external FASTA file named annotations.fa looks like:

    # validate.py
    # ============
    from gff3 import Gff3

    # initialize a Gff3 object
    gff = Gff3()
    # parse GFF3 file and do syntax checking, this populates gff.lines and gff.features
    # if an embedded ##FASTA directive is found, parse the sequences into gff.fasta_embedded
    gff.parse('annotations.gff')
    # parse the external FASTA file into gff.fasta_external
    gff.parse_fasta_external('annotations.fa')
    # Check seqid, bounds and the number of Ns in each feature using one or more reference sources
    gff.check_reference(allowed_num_of_n=0, feature_types=['CDS'])
    # Checks whether child features are within the coordinate boundaries of parent features
    gff.check_parent_boundary()
    # Calculates the correct phase and checks if it matches the given phase for CDS features
    gff.check_phase()
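
The phase check in the last step can be pictured with a small standalone calculation. The sketch below is not the package's code; it follows the GFF3 definition of phase (the number of bases to skip at the start of a CDS segment to reach the next codon boundary) and assumes the coding sequence begins at the first base of the first segment in translation order. The function name expected_phases is hypothetical.

```python
def expected_phases(cds_segments, strand):
    """Return the expected phase for each (start, end) CDS segment of one transcript.

    Assumes the first segment in translation order starts on a codon
    boundary, so it gets phase 0.
    """
    # Translation order is ascending coordinates on '+', descending on '-'.
    ordered = sorted(cds_segments, key=lambda seg: seg[0], reverse=(strand == "-"))
    phases = {}
    phase = 0
    for start, end in ordered:
        phases[(start, end)] = phase
        length = end - start + 1  # GFF3 coordinates are 1-based and inclusive
        # Bases to skip at the start of the next segment to reach a codon boundary.
        phase = (3 - (length - phase) % 3) % 3
    return phases


segments = [(1, 10), (21, 30), (41, 52)]
plus = expected_phases(segments, "+")
minus = expected_phases(segments, "-")
```

On the plus strand the first segment has 10 bases, which is three codons plus one leftover base, so the next segment must skip 2 bases and gets phase 2. A checker can compare these expected values with column 8 of each CDS line and report any mismatch.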

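A more feature-complete GFF3 validator, which has a command line interface and can also generate a validation report in Markdown, is available under examples/gff_valid.py.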

The following example demonstrates how to filter, transform, and modify the parsed GFF3 lines list.

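Change features of type exon to pseudogenic_exon and features of type transcript to pseudogenic_transcript if the feature has an ancestor of type pseudogene.
If a pseudogene feature overlaps with a gene feature, move all of the children from the pseudogene feature to the gene feature, and remove the pseudogene feature.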

    # fix_pseudogene.py
    # =================
    from gff3 import Gff3

    gff = Gff3('annotations.gff')
    type_map = {'exon': 'pseudogenic_exon', 'transcript': 'pseudogenic_transcript'}
    pseudogenes = [line for line in gff.lines if line['line_type'] == 'feature' and line['type'] == 'pseudogene']
    for pseudogene in pseudogenes:
        # convert types
        for line in gff.descendants(pseudogene):
            if line['type'] in type_map:
                line['type'] = type_map[line['type']]
        # find overlapping gene
        overlapping_genes = [line for line in gff.lines if line['line_type'] == 'feature' and line['type'] == 'gene' and gff.overlap(line, pseudogene)]
        if overlapping_genes:
            # move pseudogene children to overlapping gene
            gff.adopt(pseudogene, overlapping_genes[0])
            # remove pseudogene
            gff.remove(pseudogene)
    gff.write('annotations_fixed.gff')
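
The script above relies on two ideas: an interval overlap test and a breadth-first walk over child features. The standalone sketch below shows one way these could work on plain dicts. It is an illustration under assumptions, not the package's code: the real overlap method may apply different criteria (for example regarding strand), and the children mapping used here is a hypothetical stand-in for the parsed structure.

```python
from collections import deque


def overlap(a, b):
    """True if two feature dicts share a seqid and at least one base."""
    return (a["seqid"] == b["seqid"]
            and a["start"] <= b["end"]
            and b["start"] <= a["end"])


def descendants(root_id, children):
    """Return all descendant IDs of root_id in breadth-first order.

    children maps a feature ID to the list of its direct child IDs.
    A feature with several parents is reported only once.
    """
    found, seen = [], {root_id}
    queue = deque(children.get(root_id, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        found.append(node)
        queue.extend(children.get(node, []))
    return found


children = {
    "gene1": ["mRNA1", "mRNA2"],
    "mRNA1": ["exon1", "exon2"],
    "mRNA2": ["exon2", "exon3"],
}
order = descendants("gene1", children)
touching = overlap({"seqid": "chr1", "start": 1, "end": 10},
                   {"seqid": "chr1", "start": 10, "end": 20})
```

Because GFF3 coordinates are inclusive, two features that share only base 10 count as overlapping in this sketch, and the shared exon2 appears once in the traversal even though it has two parents.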

History

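1.0.0 (2018-12-01)

Fixed Python 3 issues.
Added sequence functions: complement(seq) and translate(seq).
Added a FASTA write function: fasta_dict_to_file(fasta_dict, fasta_file, line_char_limit=None).
Added a Gff method that returns the sequence of line_data: sequence(self, line_data, child_type=None, reference=None).
write no longer prints redundant '' when a whole gene is marked as removed.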

0.3.0 (2015-03-10)

Fixed phase checking.

0.2.0 (2015-01-28)

Supports Python 2.6, 2.7, 3.3, 3.4, and PyPy.
Empty attributes are no longer reported as errors.
Improved documentation.

0.1.0 (2014-12-11)

First release on PyPI.

