Python pygff package - PyPI

Utilities for analyzing gff3 files

pygff project description

Installation:

There are several ways to install pygff:

# Recommended: from conda-forge
conda install -c conda-forge pygff

# From PyPI
pip install pygff

# From GitHub
git clone https://github.com/betteridiot/pygff.git
cd pygff
python3 ./setup.py install

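Optionally, you may want to test the program. A very small test suite is provided. Note: pytest is required to run the tests.

To test the program/build in your current environment: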

# If installed from conda-forge or the PyPI
pytest --pyargs pygff

# Or, if installed from source
python ./setup.py test

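"As-is" warranty:

This program was written explicitly for educational purposes within the University of Michigan Department of Computational Medicine and Bioinformatics. It is intended only for a specific subset of gff/gtf files: gff3 files. No promise is made to extend it to broader functionality.

Background:

The General Feature Format (GFF) was designed to represent genomic features (such as exons, introns, genes, etc.) concisely. GFF files are tab-delimited, 9-column plain-text (or gzip-compressed) files. The 9 columns are described as follows: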

Column	Content	Description
1	seqid	The ID of the landmark used to establish the coordinate system for the current feature
2	source	The source is a free text qualifier intended to describe the algorithm or operating procedure that generated this feature
3	type	The type of the feature (previously called the "method")
4	start	The start coordinate of the feature, given in positive 1-based integer coordinates, relative to the landmark given in column one
5	end	The end coordinate of the feature, given in positive 1-based integer coordinates, relative to the landmark given in column one
6	score	The score of the feature, a floating point number
7	strand	The strand of the feature. '+' for positive strand (relative to the landmark), '-' for minus strand, and '.' for features that are not stranded
8	phase	For features of type "CDS", the phase indicates where the feature begins with reference to the reading frame
9	attributes	A list of feature attributes in the format tag=value

Note: this information is taken in part from The Sequence Ontology group's GitHub.
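
To make the column layout concrete, here is a minimal sketch that splits one feature line into the nine columns. It is not pygff's code: `GffRecord` and `parse_gff3_line` are hypothetical names, and the example line is made up. It relies on the GFF3 convention that an undefined field is written as `.`.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):  # hypothetical name, not part of pygff
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    phase: Optional[int]
    attributes: str


def parse_gff3_line(line: str) -> GffRecord:
    """Split one tab-delimited GFF3 feature line into its nine columns."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, attributes = fields
    return GffRecord(
        seqid,
        source,
        ftype,
        int(start),  # 1-based, inclusive
        int(end),    # 1-based, inclusive
        None if score == "." else float(score),
        strand,
        None if phase == "." else int(phase),
        attributes,  # left as raw text in this sketch
    )


# Made-up example line for illustration only
example = "chr1\thavana\texon\t1300\t1500\t.\t+\t.\tID=exon00001;Parent=mRNA00001"
record = parse_gff3_line(example)
print(record.type, record.start, record.end)
```

The attributes column is kept as a raw string here so the sketch stays focused on the column split.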

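Description:

pygff was written to provide a useful interface for working with gff3 files. It lets the user iterate over a gff3 file in a lazily generated manner.

In addition, a feature was written to generate a gff3 index on the fly. This index allows pseudo-random access. Because of how it is implemented, running this program directly from the command line is not recommended, since the index is inherently ephemeral: it exists only for the lifetime of the process.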
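
The idea of an ephemeral, in-memory index can be illustrated with a small sketch. This is an assumption about the general technique, not a description of how pygff builds its index: `build_offset_index` and `read_region` are hypothetical helpers that record the byte offset of each feature line of an uncompressed file and later seek straight to the lines that overlap a region.

```python
import os
import tempfile
from collections import defaultdict


def build_offset_index(path):
    """Map each seqid to a list of (start, end, byte_offset) tuples."""
    index = defaultdict(list)
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b"#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            cols = line.split(b"\t")
            index[cols[0].decode()].append((int(cols[3]), int(cols[4]), offset))
    return index


def read_region(path, index, seqid, start, end):
    """Yield raw lines of features overlapping [start, end] on seqid."""
    with open(path, "rb") as handle:
        for f_start, f_end, offset in index.get(seqid, []):
            if f_start <= end and f_end >= start:
                handle.seek(offset)
                yield handle.readline().decode().rstrip("\r\n")


# Made-up demo data written to a temporary file
with tempfile.NamedTemporaryFile("wb", suffix=".gff", delete=False) as tmp:
    tmp.write(b"##gff-version 3\n")
    tmp.write(b"chr1\t.\tgene\t100\t500\t.\t+\t.\tID=gene1\n")
    tmp.write(b"chr1\t.\texon\t600\t700\t.\t+\t.\tID=exon1\n")
    tmp.write(b"chr2\t.\tgene\t100\t500\t.\t-\t.\tID=gene2\n")
    demo_path = tmp.name

demo_index = build_offset_index(demo_path)
demo_hits = list(read_region(demo_path, demo_index, "chr1", 450, 650))
os.remove(demo_path)
print(len(demo_hits))
```

Because the index lives only in a Python dictionary, it disappears when the process exits, which is why a one-shot command-line invocation would pay the indexing cost every time.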

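pygff exposes two classes for working with gff3 data:

`pygff.GffFile`:

The main class of the pygff package

Handles opening, iterating over, and closing gff3 files. Works with both compressed and uncompressed gff3 files.

When iterated, it lazily yields pygff.GffEntry objects. These objects can be compared with one another, and all of their fields can be accessed programmatically.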
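
As an illustration of lazy iteration over compressed or uncompressed input, the sketch below detects gzip by its two magic bytes and yields feature lines one at a time. `open_gff` and `iter_feature_lines` are hypothetical helpers, not pygff functions, and pygff may detect compression differently.

```python
import gzip
import os
import tempfile


def open_gff(path):
    """Open a plain or gzip-compressed file in text mode, chosen by magic bytes."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def iter_feature_lines(path):
    """Lazily yield feature lines, skipping comments and blank lines."""
    with open_gff(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            yield line.rstrip("\r\n")


# Made-up demo: write a tiny gzip-compressed file, then read it back lazily
with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, "demo.gff.gz")
    with gzip.open(gz_path, "wt") as out:
        out.write("##gff-version 3\n")
        out.write("chr1\t.\tgene\t1\t100\t.\t+\t.\tID=gene1\n")
    lines = list(iter_feature_lines(gz_path))

print(len(lines))
```

Since `iter_feature_lines` is a generator, only one line is held in memory at a time, which matters for genome-scale annotation files.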

Args:

filename (str): /path/to/file.gff[.gz]

{< CD8> }（^ {CD9}}）: for indexing purposes, the period that determines the threshold (default: 3)

Raises:

TypeError if the gff file is not version 3
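
A version check of this kind can be sketched as follows. GFF3 files begin with a `##gff-version 3` directive (optionally with minor versions such as `3.1.26`), so the sketch inspects the first line only. `check_gff_version` is a hypothetical helper; how pygff actually performs the check is not stated here.

```python
import io


def check_gff_version(handle):
    """Raise TypeError unless the first line declares GFF version 3."""
    first = handle.readline().strip()
    parts = first.split()
    is_v3 = (
        len(parts) >= 2
        and parts[0] == "##gff-version"
        and parts[1].split(".")[0] == "3"
    )
    if not is_v3:
        raise TypeError(f"not a GFF3 file: first line is {first!r}")
    return parts[1]


# In-memory examples standing in for real files
version = check_gff_version(io.StringIO("##gff-version 3.1.26\nchr1\t.\tgene\n"))
print(version)

try:
    check_gff_version(io.StringIO("##gff-version 2\n"))
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```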

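It also exposes the pygff.GffFile.fetch(seqid, start, stop) method:

A generator that fetches all gff entries within a given region.

It can also restrict the results to a specific gff entry type, if one is provided.

Args:

seqid (str): name of the scaffold/chromosome
start (int): start position of the feature (1-indexed)
end (int): end position of the feature (1-indexed)
type (str): gff feature type (default: None)

Yields:

(pygff.GffEntry): a gff entry from the region of interest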
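
The selection logic of a region query with an optional type filter can be sketched in plain Python. This is an illustration under assumptions, not pygff's implementation: `fetch_region` is a hypothetical function, features are simplified to `(seqid, type, start, end)` tuples, and "within a region" is interpreted as overlapping it (pygff may instead require full containment).

```python
def fetch_region(features, seqid, start, end, feature_type=None):
    """Lazily yield features on seqid that overlap [start, end].

    Coordinates are 1-based and inclusive. If feature_type is given,
    only features of that type are yielded.
    """
    for feature in features:
        f_seqid, f_type, f_start, f_end = feature
        if f_seqid != seqid:
            continue
        if f_end < start or f_start > end:
            continue  # no overlap with the requested interval
        if feature_type is not None and f_type != feature_type:
            continue
        yield feature


# Made-up features: (seqid, type, start, end)
features = [
    ("chr1", "gene", 100, 500),
    ("chr1", "exon", 100, 200),
    ("chr1", "exon", 300, 500),
    ("chr1", "exon", 600, 700),
    ("chr2", "exon", 150, 400),
]

all_hits = list(fetch_region(features, "chr1", 150, 400))
exon_hits = list(fetch_region(features, "chr1", 150, 400, feature_type="exon"))
print(len(all_hits), len(exon_hits))
```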

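`pygff.GffEntry`:

An object representing a single gff entry.

This object also supports total-ordering comparisons (<, <=, ==, !=, >=, >), based first on seqid, then on start position, and finally on end position.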
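
One common way to get this kind of ordering in Python is `functools.total_ordering` with a tuple sort key. The sketch below uses a hypothetical stand-in class `Entry`; it is not `pygff.GffEntry` and does not claim to mirror its internals.

```python
from functools import total_ordering


@total_ordering
class Entry:  # hypothetical stand-in for pygff.GffEntry
    def __init__(self, seqid, start, end):
        self.seqid = seqid
        self.start = start
        self.end = end

    def _key(self):
        # Compare by seqid first, then start, then end
        return (self.seqid, self.start, self.end)

    def __eq__(self, other):
        if not isinstance(other, Entry):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Entry):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())

    def __repr__(self):
        return f"Entry({self.seqid!r}, {self.start}, {self.end})"


entries = [
    Entry("chr2", 5, 9),
    Entry("chr1", 50, 80),
    Entry("chr1", 50, 60),
    Entry("chr1", 10, 20),
]
ordered = sorted(entries)
print(ordered)
```

Note that in this sketch seqid is compared as a plain string, so "chr10" sorts before "chr2"; whether pygff does the same is not specified here.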

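Attributes:

seqid (str): name of the scaffold/chromosome
source (str): name of the program that generated the feature
type (str): feature type
start (int): start position of the feature (1-indexed)
end (int): end position of the feature (1-indexed)
score (float): quality score of the feature
strand (str): either "+" (forward), "-" (reverse), or "."
phase (int): 0, 1, or 2; the number of bases to skip from the start of the feature to reach the first base of a codon
attributes (dict): dictionary of all tag/value pairs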
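
To show how column 9 can become a dictionary of tag/value pairs, here is a minimal sketch. `parse_attributes` is a hypothetical helper, and the exact shape of the dictionary that pygff builds is an assumption. It follows the GFF3 conventions that pairs are separated by `;`, written as `tag=value`, and percent-encoded where needed.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Turn a GFF3 attributes string into a dict of tag -> value."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # Percent-encoded characters (e.g. %27 for an apostrophe) are decoded
        attributes[unquote(tag)] = unquote(value)
    return attributes


# Made-up attributes string for illustration
attrs = parse_attributes("ID=exon00001;Parent=mRNA00001,mRNA00002;Note=5%27 UTR;")
print(attrs["ID"], attrs["Note"])
```

Multi-valued attributes such as `Parent` are left as a single comma-separated string in this sketch; splitting on commas would have to happen before decoding so that encoded commas inside a value are not misread as separators.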

Quick start

Import

import pygff

Sequential iteration

with pygff.GffFile('/path/to/file.gff[.gz]') as gff:
    for entry in gff:
        do_something(entry)

Pseudo-random access

gff = pygff.GffFile('/path/to/file.gff[.gz]')
for entry in gff.fetch('chr1', 123040, 128040):
    do_something(entry)

Output

# Open in text mode ('w'): print() writes str, which a binary-mode file rejects
with open('outfile.gff', 'w') as outfile:
    with pygff.GffFile('/path/to/file.gff[.gz]') as gff:
        for entry in gff:
            # Some filtering
            print(entry, file=outfile)

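Contributing and Code of Conduct:

This project is built on a foundation of open science, open source, and open minds. To encourage an inclusive and positive environment, please see our Code of Conduct.

If you are interested in contributing to this project, please see the CONTRIBUTING guide.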

pygff 0.0.2
