ngstream Python package - PyPI

Utilities for streaming NGS reads from SRA and GA4GH.

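ngstream: Streaming NGS reads from public databases

ngstream is a small Python (3.6+) library that makes it easy to stream NGS reads from the Sequence Read Archive (SRA), GA4GH, and (eventually) other public databases, given an accession number.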

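Dependencies

Interacting with SRA requires the NGS SDK and its Python language bindings. Follow the installation instructions for the NGS SDK. We recommend installing the SDK from bioconda or Homebrew (brew install sratoolkit), and then installing the Python library from GitHub.
Converting between BAM/CRAM (e.g., as downloaded with Htsget) and SAM/FASTQ requires pysam.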

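Note that the SRA Toolkit caches downloaded data by default. If you mysteriously run out of disk space, this is probably the cause. See the SRA Toolkit configuration documentation for how to configure or disable caching. To change the cache location, use the following command (it does not return 0, but it still works):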

vdb-config --root -s /repository/user/main/public/root=<TARGET_DIR>
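If you want to set the cache location from a script, the non-zero exit code mentioned above needs to be tolerated rather than treated as a failure. The following is a minimal sketch, assuming `vdb-config` is on your PATH; the function names `build_cache_command` and `set_cache_dir` are hypothetical helpers for illustration and are not part of ngstream or the SRA Toolkit.

```python
import subprocess
from typing import List


def build_cache_command(target_dir: str) -> List[str]:
    """Build the vdb-config invocation quoted above for a given cache directory."""
    return [
        "vdb-config",
        "--root",
        "-s",
        f"/repository/user/main/public/root={target_dir}",
    ]


def set_cache_dir(target_dir: str) -> int:
    """Run vdb-config and return its exit code without raising on non-zero."""
    # check=False: the README notes that the command does not return 0
    # even when it succeeds, so a non-zero code is not treated as an error.
    result = subprocess.run(build_cache_command(target_dir), check=False)
    return result.returncode
```

Because the exit code is not a reliable success signal here, it is probably worth confirming afterwards that new downloads actually land in the target directory.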

Installation

pip install ngstream

Building from source

Clone this repository and run:

make

Accessing reads from SRA

import ngstream

# Use the API to stream reads within your own python program.
with ngstream.open("SRR3618567", protocol="sra") as reader:
    for record in reader:
        # `record` is an `ngstream.api.Record` object if the data is
        # single-end, and a `ngstream.api.Fragment` object if the data
        # is paired-end.
        print(record.as_fastq())
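Printing every record is fine for a quick look, but in practice you may want to write the stream to a compressed file. The sketch below is one way to do that. It assumes only what the example above shows: that the reader is iterable and that each record has an `as_fastq()` method. It further assumes that `as_fastq()` returns a `str`. The helper name `write_fastq_gz` is hypothetical and not part of ngstream.

```python
import gzip
from pathlib import Path
from typing import Iterable


def write_fastq_gz(records: Iterable, path: Path) -> int:
    """Write each record's FASTQ text to a gzip file and return the record count."""
    count = 0
    with gzip.open(path, "wt") as out:
        for record in records:
            # Assumption: as_fastq() returns a str. Normalise the trailing
            # newline so that consecutive records never run together.
            out.write(record.as_fastq().rstrip("\n") + "\n")
            count += 1
    return count


def stream_sra_to_file(accession: str, path: Path) -> int:
    """Stream an SRA run into a gzipped FASTQ file (needs ngstream and the NGS SDK)."""
    import ngstream  # imported lazily so the helper above works without it

    with ngstream.open(accession, protocol="sra") as reader:
        return write_fastq_gz(reader, path)
```

For example, `stream_sra_to_file("SRR3618567", Path("SRR3618567.fastq.gz"))` would stream the run used above. For paired-end data the output depends on how `Fragment.as_fastq()` formats the two reads, which this sketch does not assume.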

Accessing reads using Htsget

import ngstream
from pathlib import Path

url = 'https://era.org/hts/ABC123'
ref = ngstream.GenomeReference("GRCh37", Path("GRCh37_sizes.txt"))
with ngstream.open(url, protocol="htsget", reference=ref) as reader:
    for pair in reader:
        print("\n".join(str(read) for read in pair))
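As background on what an Htsget URL such as the one above returns: under the GA4GH htsget protocol, the server responds with a JSON "ticket" listing data blocks, and the client fetches the blocks in order and concatenates them into a BAM/CRAM stream. The sketch below illustrates that assembly step only. It is not ngstream's implementation, and the names `fetch_http` and `assemble_ticket` are hypothetical.

```python
import base64
import urllib.request
from typing import Callable, Dict, Optional


def fetch_http(url: str, headers: Optional[Dict[str, str]] = None) -> bytes:
    """Fetch one data block over HTTP(S), passing along any ticket headers."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request) as response:
        return response.read()


def assemble_ticket(
    ticket: dict,
    fetch: Callable[[str, Optional[Dict[str, str]]], bytes] = fetch_http,
) -> bytes:
    """Concatenate the blocks listed in an htsget ticket, in ticket order."""
    chunks = []
    for block in ticket["htsget"]["urls"]:
        url = block["url"]
        if url.startswith("data:"):
            # Inline block: the payload follows the first comma.
            header, _, payload = url.partition(",")
            if not header.endswith(";base64"):
                raise ValueError("only base64 data URIs are handled in this sketch")
            chunks.append(base64.b64decode(payload))
        else:
            # Remote block: headers (e.g. Range, Authorization) come from the ticket.
            chunks.append(fetch(url, block.get("headers")))
    return b"".join(chunks)
```

Taking the fetch function as a parameter keeps the assembly logic testable without network access. A real client would also stream blocks to disk rather than hold them all in memory.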

Dumping reads to a file (or a pair of files)

import ngstream

# Grab 1000 read pairs from an SRA run and write them to FASTQ files.
accession = 'SRR3618567'
with ngstream.open(accession, protocol="sra", item_limit=1000) as reader:
    files = ngstream.dump_fastq(reader)
    print(f"Wrote {reader.read_count} reads from {accession} to {files[0]}, {files[1]}")

Using the command-line tools

# Dump all reads from the ABC123 dataset to ABC123.bam in the current directory.
$ htsget_dump https://era.org/hts/ABC123

Documentation

Coming soon

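Developers

We welcome contributions via pull requests.
Unit tests are highly desirable.
For style, we enforce the black code style. Please use make reformat.
We use Google-style docstrings, which are formatted by the Napoleon Sphinx plugin.
We run pylint as part of each build and strive to maintain a 10/10 score.
We enforce a Code of Conduct.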

ngstream 0.2.2
