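bohra: Python package on PyPI

A bioinformatics pipeline for analysing short-read Illumina data for microbiological public health.

Detailed description of the bohra Python project

Bohra

Bohra is a microbial genomics pipeline, designed predominantly for use in public health, although it may also be useful in research settings. The pipeline takes as input a tab-delimited file containing the isolate IDs followed by the paths to READ1 and READ2, together with a reference for alignment and a unique identifier. The reads must be Illumina paired-end reads (other platforms are not supported).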

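Motivation

Bohra was inspired by Nullarbor (https://github.com/tseemann/nullarbor) and is intended for use in public health microbiology laboratories for the analysis of short reads from microbiological samples. The pipeline is written in Snakemake.

Etymology

"Bohra" is the name of a tree kangaroo that lived on the Nullarbor. The name was chosen to reflect the fact that the pipeline will mainly be used to build trees, relies on snippy (named after a very famous kangaroo) and was inspired by Nullarbor.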

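Pipeline

Bohra takes raw sequencing reads and produces a standalone html file for simple distribution of reports.

Bohra can be run in three modes

SNPs and phylogeny

Clean reads
Call variants
Generate phylogenetic tree

SNPs, phylogeny, typing, annotation and species identification (default)

Clean reads
Call variants
Generate phylogenetic tree
Assemble
Species identification
MLST
Resistome
Annotate

SNPs, phylogeny, pan-genome, typing and species identification

Clean reads
Call variants
Generate phylogenetic tree
Assemble
Species identification
MLST
Resistome
Annotate
Pan-genome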

Installation

Dependencies

Bohra requires Python >= 3.6

pip3 install bohra

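A conda recipe for Bohra is coming soon! For now, you will need to install the following dependencies on your system

Bohra has two subcommands: run, for an initial analysis, and rerun, for a re-analysis. A .html report is generated that allows the tree to be visualised and the dataset to be examined, providing insights that may help in interpreting the results.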

Setup

Input file

The input file needs to be a tab-delimited file with three columns: isolate ID, path to R1 and path to R2.

Isolate-ID    /path/to/reads/R1.fq.gz    /path/to/reads/R2.fq.gz
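As a minimal sketch of how such a file could be produced and checked, the Python below writes one tab-separated row per isolate and reads the file back, rejecting any line that does not have exactly three columns. The helper names write_input_file and read_input_file are hypothetical and are not part of bohra.

```python
import csv


def write_input_file(rows, out_path):
    """Write (isolate_id, r1_path, r2_path) tuples as a tab-delimited file.

    Hypothetical helper for illustration; not part of bohra.
    """
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for isolate_id, r1, r2 in rows:
            writer.writerow([isolate_id, r1, r2])


def read_input_file(path):
    """Read the file back, requiring exactly three columns per non-blank line."""
    rows = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for line_no, fields in enumerate(reader, start=1):
            if not fields:
                continue  # skip blank lines
            if len(fields) != 3:
                raise ValueError(
                    f"line {line_no}: expected 3 columns, got {len(fields)}"
                )
            rows.append(tuple(fields))
    return rows
```

Checking the column count before launching the pipeline is a cheap way to catch rows that were separated with spaces instead of tabs.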

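Reference

The choice of reference is important for the accuracy of SNP detection and therefore for the investigation of genomic relatedness. An appropriate reference should be selected according to the following guidelines.

A closed reference from the same ST (where applicable) or a gold-standard reference (as is available for Mycobacterium tuberculosis).
A PacBio or Nanopore assembly from MDU of the same type as the query dataset
A high-quality de novo assembly, either of an isolate within the dataset or of an isolate of the same ST or type.

Mask

Phage masking is important to prevent inflation of SNP counts by SNPs that were introduced through horizontal rather than vertical transfer. For closed genomes, or those that are publicly available, phaster-query.pl can be used to identify regions for masking. If a de novo assembly is used, the website phaster.ca can be used instead. Regions to be masked should be provided in .bed format.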
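For illustration, the sketch below writes a list of regions to a three-column BED file. It assumes the regions are given as 1-based inclusive coordinates and converts them to the 0-based, half-open convention used by BED. The helper name write_bed is hypothetical, and the contig names must match the sequence names in your reference.

```python
def write_bed(regions, out_path):
    """Write (contig, start, end) tuples as a three-column BED file.

    Assumption: input coordinates are 1-based and inclusive. BED is
    0-based and half-open, so only the start is shifted by one.
    """
    with open(out_path, "w") as handle:
        for contig, start, end in sorted(regions):
            if start < 1 or end < start:
                raise ValueError(f"invalid region on {contig}: {start}-{end}")
            handle.write(f"{contig}\t{start - 1}\t{end}\n")
```

If your regions are already 0-based and half-open, write them unchanged rather than applying the shift.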

Running

Minimal command

bohra run -r path/to/reference -i path/to/inputfile -j unique_id -m path/to/maskfile (optional)

usage: bohra run [-h] [--input_file INPUT_FILE] [--job_id JOB_ID]
                 [--reference REFERENCE] [--mask MASK]
                 [--pipeline {sa,s,a,all}]
                 [--assembler {shovill,skesa,spades}] [--cpus CPUS]
                 [--minaln MINALN] [--prefillpath PREFILLPATH] [--mdu MDU]
                 [--workdir WORKDIR] [--resources RESOURCES] [--force]
                 [--dryrun] [--gubbins]

optional arguments:
  -h, --help            show this help message and exit
  --input_file INPUT_FILE, -i INPUT_FILE
                        Input file = tab-delimited with 3 columns
                        <isolatename> <path_to_read1> <path_to_read2>
                        (default: )
  --job_id JOB_ID, -j JOB_ID
                        Job ID, will be the name of the output directory
                        (default: )
  --reference REFERENCE, -r REFERENCE
                        Path to reference (.gbk or .fa) (default: )
  --mask MASK, -m MASK  Path to mask file if used (.bed) (default: False)
  --pipeline {sa,s,a,all}, -p {sa,s,a,all}
                        The pipeline to run. SNPS ('s') will call SNPs and
                        generate phylogeny, ASSEMBLIES ('a') will generate
                        assemblies and perform mlst and species identification
                        using kraken2, SNPs and ASSEMBLIES ('sa' - default)
                        will perform SNPs and ASSEMBLIES. ALL ('all') will
                        perform SNPS, ASSEMBLIES and ROARY for pan-genome
                        analysis (default: sa)
  --assembler {shovill,skesa,spades}, -a {shovill,skesa,spades}
                        Assembler to use. (default: shovill)
  --cpus CPUS, -c CPUS  Number of CPU cores to run, will define how many rules
                        are run at a time (default: 36)
  --minaln MINALN, -ma MINALN
                        Minimum percent alignment (default: 0)
  --prefillpath PREFILLPATH, -pf PREFILLPATH
                        Path to existing assemblies - in the form
                        path_to_somewhere/isolatename/contigs.fa (default:
                        None)
  --mdu MDU             If running on MDU data (default: True)
  --workdir WORKDIR, -w WORKDIR
                        Working directory, default is current directory
                        (default: /home/khhor)
  --resources RESOURCES, -s RESOURCES
                        Directory where templates are stored (default:
                        /home/khhor/dev/bohra/bohra/templates)
  --force, -f           Add if you would like to force a complete restart of
                        the pipeline. All previous logs will be lost.
                        (default: False)
  --dryrun, -n          If you would like to see a dry run of commands to be
                        executed. (default: False)
  --gubbins, -g         If you would like to run gubbins. NOT IN USE YET -
                        PLEASE DO NOT USE (default: False)
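When bohra is launched from a script, the command can be assembled as an argument list. The sketch below uses only flags that appear in the usage text above. The function name build_run_command and the example file names are assumptions made for illustration.

```python
import shlex
import subprocess


def build_run_command(reference, input_file, job_id, mask=None,
                      pipeline="sa", cpus=None, dryrun=False):
    """Assemble a `bohra run` argument list from the documented flags."""
    if pipeline not in {"sa", "s", "a", "all"}:
        raise ValueError(f"unknown pipeline: {pipeline}")
    cmd = ["bohra", "run", "-r", str(reference), "-i", str(input_file),
           "-j", str(job_id), "-p", pipeline]
    if mask is not None:
        cmd += ["-m", str(mask)]  # mask file is optional
    if cpus is not None:
        cmd += ["-c", str(cpus)]
    if dryrun:
        cmd.append("-n")  # only show the commands that would be executed
    return cmd


def run(cmd):
    """Execute the command; requires bohra and its dependencies to be installed."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names; print the command rather than executing it.
    print(shlex.join(build_run_command("ref.gbk", "isolates.tab", "job1", dryrun=True)))
```

Passing a list rather than a single string avoids shell quoting problems when paths contain spaces.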

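Rerun

A rerun can be performed if the reference and/or mask file needs to be changed, or if isolates need to be removed from or added to the analysis. The following behaviour should be expected on a rerun:

A new reference will cause SNPs to be called for all isolates in the analysis
If the reference is unchanged, SNPs will be called only for new isolates
The core alignment, distances and tree are regenerated on every rerun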
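The first two rules can be expressed as a small decision function. This is only an illustration of the behaviour described above, not bohra's actual implementation, and the function name is hypothetical.

```python
def isolates_needing_snp_calls(previous_isolates, current_isolates,
                               previous_reference, current_reference):
    """Return the isolates for which SNPs would be called on a rerun."""
    current = list(current_isolates)
    if current_reference != previous_reference:
        return current  # new reference: call SNPs for every isolate
    seen = set(previous_isolates)
    # unchanged reference: only isolates not present in the previous run
    return [iso for iso in current if iso not in seen]
```

The core alignment, distances and tree are not covered by this function because they are recomputed on every rerun regardless of what changed.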

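-r and -m are only needed if they differ from the previous run. Otherwise, bohra will detect and use the previous reference and mask files. The set of isolates in the input file used for the original run can also be changed. New isolates can be added to the bottom of the input file, and an isolate can be removed from the analysis by prefixing it with ^ {CD10}}.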
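Adding isolates before a rerun amounts to appending rows to the existing input file. The sketch below does this while skipping isolate IDs that are already present. The duplicate check is a choice made for this example rather than documented bohra behaviour, and the helper name append_isolates is hypothetical.

```python
def append_isolates(input_file, new_rows):
    """Append (isolate_id, r1_path, r2_path) rows to the bottom of the input file.

    Returns the IDs that were actually added.
    """
    with open(input_file, newline="") as handle:
        text = handle.read()
    existing = {line.split("\t")[0] for line in text.splitlines() if line.strip()}
    added = []
    with open(input_file, "a", newline="") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")  # make sure new rows start on their own line
        for isolate_id, r1, r2 in new_rows:
            if isolate_id in existing:
                continue  # skip isolates that are already listed
            handle.write(f"{isolate_id}\t{r1}\t{r2}\n")
            existing.add(isolate_id)
            added.append(isolate_id)
    return added
```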

Minimal command

bohra rerun

usage: bohra rerun [-h] [--reference REFERENCE] [--mask MASK] [--cpus CPUS]
                   [--workdir WORKDIR] [--resources RESOURCES] [--dryrun]
                   [--gubbins] [--keep]

optional arguments:
  -h, --help            show this help message and exit
  --reference REFERENCE, -r REFERENCE
                        Path to reference (.gbk or .fa) (default: )
  --mask MASK, -m MASK  Path to mask file if used (.bed) (default: )
  --cpus CPUS, -c CPUS  Number of CPU cores to run, will define how many rules
                        are run at a time (default: 36)
  --workdir WORKDIR, -w WORKDIR
                        Working directory, default is current directory
                        (default: /home/khhor)
  --resources RESOURCES, -s RESOURCES
                        Directory where templates are stored (default:
                        /home/khhor/dev/bohra/bohra/templates)
  --dryrun, -n          If you would like to see a dry run of commands to be
                        executed. (default: False)
  --gubbins, -g         If you would like to run gubbins. NOT IN USE YET -
                        PLEASE DO NOT USE (default: False)
  --keep, -k            Keep report from previous run (default: False)


bohra 1.0.20
