binlorry: a flexible tool for binning and filtering sequencing reads


BinLorry

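BinLorry is a flexible tool for binning and filtering sequencing reads into distinct files. Reads can be binned and filtered by any attribute encoded in their headers, by any attribute recorded in a CSV file, or by length.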

Installation

Install with pip:

pip3 install binlorry

Run:

binlorry --help

Install from the repository

Clone the repository:

git clone https://github.com/rambaut/binlorry.git

Install:

pip3 install ./binlorry

Run without installing

BinLorry can also be run directly from a clone of the repository, without installing it:

git clone https://github.com/rambaut/binlorry.git
python binlorry/binlorry-runner.py -h

In that case, make sure the pandas package is installed before use.

Quick usage examples

binlorry -i reads/ -o barcode --bin-by barcode --filter-by barcode BC01 BC02 -n 550 -x 750

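This reads all FASTQ or FASTA files in the directory reads and bins them by the header field barcode, but only if the barcode is BC01 or BC02 and the read length is between 550 and 750 nucleotides. It uses the filename prefix barcode, producing the files barcode_BC01.fastq and barcode_BC02.fastq.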
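The binning and filtering described above can be pictured with a small, self-contained Python sketch. It is not BinLorry's implementation. The function names `parse_header` and `bin_reads` are hypothetical, and two things are assumed that the text does not state: that header fields are written as whitespace-separated `key=value` tokens, and that the length bounds are inclusive.

```python
from collections import defaultdict


def parse_header(header):
    """Split a read header into its name and a dict of key=value fields.

    Assumption: fields appear as whitespace-separated key=value tokens.
    """
    tokens = header.lstrip("@>").split()
    name = tokens[0] if tokens else ""
    fields = dict(tok.split("=", 1) for tok in tokens[1:] if "=" in tok)
    return name, fields


def bin_reads(reads, bin_by, accepted, min_len, max_len):
    """Group (header, sequence) pairs by the value of one header field.

    A read is kept only if its field value is in `accepted` and its
    length lies within [min_len, max_len] (inclusive, by assumption).
    """
    bins = defaultdict(list)
    for header, seq in reads:
        name, fields = parse_header(header)
        value = fields.get(bin_by)
        if value not in accepted:
            continue
        if not (min_len <= len(seq) <= max_len):
            continue
        bins[value].append(name)
    return dict(bins)


# Made-up reads, for illustration only.
reads = [
    ("@read1 barcode=BC01", "A" * 600),
    ("@read2 barcode=BC02", "C" * 700),
    ("@read3 barcode=BC03", "G" * 600),  # rejected: barcode not accepted
    ("@read4 barcode=BC01", "T" * 800),  # rejected: longer than max_len
]
binned = bin_reads(reads, "barcode", {"BC01", "BC02"}, 550, 750)
for value, names in sorted(binned.items()):
    print(f"barcode_{value}.fastq: {names}")
```

Each key of `binned` corresponds to one output file in the command above, named with the prefix followed by the field value.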

binlorry -i my_file.fastq -t my_file.csv --out-report -o filtered --filter-by reference Type_1 -n 550 -x 750

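The example above takes reads from my_file.fastq and a CSV report, my_file.csv. Provided that my_file.csv has at least the structure shown below, and that the read names in the CSV match those in the input read file, BinLorry filters the reads and outputs only those whose reference is Type_1 and whose length is between 550 and 750 bases.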

read_name,reference
f66db89e-de96-4fa7-813a-6c5a89586100,Type_1
a39069c5-c493-45f8-9fa8-49eccb5c1807,Type_1
868efa99-f4c1-4a68-87a9-196a44b997e0,Type_2

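
As a rough illustration of CSV-based filtering, the sketch below selects the read names whose reference column has an accepted value, using only Python's standard csv module. The function name `accepted_read_names` is hypothetical and this is not how BinLorry itself is written. It takes the read name from the first column, as the help text for -t says BinLorry assumes.

```python
import csv
import io

# Stand-in for my_file.csv, using the three rows shown above.
CSV_TEXT = """read_name,reference
f66db89e-de96-4fa7-813a-6c5a89586100,Type_1
a39069c5-c493-45f8-9fa8-49eccb5c1807,Type_1
868efa99-f4c1-4a68-87a9-196a44b997e0,Type_2
"""


def accepted_read_names(csv_file, field, accepted):
    """Return the set of read names whose `field` column is in `accepted`.

    The read name is taken from the first column of each row.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    column = header.index(field)
    return {row[0] for row in reader if row and row[column] in accepted}


names = accepted_read_names(io.StringIO(CSV_TEXT), "reference", {"Type_1"})
print(sorted(names))
```

A read from the input file would then be kept only if its name is in `names` and its length passes the -n and -x limits.
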
binlorry -i path/to/my_fastq_dir -t path/to/my_csv_dir \
--out-report -o path/to/binned/barcode \
--filter-by barcode BC01 --bin-by barcode -n 1000 -x 2000

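Given a CSV directory containing reports that correspond to the read files in the FASTQ directory, BinLorry searches both directories recursively and matches CSV and FASTQ files by file name stem. This command then keeps only reads whose barcode is BC01 and outputs a CSV report corresponding to the reads written to the output FASTQ file.

Command line interface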
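Matching by file name stem can be sketched with pathlib. This is an illustration under stated assumptions, not BinLorry's code: the function name `match_by_stem` is hypothetical, only the extensions .fastq and .csv are considered, and if two reports share a stem the last one found is used.

```python
from pathlib import Path
import tempfile


def match_by_stem(fastq_dir, csv_dir):
    """Pair FASTQ and CSV files that share a file name stem.

    Both directories are searched recursively.
    Returns a dict of {stem: (fastq_path, csv_path)}.
    """
    csv_by_stem = {p.stem: p for p in Path(csv_dir).rglob("*.csv")}
    pairs = {}
    for fastq in Path(fastq_dir).rglob("*.fastq"):
        report = csv_by_stem.get(fastq.stem)
        if report is not None:
            pairs[fastq.stem] = (fastq, report)
    return pairs


# Build a throwaway directory tree to demonstrate the matching.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "fastq" / "run1").mkdir(parents=True)
    (root / "csv").mkdir()
    (root / "fastq" / "run1" / "reads_0.fastq").touch()
    (root / "fastq" / "reads_1.fastq").touch()  # has no matching report
    (root / "csv" / "reads_0.csv").touch()
    pairs = match_by_stem(root / "fastq", root / "csv")
    matched = sorted(pairs)
    print(matched)
```

Here reads_0.fastq is paired with reads_0.csv even though it sits in a subdirectory, while reads_1.fastq is left unmatched.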

usage: binlorry -i INPUT [-t CSV_FILE] -o OUTPUT [-v VERBOSITY]
                         [--bin-by FIELD [FIELD ...]]
                         [--filter-by FILTER [FILTER ...]] [-n MIN] [-x MAX]
                         [-h] [--version]

Main options:
  -i INPUT, --input INPUT
                          FASTA/FASTQ of input reads or a directory which will
                          be recursively searched for FASTQ files (required)
  -t INPUT_CSV, --index-table INPUT_CSV
                          A CSV file with metadata fields for reads (otherwise
                          these are assumed to be in the read headers). This can
                          also include a file and line number to improve
                          performance. Assumes read name is first column of the
                          CSV.
  -o OUTPUT, --output OUTPUT
                          Output filename (or filename prefix)
  -r REPORT, --out-report REPORT
                          Output a subsetted csv report along with the fastq. (Default: False)
                          Only implemented for use in conjunction with -t option.
  -f FORCE_OUTFILES, --force-output FORCE_OUTFILES
                          Output binned/ filtered files even if empty. (default: False)
                          Usage: only a single binning factor with a corresponding filter factor.
  -v VERBOSITY, --verbosity VERBOSITY
                          Level of progress information: 0 = none, 1 = some, 2
                          = lots, 3 = full - output will go to stdout if reads
                          are saved to a file and stderr if reads are printed
                          to stdout (default: 1)

Binning/Filtering options:
  --bin-by FIELD [FIELD ...]
                          Specify header field(s) to bin the reads by. For
                          multiple fields these will be nested in order
                          specified.
  --filter-by FILTER [FILTER ...]
                          Specify header field and accepted values to filter
                          the reads by. Multiple filter-by options can be
                          specified.
  -n MIN, --min-length MIN
                          Filter the reads by their length, specifying the
                          minimum length.
  -x MAX, --max-length MAX
                          Filter the reads by their length, specifying the
                          maximum length.

Help:
  -h, --help              Show this help message and exit
  --version               Show program's version number and exit
