plncpro: Python package on PyPI

PLncPRO (Plant Long Non-Coding RNA Prediction by Random Forests) is a program to classify coding (mRNAs) and long non-coding transcripts (lncRNAs).


                      _____  _            _____  _____   ____  
                     |  __ \| |          |  __ \|  __ \ / __ \ 
                     | |__) | |_ __   ___| |__) | |__) | |  | |
                     |  ___/| | '_ \ / __|  ___/|  _  /| |  | |
                     | |    | | | | | (__| |    | | \ \| |__| |
                     |_|    |_|_| |_|\___|_|    |_|  \_\\____/

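Introduction

PLncPRO (Plant Long Non-Coding RNA Prediction by Random Forests) is a program to classify coding (mRNAs) and long non-coding transcripts (lncRNAs). Our approach is based on the random forest method and uses features derived from protein homology search, from the sequence itself, and from 3-mer frequencies. We have developed prediction models for several plant species to predict lncRNAs. We tested our method extensively on plants and vertebrates and found that our models perform better than existing tools.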
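To make the 3-mer frequency features more concrete, here is a minimal sketch in Python. It is not PLncPRO's own feature extraction code: the function name `kmer_frequencies`, the handling of `U` as `T`, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Return the relative frequency of every possible k-mer over A, C, G, T.

    Hypothetical helper for illustration; not part of the plncpro package.
    """
    seq = sequence.upper().replace("U", "T")
    # Count overlapping k-mers, skipping windows with ambiguous bases (e.g. N).
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Fixed-length feature vector: 4**k entries, zero for k-mers never seen.
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


if __name__ == "__main__":
    freqs = kmer_frequencies("ATGATGCCC")
    print(len(freqs))  # 64 possible 3-mers
    print(freqs["ATG"])
```

Returning an entry for every possible 3-mer keeps the feature vector the same length for every transcript, which is what a classifier such as a random forest expects.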

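Citation

Singh et al., PLncPRO for prediction of long non-coding RNAs (lncRNAs) in plants and its application for discovery of abiotic stress-responsive lncRNAs in rice and chickpea. Nucleic Acids Research, 2017 Dec 15;45(22):e183. doi:10.1093/nar/gkx866.

Note: We have updated PLncPRO for Python 3. PLncPRO for Python 2 is still available at http://ccbb.jnu.ac.in/plncpro/. The usage of this new version differs from that of the older version.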

Installation

Prerequisites:

Operating system: Linux, macOS
Python 3.5 or higher (http://www.python.org/)
NCBI BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi)
GNU C Library (glibc >= 2.14)

Python dependencies

NumPy (http://www.numpy.org/)
SciPy (https://www.scipy.org/)
scikit-learn (http://scikit-learn.org/)
Biopython (http://biopython.org/)
regex

Using pip

pip install plncpro

From source

git clone https://github.com/urmi-21/PLncPRO.git
pip install ./PLncPRO

Running tests

bash tests/local_test.sh

Basic usage

For detailed usage examples, see examples.

`plncpro predict`

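Labels lncRNAs and mRNAs. This command reads an input file containing sequences and classifies each sequence as coding or non-coding. It uses a classification model generated by plncpro build. It outputs a file containing the class label and the class probabilities for each sequence.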

plncpro predict -i <input fasta> -o <output_dir> -p <output_file_name> -t 2 -d <blast_db> -m <model_file>

Arguments

-p,--prediction_out    output file name
-i,--infile            file containing input sequences
-m,--model             model file
-o,--outdir            output directory name
-d,--db                path to blast database
        OPTIONAL
-t,--threads           number of threads [default: 4]
-l,--labels            path to the file containing labels (outputs classification accuracy)
-r,--remove_temp       clean up intermediate files
-v,--verbose           show more messages
--min_len              specify min_length to filter input files
--noblast              don't use blast features
--no_ff                don't use framefinder features
--qcov_hsp             specify query coverage parameter for blast [default: 30]
--blastres*            path to blast output for input file

*blast result should be in the following format: -outfmt '6 qseqid sseqid pident evalue qcovs qcovhsp score bitscore qframe sframe'
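The custom tabular format above has ten tab-separated columns. As a rough sketch of how a precomputed BLAST result in this layout could be sanity-checked before passing it via --blastres, the following Python parses one line into a dictionary. Only the column order comes from the -outfmt string above; the function name `parse_blast_line` and the type conversions are assumptions for illustration.

```python
# Column order taken from the -outfmt string documented above.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "evalue", "qcovs",
    "qcovhsp", "score", "bitscore", "qframe", "sframe",
]
# Assumed types: numeric scores as floats, reading frames as integers.
FLOAT_COLUMNS = {"pident", "evalue", "qcovs", "qcovhsp", "score", "bitscore"}
INT_COLUMNS = {"qframe", "sframe"}


def parse_blast_line(line):
    """Parse one tab-separated BLAST line (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(BLAST_COLUMNS):
        raise ValueError(
            f"expected {len(BLAST_COLUMNS)} columns, got {len(fields)}"
        )
    record = {}
    for name, value in zip(BLAST_COLUMNS, fields):
        if name in FLOAT_COLUMNS:
            record[name] = float(value)
        elif name in INT_COLUMNS:
            record[name] = int(value)
        else:
            record[name] = value
    return record


if __name__ == "__main__":
    # Made-up example line, not real BLAST output.
    example = "tx1\tprot42\t87.5\t1e-30\t90\t85\t250\t101.0\t1\t0"
    print(parse_blast_line(example))
```

A line with the wrong number of columns raises `ValueError`, which is a quick way to notice that BLAST was run with a different -outfmt string.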

`plncpro build`

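Builds a model from the given training data (mRNA/lncRNA transcripts). This command reads two labeled datasets containing coding and non-coding transcripts. It then fits a random forest based classification model and saves it, so that the model can be used to predict unknown sequences.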

plncpro build -p <mrna fasta> -n <lncrna fasta> -o <out_dir> -m <model_name> -d <blast db> -t <threads>

Arguments

-p,--pos               file containing mRNA sequences
-n,--neg               file containing lncRNA sequences
-m,--model             output model name
-o,--outdir            output directory name
-d                     path to blast database
        OPTIONAL
-t,--threads           number of threads [default: 4]
-k,--num_trees         number of trees [default: 1000]
-r,--remove_temp       clean up intermediate files
-v,--verbose           show more messages
--min_len              specify min_length to filter input files
--noblast              don't use blast features
--no_ff                don't use framefinder features
--qcov_hsp             specify query coverage parameter for blast [default: 30]
--pos_blastres*        path to blast result for mRNA input file
--neg_blastres*        path to blast result for lncRNA input file

*blast result should be in the following format: -outfmt '6 qseqid sseqid pident evalue qcovs qcovhsp score bitscore qframe sframe'
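When plncpro build is driven from a script, for example to train several models with different numbers of trees, it can help to assemble the command as an argument list. The sketch below uses only flags listed in the arguments above; the function name `build_command` and all file names in the example are hypothetical.

```python
import shlex


def build_command(pos_fasta, neg_fasta, out_dir, model_name, blast_db,
                  threads=4, num_trees=1000):
    """Assemble a `plncpro build` argument list (hypothetical helper).

    The defaults mirror the documented defaults for -t and -k.
    """
    return [
        "plncpro", "build",
        "-p", pos_fasta,        # mRNA (coding) sequences
        "-n", neg_fasta,        # lncRNA (non-coding) sequences
        "-o", out_dir,
        "-m", model_name,
        "-d", blast_db,
        "-t", str(threads),
        "-k", str(num_trees),
    ]


if __name__ == "__main__":
    # All paths below are placeholders for illustration.
    cmd = build_command("mrna.fa", "lncrna.fa", "model_out", "my_model",
                        "blastdb/proteins", threads=8, num_trees=500)
    print(shlex.join(cmd))
    # To actually run it, pass the list to subprocess.run(cmd, check=True).
```

Passing a list rather than a single string avoids shell quoting problems when paths contain spaces.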

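`plncpro predtoseq`

Extracts mRNA or lncRNA sequences from a PLncPRO output file. This command reads the prediction output file and extracts the sequences of a given class. The user can specify the class and a probability cutoff to extract the desired transcript sequences.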

plncpro predtoseq -f <fasta_file> -o <outputfile> -p <PLNCPRO_prediction_file> -l <required_label>

Arguments

-f                     input fasta file name
-o                     output fasta file name
-p                     path to file containing predictions by PLncPRO
        OPTIONAL
-l                     label of the required sequences (0 for lncRNA; 1 for mRNA) [default: 0]
-s                     class probability cutoff (extract sequences with probability greater than or equal to s)
--min                  specify min_length of sequences [default: 0]
--max                  specify max_length of sequences [default: Inf]
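The selection logic described by -l, -s, --min and --max can be sketched in a few lines of Python. This is an illustration of the filtering idea only, not the plncpro implementation: the layout of the PLncPRO prediction file is not described here, so the sketch works on in-memory data, and the function name `select_sequences`, the `(id, label, probability)` tuples, the default cutoff of 0.0, and the inclusive length bounds are all assumptions.

```python
import math


def select_sequences(sequences, predictions, label=0, cutoff=0.0,
                     min_len=0, max_len=math.inf):
    """Keep sequences predicted as `label` with probability >= cutoff.

    sequences:   dict mapping sequence id -> sequence string
    predictions: iterable of (sequence id, predicted label, probability)
                 tuples; an assumed representation, not the file format.
    """
    selected = {}
    for seq_id, predicted_label, probability in predictions:
        seq = sequences.get(seq_id)
        if seq is None:
            continue  # prediction for a sequence not present in the FASTA
        if (predicted_label == label
                and probability >= cutoff
                and min_len <= len(seq) <= max_len):
            selected[seq_id] = seq
    return selected


if __name__ == "__main__":
    # Toy data for illustration: label 0 = lncRNA, label 1 = mRNA.
    seqs = {"t1": "ATGC" * 10, "t2": "ATG", "t3": "A" * 50}
    preds = [("t1", 0, 0.90), ("t2", 0, 0.95), ("t3", 1, 0.80)]
    print(select_sequences(seqs, preds, label=0, cutoff=0.8, min_len=10))
```

In this toy example only `t1` is kept: `t2` is too short and `t3` belongs to the other class.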

Downloading the data used in the paper

The data is hosted on Google Drive.

Download it directly using wget:

wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=108S-9Bt4CLCHTaCn6-HKTqQZDo0nssZe' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=108S-9Bt4CLCHTaCn6-HKTqQZDo0nssZe" -O plncpro_data.zip && rm -rf /tmp/cookies.txt

Copying

GNU General Public License version 3 (GPLv3). Details at http://www.gnu.org/copyleft/gpl.html


plncpro 1.2.2
