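Prediction of lineage-specific gain and loss of sequence elements using phylogenetic maximum parsimony.
Detailed description of the mapGL Python project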
MAPGL
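Prediction of lineage-specific gain and loss of genomic sequence elements based on phylogenetic maximum parsimony
Label genomic regions as orthologous, gained in the query species, or lost in the target species, based on inferred presence or absence in the most recent common ancestor (MRCA). Chained alignment files are used to map features from the query to the target and to one or more outgroup species. Features that map directly from query to target are labeled as orthologs, and their orthologous coordinates in the target species are given in the output. Non-mapping features are assigned as gains or losses based on a maximum-parsimony algorithm that predicts presence or absence in the MRCA.
Based on bnMapper.py, by Ogert Denas (James Taylor lab):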
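The parsimony reasoning described above can be illustrated with a small Fitch-style sketch. This is not mapGL's actual implementation: the nested-tuple tree, the species names, the function names, and the "ambiguous" label are all assumptions made for illustration, and the tree is assumed to be strictly bifurcating. Presence is coded as 1 and absence as 0.

```python
def annotate(node, states):
    """Bottom-up Fitch pass over a nested-tuple binary tree (hypothetical format)."""
    if isinstance(node, str):  # leaf: observed presence (1) or absence (0)
        return {"set": {states[node]}, "leaves": {node}, "children": []}
    left, right = (annotate(child, states) for child in node)
    common = left["set"] & right["set"]
    return {
        "set": common or (left["set"] | right["set"]),
        "leaves": left["leaves"] | right["leaves"],
        "children": [left, right],
    }


def mrca_state(node, query, target, parent_state=None):
    """Top-down pass: return the resolved state at the query/target MRCA."""
    if len(node["set"]) == 1:
        state = next(iter(node["set"]))
    else:
        state = parent_state  # ambiguous set {0, 1}: inherit from the parent
    for child in node["children"]:
        if {query, target} <= child["leaves"]:
            return mrca_state(child, query, target, state)
    return state  # no child holds both species, so this node is the MRCA


def classify(tree, states, query, target):
    """Label an element that is present in the query species."""
    if states[target] == 1:
        return "ortholog"
    state = mrca_state(annotate(tree, states), query, target)
    if state is None:
        return "ambiguous"  # label invented for this sketch
    return f"loss_{target}" if state == 1 else f"gain_{query}"


# Hypothetical species names; the element maps to the outgroup but not the target.
tree = (("hg19", "mm9"), "canFam2")
print(classify(tree, {"hg19": 1, "mm9": 0, "canFam2": 1}, "hg19", "mm9"))  # loss_mm9
```

When the outgroup also lacks the element, the same call returns `gain_hg19`, because the most parsimonious explanation is a single gain on the query lineage.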
- https://github.com/bxlab/bx-python/blob/master/scripts/bnMapper.py
- https://travis-ci.org/bxlab/bx-python
Dependencies
numpy, cython, six
Usage
mapGL.py [-h] [-o FILE] [-t FLOAT] [-g GAP] [-v {info,debug,silent}] [-k] input tree qname tname alignments [alignments ...]
Required Arguments
Argument | Description |
---|---|
input | Input regions to process. Should be in standard bed format. Only the first four bed fields will be used. |
tree | Phylogenetic tree describing relationships of query and target species to outgroups. Must be in standard Newick format. Branch lengths are optional, and will be ignored. |
qname | Name of the query species. Regions from this species will be mapped to target species coordinates. |
tname | Name of the target species. Regions from the query species will be mapped to coordinates from this species. |
alignments | Alignment files (.chain or .pkl): One for the target species and one per outgroup species. Files should be named according to the convention: qname.tname[...].chain.gz, where qname is the query species name and tname is the name of the target/outgroup species. Names used for qname and tname must match names used in the phylogenetic tree. |
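The file-naming convention for alignments can be checked before running the tool. The sketch below is a hedged illustration, not part of mapGL: the helper names and the example paths are hypothetical, and it assumes species names contain no dots, so the first two dot-separated fields of the file name are qname and tname.

```python
import os


def species_from_alignment(path):
    """Return (qname, tname) parsed from a name like qname.tname[...].chain.gz."""
    fields = os.path.basename(path).split(".")
    if len(fields) < 3:
        raise ValueError(f"unexpected alignment file name: {path}")
    return fields[0], fields[1]


def check_alignments(paths, qname, tree_names):
    """Map each target/outgroup name to its file, validating names against the tree."""
    by_species = {}
    for path in paths:
        file_qname, file_tname = species_from_alignment(path)
        if file_qname != qname:
            raise ValueError(f"{path}: query name {file_qname!r} is not {qname!r}")
        if file_tname not in tree_names:
            raise ValueError(f"{path}: {file_tname!r} is not a leaf in the tree")
        by_species[file_tname] = path
    return by_species


# Hypothetical file names and species names, for illustration only.
files = ["aln/hg19.mm9.chain.gz", "aln/hg19.canFam2.rbest.chain.gz"]
print(check_alignments(files, "hg19", {"hg19", "mm9", "canFam2"}))
```

A mismatch between a file name and the tree raises a `ValueError` early, which is easier to diagnose than a failure partway through mapping.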
Options
Option | Description |
---|---|
-h, --help | Show help message and exit. |
-o FILE, --output FILE | Output file. (default: stdout) |
-t FLOAT, --threshold FLOAT | Mapping threshold, i.e., (elem * threshold) <= mapped_elem. (default: 0.0) |
-g GAP, --gap GAP | Ignore elements with an insertion/deletion of this size or larger. (default: -1) |
-v {info,debug,silent}, --verbose {info,debug,silent} | Verbosity level (default: info) |
-d, --drop_split | Follow the bnMapper convention of silently dropping elements that span multiple chains. Without this option, the liftOver convention for split alignments is used: elements that span multiple chains are kept and the longest aligned segment is reported. Enabling this option is not recommended, as it may lead to spurious gain/loss predictions for orthologous elements that happen to be split across chains by chromosomal rearrangements or similar events. (default: False) |
-i {BED,narrowPeak}, --in_format {BED,narrowPeak} | Input file format. (default: BED) |
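The `--threshold` and `--gap` descriptions can be read as two simple filters on a mapped element. The following is a minimal sketch of that reading, not mapGL's code: the function and parameter names are hypothetical, lengths are in base pairs, and treating a negative gap (the default of -1) as "filter disabled" is an assumption.

```python
def passes_filters(elem_len, mapped_len, largest_indel, threshold=0.0, gap=-1):
    """Return True if a mapped element survives the threshold and gap filters.

    elem_len      -- length of the element in the query
    mapped_len    -- length of the mapped element in the target
    largest_indel -- size of the largest insertion/deletion in the mapping
    """
    # Assumption: a negative gap disables the indel filter.
    if gap >= 0 and largest_indel >= gap:
        return False
    # Condition quoted in the option table: (elem * threshold) <= mapped_elem
    return elem_len * threshold <= mapped_len


print(passes_filters(100, 80, 0, threshold=0.9))           # False: 80 < 90
print(passes_filters(100, 95, 0, threshold=0.9))           # True
print(passes_filters(100, 95, 20, threshold=0.9, gap=10))  # False: indel of 20 >= 10
```

With the defaults (threshold 0.0, gap -1), every mapped element passes under this reading.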
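Output
Predictions are reported in tab-delimited format, with the first four columns following the BED4 convention. The predicted evolutionary history (ortholog, gain in the query, or loss in the target) is reported in the "status" column. The final columns give the mapped location of mapped (ortholog) elements in target coordinates.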
Column | Description |
---|---|
chrom | Chromosome on which the query element is located. |
start | Start position on query chromosome. |
end | End position on query chromosome. |
name | Element name or ID. |
peak | Peak location (narrowPeak input) or element midpoint (BED input) |
status | Predicted phylogenetic history: ortholog, gain_qname, or loss_tname |
mapped chrom | For mapped (ortholog) elements, the chromosome on which the mapped element is located, in target coordinates. |
mapped start | For mapped (ortholog) elements, the start position on the target chromosome on which the mapped element is located. |
mapped end | For mapped (ortholog) elements, the end position on the target chromosome on which the mapped element is located. |
mapped_peak | For mapped (ortholog) elements, the mapped peak position (narrowPeak input) or mapped element midpoint (BED input). |
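To summarize a run, the output table can be read with Python's standard library. This sketch assumes the file has no header line and that columns appear in the order listed above; the dictionary keys, the helper names, and the sample rows are invented for illustration and are not real mapGL output.

```python
import csv
import io
from collections import Counter

# Keys chosen for this sketch, following the column order in the table above.
COLUMNS = [
    "chrom", "start", "end", "name", "peak", "status",
    "mapped_chrom", "mapped_start", "mapped_end", "mapped_peak",
]


def read_predictions(handle):
    """Yield one dict per prediction row, skipping blank and comment lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        yield dict(zip(COLUMNS, row))  # short rows simply yield fewer keys


def count_status(handle):
    """Count predictions by the value in the status column."""
    return Counter(record["status"] for record in read_predictions(handle))


# Made-up rows standing in for a results file.
sample = io.StringIO(
    "chr1\t100\t200\tpeak1\t150\tortholog\tchr4\t500\t600\t550\n"
    "chr1\t900\t950\tpeak2\t925\tgain_hg19\n"
    "chr2\t10\t60\tpeak3\t35\tloss_mm9\n"
)
print(count_status(sample))
```

In practice, `sample` would be replaced by an open file handle on the path passed to `-o`.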
Copyright 2018, Adam Diehl (adadiehl@umich.edu), Boyle Lab, University of Michigan