Hemiplasy Inference Simulation Tool. For characterizing the polarization of traits mapped onto a species tree.
HeiST
Hemiplasy Inference Simulation Tool
Authors:
Matt Gibson (gibsomat@indiana.edu)
Mark Hibbins (mhibbins@indiana.edu)
Dependencies:
Installation
git clone https://github.com/mhibbins/heist
cd heist
python setup.py install
Usage
Input file
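The input file uses a modified NEXUS format. A minimal example includes a tree (in Newick format) and at least two taxa marked as derived with the set derived command. If an outgroup is specified with set outgroup, the tree is pruned to contain only the taxa relevant to the simulation (that is, the subclade containing the derived taxa) plus the outgroup.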
#NEXUS
begin trees;
tree tree_1 = (spA:0.002,(spB:0.001,((spC:0.0004,spD:0.0008)10.0:0.0005,(spE:0.0006,spF:0.0004)8.0:0.0004)15:0.0009)90.0:0.005);
end;
begin hemiplasytool;
set derived taxon=spB
set derived taxon=spD
set derived taxon=spF
end;
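To check an input file before starting a long simulation, the tree and the derived taxa can be pulled out with a few regular expressions. The sketch below is only an illustration: read_heist_input is a hypothetical helper, not part of HeiST, and it assumes one tree statement and one set derived command per line, as in the example above.

```python
import re

def read_heist_input(text):
    """Return (newick_string, derived_taxa) from a HeiST-style input file.

    Hypothetical helper for illustration; HeiST's own parser may differ.
    """
    # "tree <name> = <newick>;" on a single line inside the trees block
    tree_match = re.search(r"^\s*tree\s+\S+\s*=\s*(.+;)\s*$", text, re.MULTILINE)
    newick = tree_match.group(1) if tree_match else None
    # One "set derived taxon=<name>" command per derived taxon
    derived = re.findall(r"^\s*set\s+derived\s+taxon\s*=\s*(\S+)", text, re.MULTILINE)
    return newick, derived

example = """#NEXUS
begin trees;
tree tree_1 = (spA:0.002,(spB:0.001,((spC:0.0004,spD:0.0008)10.0:0.0005,(spE:0.0006,spF:0.0004)8.0:0.0004)15:0.0009)90.0:0.005);
end;
begin hemiplasytool;
set derived taxon=spB
set derived taxon=spD
set derived taxon=spF
end;
"""

newick, derived = read_heist_input(example)
print(derived)
if len(derived) < 2:
    raise SystemExit("at least two derived taxa are required")
```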
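Species tree
The species tree in Newick format. Branch lengths must be in average substitutions per site, and branches must be labeled with concordance factors. IQTree can be used to produce such a tree.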
If your tree is already ultrametric and in coalescent units, you can supply this directly if you add the flag
set type coal
to the input file.
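Before adding set type coal, it may be worth confirming that the tree really is ultrametric. The following is a minimal sketch using a small hand-written Newick reader; tip_depths and is_ultrametric are hypothetical names, and the tolerance is an arbitrary choice rather than anything HeiST prescribes.

```python
import re

SEPARATORS = {"(", ")", ",", ";"}

def tip_depths(newick):
    """Return {tip_name: root-to-tip distance} for a Newick string."""
    tokens = re.findall(r"[(),;]|[^(),;]+", newick.strip())
    pos = 0

    def split_label(text):
        # "name:length" or "support:length"; a missing length counts as 0
        name, _, length = text.strip().partition(":")
        return name, float(length) if length else 0.0

    def parse_node():
        nonlocal pos
        if tokens[pos] != "(":
            name, length = split_label(tokens[pos])
            pos += 1
            return {name: length}
        pos += 1  # consume "("
        children = [parse_node()]
        while tokens[pos] == ",":
            pos += 1
            children.append(parse_node())
        pos += 1  # consume ")"
        length = 0.0
        if pos < len(tokens) and tokens[pos] not in SEPARATORS:
            _, length = split_label(tokens[pos])
            pos += 1
        # Add this clade's own branch length to every tip below it
        return {tip: d + length for child in children for tip, d in child.items()}

    return parse_node()

def is_ultrametric(newick, tol=1e-6):
    depths = tip_depths(newick).values()
    return max(depths) - min(depths) <= tol

smoothed = ("(1:2.78984,(2:2.09238,((3:0.69746,4:0.69746)1:0.69746,"
            "(5:0.69746,6:0.69746)1:0.69746)1:0.69746)1:0.69746);")
print(is_ultrametric(smoothed))
```

The tree used here is the smoothed species tree from the example output further down; a tree in substitutions per site, like the one in the input example, will generally fail this check.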
Traits
Mark each taxon that carries the derived state with
set derived taxon="species in tree"
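The set of derived taxa fixes the minimum number of mutations needed if only homoplasy is at work, which the output summary reports using Fitch parsimony. As a rough illustration of that count (not HeiST's own code), the sketch below runs the Fitch bottom-up pass on the six-taxon topology from the example output, with the tree written by hand as nested tuples and an assumed binary 0/1 trait.

```python
def fitch_min_changes(tree, derived):
    """Minimum number of state changes for a binary trait on a binary tree.

    tree    -- nested 2-tuples whose leaves are taxon labels
    derived -- set of taxon labels carrying the derived state (1)
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if not isinstance(node, tuple):
            return {1} if node in derived else {0}
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:
            return common
        # Disjoint child sets: one change is needed on this part of the tree
        changes += 1
        return left | right

    state_set(tree)
    return changes

# Topology of the example species tree: (1,(2,((3,4),(5,6))))
species_tree = (1, (2, ((3, 4), (5, 6))))
print(fitch_min_changes(species_tree, derived={2, 4, 6}))  # 3
```

The result of 3 agrees with the "3 mutations are required" line in the example output for derived taxa 2, 4, and 6.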
Introgression
Introgression events can be specified with
set introgression source="species in tree" dest="species in tree" prob=[float_value] timing=[float_value]
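Note that the timing must be specified in coalescent units. We therefore recommend first converting the tree to coalescent units (see subs2coal under Submodules).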
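When several introgression scenarios are generated by script, a small formatter helps catch out-of-range values early. This is a sketch only: introgression_line is a hypothetical helper, and it assumes taxon names are written without quotes, by analogy with the set derived lines in the input example.

```python
def introgression_line(source, dest, prob, timing):
    """Format one 'set introgression' command for a HeiST input file.

    Hypothetical helper. prob is a probability in [0, 1]; timing is in
    coalescent units and must not be negative.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"prob must be between 0 and 1, got {prob}")
    if timing < 0:
        raise ValueError(f"timing must be non-negative, got {timing}")
    if source == dest:
        raise ValueError("source and dest must be different taxa")
    return f"set introgression source={source} dest={dest} prob={prob} timing={timing}"

line = introgression_line("spD", "spF", prob=0.05, timing=0.3)
print(line)
```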
Example:
python -m hemiplasytool -n 100000 -x 5 -p ~/msdir/ms -g ~/Seq-Gen-1.3.4/seq-gen -o test_w_introgression -v test/input_test_small_intro_v2.txt
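Output:
Three output files are produced: the main output test_w_introgression.txt, a gene tree file test_w_introgression.trees containing all observed topologies, and a plot of the mutation distribution test_w_introgression.dist.png.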
### INPUT SUMMARY ###
Integer Code Taxon Name
1: sp1
2: sp2
3: sp3
4: sp4
5: sp5
6: sp6
The species tree (smoothed, in coalescent units) is:
(1:2.78984,(2:2.09238,((3:0.69746,4:0.69746)1:0.69746,(5:0.69746,6:0.69746)1:0.69746)1:0.69746)1:0.69746);
_________________________________ 1
|
_| _________________________ 2*
| |
|_______| ________ 3
| _______|
| | |________ 4*
|________|
| ________ 5
|_______|
|________ 6*
3 taxa have the derived state: 2, 4, 6
With homoplasy only, 3 mutations are required to explain this trait pattern (Fitch parsimony)
Introgression from taxon 4 into taxon 6 occurs at time 0.3 with probability 0.05
5.00e+05 simulations performed
### RESULTS ###
70 loci matched the species character states
"True" hemiplasy (1 mutation) occurs 14 time(s)
Combinations of hemiplasy and homoplasy (1 < # mutations < 3) occur 30 time(s)
"True" homoplasy (>= 3 mutations) occurs 26 time(s)
70 loci have a discordant gene tree
0 loci are concordant with the species tree
4 loci originate from an introgressed history
66 loci originate from the species history
Distribution of mutation counts:
# Mutations # Trees
On all trees:
1 14
2 30
3 25
4 1
On concordant trees:
# Mutations # Trees
On discordant trees:
# Mutations # Trees
1 14
2 30
3 25
4 1
Origins of mutations leading to observed character states for hemiplasy + homoplasy cases:
Tip mutation Internal branch mutation Tip reversal
Taxa 2 3 27 0
Taxa 4 3 27 0
Taxa 6 0 30 0
### OBSERVED GENE TREES ###
_________________ 4
______________|
| | ________________ 3
| ||
| | _______ 5
_| |________|
| |_______ 6*
|
| _____________________________ 1
|___|
|_____________________________ 2
This topology occured 1 time(s)
____________ 3
____________________|
| |____________ 5
_|
| ________________________ 1
|________|
| __________________ 2
|____|
| __________ 4
|_______|
|__________ 6*
This topology occured 4 time(s)
_________________________ 2
______|
| | __________________ 5
| |______|
| | _______________ 4
_| |__|
| |_______________ 6*
|
| _______________________________ 1
|_|
|_______________________________ 3
...
For the full output, see test_w_introgression.txt.
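The counts in the RESULTS block are easier to compare across runs as proportions of the matching loci. The sketch below extracts them with regular expressions; summarize_results is a hypothetical helper, and the patterns assume the report wording shown in the example output above.

```python
import re

def summarize_results(report):
    """Return the fraction of matching loci explained by each category."""
    def count(pattern):
        match = re.search(pattern, report)
        if match is None:
            raise ValueError(f"line not found: {pattern}")
        return int(match.group(1))

    matched = count(r"(\d+) loci matched the species character states")
    counts = {
        "hemiplasy": count(r'"True" hemiplasy \(1 mutation\) occurs (\d+) time'),
        "mixed": count(r"Combinations of hemiplasy and homoplasy .* occur (\d+) time"),
        "homoplasy": count(r'"True" homoplasy .* occurs (\d+) time'),
    }
    if matched == 0:
        raise ValueError("no loci matched the species character states")
    return {name: value / matched for name, value in counts.items()}

report = """### RESULTS ###
70 loci matched the species character states
"True" hemiplasy (1 mutation) occurs 14 time(s)
Combinations of hemiplasy and homoplasy (1 < # mutations < 3) occur 30 time(s)
"True" homoplasy (>= 3 mutations) occurs 26 time(s)
"""

fractions = summarize_results(report)
for name, value in fractions.items():
    print(f"{name}: {value:.3f}")
```

For the example run, 14 of the 70 matching loci (20%) are explained by a single mutation.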
Submodules
Once installed, the following additional programs are available on the command line.
newick2ms
usage: newick2ms [-h] input
Tool for converting a newick string to ms-style splits. Note that this only
makes sense if the input tree is in coalescent units.
positional arguments:
input Input newick string file
optional arguments:
-h, --help show this help message and exit
subs2coal
usage: subs2coal [-h] input
Tool for converting a newick string with branch lengths in subs/site to a
newick string with branch lengths in coalescent units. Input requires gene or
site-concordance factors as branch labels
positional arguments:
input Input newick string file
optional arguments:
-h, --help show this help message and exit
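To give a feel for how a concordance factor relates to a branch length in coalescent units, the sketch below applies the standard multispecies-coalescent relationship for an internal branch, CF = 1 - (2/3) * exp(-T), solved for T. Whether subs2coal uses exactly this expression, and how it treats tip branches, is an assumption here; the function name and the percent scale of the labels are likewise illustrative.

```python
import math

def cf_to_coalescent_units(cf_percent):
    """Internal branch length T (coalescent units) implied by a concordance factor.

    Assumes CF = 1 - (2/3) * exp(-T), so T = -ln(1.5 * (1 - CF)).
    cf_percent is the branch label on a 0-100 scale.
    """
    cf = cf_percent / 100.0
    if not 0.0 <= cf <= 1.0:
        raise ValueError(f"concordance factor out of range: {cf_percent}")
    if cf == 1.0:
        return math.inf  # complete concordance implies an unboundedly long branch
    # Factors at or below 1/3 imply no resolvable branch length; clamp to zero
    return max(0.0, -math.log(1.5 * (1.0 - cf)))

for label in (90.0, 15.0, 10.0):
    print(label, round(cf_to_coalescent_units(label), 3))
```

With the labels from the input example, a factor of 90 gives a branch of about 1.897 coalescent units, while factors of 15 and 10 fall below one third and are clamped to zero in this sketch.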
heistmerge
usage: heistmerge [-h] [-d] [inputs [inputs ...]]
Merge output files from multiple HeiST runs. Useful for simulating large trees
by running multiple batch jobs.
positional arguments:
inputs Prefixes of output files to merge or a directory (supply -d flag
as well)
optional arguments:
-h, --help show this help message and exit
-d Merge all files in a directory
heistMerge writes the merged output summary to standard output and creates a new file, merged_trees.trees, containing all observed focal gene trees.
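For batch jobs it can be convenient to assemble the merge call in a script. The sketch below only builds the argument list from the usage text above (prefixes, or a directory together with -d); heistmerge_command and the example prefixes are hypothetical, and actually running the command is left commented out.

```python
import subprocess

def heistmerge_command(prefixes=(), directory=None):
    """Build the argument list for a heistmerge call.

    Pass either output-file prefixes or a directory (which adds -d).
    """
    if directory is not None:
        if prefixes:
            raise ValueError("give either prefixes or a directory, not both")
        return ["heistmerge", "-d", str(directory)]
    if not prefixes:
        raise ValueError("no prefixes given")
    return ["heistmerge", *map(str, prefixes)]

cmd = heistmerge_command(prefixes=["batch_1", "batch_2"])
print(" ".join(cmd))

# To run it and keep the merged summary, which is written to standard output:
# with open("merged_summary.txt", "w") as handle:
#     subprocess.run(cmd, stdout=handle, check=True)
```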