Hemiplasy Inference Simulation Tool. A tool for characterizing hemiplasy given traits mapped onto a species tree.

Detailed description of the heist-hemiplas Python project


HeiST

Hemiplasy Inference Simulation Tool

Authors:

Matt Gibson (gibsomat@indiana.edu)
Mark Hibbins (mhibbins@indiana.edu)

Dependencies:

  • ms
  • seq-gen
  • biopython
  • numpy
  • matplotlib
  • ete3

Installation

git clone https://github.com/mhibbins/heist
cd heist
python setup.py install

Usage


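Input file

The input file is in a modified NEXUS format. A minimal example includes a tree (in Newick format) and at least two taxa marked as carrying the derived character with the set derived command. If an outgroup is specified with set outgroup, the tree is pruned to include only the taxa relevant to the simulation (i.e., the subclade containing the derived taxa) plus the outgroup.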

#NEXUS
begin trees;
	tree tree_1 = (spA:0.002,(spB:0.001,((spC:0.0004,spD:0.0008)10.0:0.0005,(spE:0.0006,spF:0.0004)8.0:0.0004)15:0.0009)90.0:0.005);
end;

begin hemiplasytool;
set derived taxon=spB
set derived taxon=spD
set derived taxon=spF
end;
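
The layout above can also be generated programmatically. The following is a minimal sketch, assuming only the syntax shown in the example (a trees block followed by a hemiplasytool block with one set derived line per taxon). The helper name build_input is hypothetical and is not part of HeiST.

```python
def build_input(newick, derived_taxa, tree_name="tree_1"):
    """Return the text of a minimal input file in the layout shown above.

    build_input is a hypothetical helper, not a HeiST function.
    """
    newick = newick.strip()
    if not newick.endswith(";"):
        raise ValueError("the Newick string must end with ';'")
    if len(derived_taxa) < 2:
        raise ValueError("at least two derived taxa are required")

    lines = [
        "#NEXUS",
        "begin trees;",
        f"\ttree {tree_name} = {newick}",
        "end;",
        "",
        "begin hemiplasytool;",
    ]
    # One 'set derived' line per taxon carrying the derived state.
    lines.extend(f"set derived taxon={taxon}" for taxon in derived_taxa)
    lines.append("end;")
    return "\n".join(lines) + "\n"


newick = (
    "(spA:0.002,(spB:0.001,((spC:0.0004,spD:0.0008)10.0:0.0005,"
    "(spE:0.0006,spF:0.0004)8.0:0.0004)15:0.0009)90.0:0.005);"
)
text = build_input(newick, ["spB", "spD", "spF"])
print(text)
```

Writing the returned string to a file gives an input equivalent to the example above.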

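The species tree

The species tree is given in Newick format. Branch lengths must be in average substitutions per site, and branches must be labeled with concordance factors. IQ-TREE can be used to produce these labels.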

If your tree is already ultrametric and in coalescent units, you can supply this directly if you add the flag set type coal to the input file.
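
Before running a simulation it can help to confirm that every internal branch of the species tree carries a concordance-factor label. The sketch below uses a simple regular expression rather than a full Newick parser and assumes labels are written directly after each closing parenthesis, as in the example tree. The function name internal_labels is hypothetical.

```python
import re


def internal_labels(newick):
    """Return the label after each closing parenthesis, or None if absent."""
    labels = []
    for match in re.finditer(r"\)([^():,;]*)", newick):
        label = match.group(1).strip()
        labels.append(label if label else None)
    return labels


tree = (
    "(spA:0.002,(spB:0.001,((spC:0.0004,spD:0.0008)10.0:0.0005,"
    "(spE:0.0006,spF:0.0004)8.0:0.0004)15:0.0009)90.0:0.005);"
)
labels = internal_labels(tree)

# The final closing parenthesis is the root, which carries no label,
# so it is excluded from the check.
missing = [i for i, label in enumerate(labels[:-1]) if label is None]
print("labels:", labels)
print("unlabeled internal branches:", len(missing))
```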

Traits

Specify each taxon that carries the derived state with:

set derived taxon="species in tree"
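
The sample output later in this README reports the minimum number of mutations needed to explain the trait pattern under homoplasy alone, computed by Fitch parsimony. The sketch below illustrates that count for the example tree; it is not HeiST's own code. The tree is hard-coded as nested tuples (an assumption made to avoid a Newick parser), and the function assumes a strictly bifurcating tree.

```python
def fitch(node, derived):
    """Return (possible_states, mutation_count) for a nested-tuple tree.

    Leaves are taxon names; internal nodes are (left, right) tuples.
    State 1 is derived, state 0 is ancestral.
    """
    if isinstance(node, str):
        return ({1} if node in derived else {0}), 0

    left, right = node
    left_states, left_cost = fitch(left, derived)
    right_states, right_cost = fitch(right, derived)

    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra mutation.
        return shared, left_cost + right_cost
    # The children disagree: take the union and count one mutation.
    return left_states | right_states, left_cost + right_cost + 1


# Topology of the example species tree, without branch lengths.
tree = ("spA", ("spB", (("spC", "spD"), ("spE", "spF"))))
states, n_mutations = fitch(tree, {"spB", "spD", "spF"})
print(n_mutations)
```

For the derived taxa spB, spD and spF this gives three mutations, in agreement with the value in the sample output.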

Introgression

Introgression events can be defined with:

set introgression source="species in tree" dest="species in tree" prob=[float_value] timing=[float_value]

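Note that the timing must be specified in coalescent units. We therefore recommend first converting the input tree to coalescent units (the subs2coal program described under Submodules performs this conversion).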
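
As an illustration, an introgression line can be assembled and sanity-checked before it is added to the input file. This is a sketch under stated assumptions: the helper introgression_line is hypothetical, values are written unquoted as in the set derived example, and the range checks reflect only the constraints stated here (a probability and a time in coalescent units).

```python
def introgression_line(source, dest, prob, timing):
    """Format a 'set introgression' line after basic validation.

    Hypothetical helper; HeiST itself may apply different checks.
    """
    if source == dest:
        raise ValueError("source and dest must be different taxa")
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie between 0 and 1")
    if timing < 0:
        raise ValueError("timing (coalescent units) must be non-negative")
    return (
        f"set introgression source={source} dest={dest} "
        f"prob={prob} timing={timing}"
    )


line = introgression_line("spD", "spF", 0.05, 0.3)
print(line)
```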

Example:

python -m hemiplasytool -n 100000 -x 5 -p ~/msdir/ms -g ~/Seq-Gen-1.3.4/seq-gen -o test_w_introgression -v test/input_test_small_intro_v2.txt

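Output:

Three output files are generated: the main output, test_w_introgression.txt; a gene tree file, test_w_introgression.trees, containing all observed topologies; and a plot of the mutation distribution, test_w_introgression.dist.png.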

### INPUT SUMMARY ###

Integer Code	Taxon Name
1:	sp1
2:	sp2
3:	sp3
4:	sp4
5:	sp5
6:	sp6

The species tree (smoothed, in coalescent units) is:
 (1:2.78984,(2:2.09238,((3:0.69746,4:0.69746)1:0.69746,(5:0.69746,6:0.69746)1:0.69746)1:0.69746)1:0.69746);

  _________________________________ 1
 |
_|        _________________________ 2*
 |       |
 |_______|                 ________ 3
         |         _______|
         |        |       |________ 4*
         |________|
                  |        ________ 5
                  |_______|
                          |________ 6*

3 taxa have the derived state: 2, 4, 6

With homoplasy only, 3 mutations are required to explain this trait pattern (Fitch parsimony)

Introgression from taxon 4 into taxon 6 occurs at time 0.3 with probability 0.05

5.00e+05 simulations performed

### RESULTS ###

70 loci matched the species character states

"True" hemiplasy (1 mutation) occurs 14 time(s)

Combinations of hemiplasy and homoplasy (1 < # mutations < 3) occur 30 time(s)

"True" homoplasy (>= 3 mutations) occurs 26 time(s)

70 loci have a discordant gene tree
0 loci are concordant with the species tree

4 loci originate from an introgressed history
66 loci originate from the species history

Distribution of mutation counts:

# Mutations	# Trees
On all trees:
1		14
2		30
3		25
4		1

On concordant trees:
# Mutations	# Trees

On discordant trees:
# Mutations	# Trees
1		14
2		30
3		25
4		1

Origins of mutations leading to observed character states for hemiplasy + homoplasy cases:

	Tip mutation	Internal branch mutation	Tip reversal
Taxa 2	3	27	0
Taxa 4	3	27	0
Taxa 6	0	30	0

### OBSERVED GENE TREES ###

                 _________________ 4
  ______________|
 |              | ________________ 3
 |              ||
 |               |         _______ 5
_|               |________|
 |                        |_______ 6*
 |
 |    _____________________________ 1
 |___|
     |_____________________________ 2

This topology occured 1 time(s)
                       ____________ 3
  ____________________|
 |                    |____________ 5
_|
 |         ________________________ 1
 |________|
          |     __________________ 2
          |____|
               |        __________ 4
               |_______|
                       |__________ 6*

This topology occured 4 time(s)
         _________________________ 2
  ______|
 |      |       __________________ 5
 |      |______|
 |             |   _______________ 4
_|             |__|
 |                |_______________ 6*
 |
 |  _______________________________ 1
 |_|
   |_______________________________ 3

...

For the full output, see test_w_introgression.txt.txt.
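
The headline counts in the RESULTS section can be turned into proportions with a few lines of Python. The sketch below parses the sentences exactly as they appear in the sample output above. The regular expressions are assumptions based on that sample and may need adjusting if the wording differs in other versions.

```python
import re

# Lines copied from the RESULTS section of the sample output.
SAMPLE = '''70 loci matched the species character states
"True" hemiplasy (1 mutation) occurs 14 time(s)
Combinations of hemiplasy and homoplasy (1 < # mutations < 3) occur 30 time(s)
"True" homoplasy (>= 3 mutations) occurs 26 time(s)
'''

PATTERNS = {
    "hemiplasy": r'"True" hemiplasy .*occurs (\d+) time',
    "mixed": r"Combinations of hemiplasy and homoplasy .*occur (\d+) time",
    "homoplasy": r'"True" homoplasy .*occurs (\d+) time',
}


def summarize(text):
    """Return (matched_loci, counts, fractions) parsed from a summary."""
    matched = int(re.search(r"(\d+) loci matched", text).group(1))
    counts = {
        name: int(re.search(pattern, text).group(1))
        for name, pattern in PATTERNS.items()
    }
    fractions = {name: count / matched for name, count in counts.items()}
    return matched, counts, fractions


matched, counts, fractions = summarize(SAMPLE)
for name, value in fractions.items():
    print(f"{name}: {counts[name]} of {matched} loci ({value:.3f})")
```

In the sample run, 14 of the 70 matching loci (0.2) are explained by a single mutation.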

Mutation distribution

Submodules

Once installed, additional programs are available on the command line: newick2ms, subs2coal, and heistmerge.

newick2ms

usage: newick2ms [-h] input

Tool for converting a newick string to ms-style splits. Note that this only
makes sense if the input tree is in coalescent units.

positional arguments:
  input       Input newick string file

optional arguments:
  -h, --help  show this help message and exit

subs2coal

usage: subs2coal [-h] input

Tool for converting a newick string with branch lengths in subs/site to a
neewick string with branch lengths in coalescent units. Input requires gene or
site-concordancee factors as branch labels

positional arguments:
  input       Input newick string file

optional arguments:
  -h, --help  show this help message and exit
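
To give a feel for how a concordance factor relates to a branch length in coalescent units, the sketch below uses a standard multispecies-coalescent relation for an internal branch: the probability of a concordant gene tree is 1 - (2/3)·exp(-T), so T = -ln(1.5·(1 - CF)). This illustrates the idea only and is not necessarily the procedure subs2coal implements. It also assumes the labels are percentages, as the example tree suggests, and the function name is hypothetical.

```python
import math


def cf_to_coalescent_units(cf_percent):
    """Convert a concordance factor (in percent) to coalescent units.

    Uses T = -ln(1.5 * (1 - CF)). Values at or below one third carry
    no signal under this relation and are mapped to a length of zero.
    """
    cf = cf_percent / 100.0
    if not 0.0 <= cf <= 1.0:
        raise ValueError("concordance factor must be between 0 and 100")
    if cf <= 1.0 / 3.0:
        return 0.0
    if cf == 1.0:
        return math.inf
    return -math.log(1.5 * (1.0 - cf))


# Concordance labels taken from the example species tree.
for label in (90.0, 15.0, 10.0, 8.0):
    print(label, round(cf_to_coalescent_units(label), 4))
```

Under this relation the three labels below one third (15, 10 and 8) all map to zero, so they cannot distinguish short branches from one another. This is one reason a real conversion may combine concordance factors with the substitution-based branch lengths.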

heistMerge

usage: heistmerge [-h] [-d] [inputs [inputs ...]]

Merge output files from multiple HeiST runs. Useful for simulating large trees
by running multiple batch jobs.

positional arguments:
  inputs      Prefixes of output files to merge or a directory (supply -d flag
              as well)

optional arguments:
  -h, --help  show this help message and exit
  -d          Merge all files in a directory

heistMerge writes the merged output summary to standard output and creates a new file, merged_trees.trees, containing all observed focal gene trees.
