<p>首先,请阅读<a href="https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples">How to make good reproducible pandas examples</a>的答案。今后请提供<a href="https://stackoverflow.com/help/minimal-reproducible-example">minimal reproducibl example</a></p>
<p>其次,<a href="https://stackoverflow.com/users/10637327/michael-g">Michael G</a>绝对正确,phylip是一种语法非常独特的格式</p>
<p>下面的代码将允许您从数据框架生成一个系统发育树</p>
<p>首先进行一些导入,让我们重新创建数据帧</p>
<pre><code>import pandas as pd
from Bio import Phylo
from Bio.Phylo.TreeConstruction import DistanceCalculator
from Bio.Phylo.TreeConstruction import DistanceTreeConstructor
from Bio import AlignIO
data = {'ESV' : ['esv1', 'esv2', 'esv3'],
'trunc': ['TACGTAGGTGGCGAGCGTTATCCGGAATTATTGGGCGTAAAGCGCGCGCAGGCGGTTTCTTAAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACTGGGAAACTTGAGTGCAGAAGAGGAAAGCGGAATTCCACGTGTAGCGGTGAAATGCGTAGAGATGTGGAGGAACACCAGTGGCGAAGGCGGCTTTCTGGTCTGTAACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACAGG',
'TACGGAGGGTGCAAGCGTTATCCGGATTCACTGGGTTTAAAGGGTGCGTAGGTGGGTTGGTAAGTCAGTGGTGAAATCTCCGAGCTTAACTTGGAAACTGCCATTGATACTATTAATCTTGAATATTGTGGAGGTTAGCGGAATATGTCATGTAGCGGTGAAATGCTTAGAGATGACATAGAACACCAATTGCGAAGGCAGCTGGCTACACATATATTGACACTGAGGCACGAAAGCGTGGGGATCAAACAGG',
'TACGGGGGGGGCAAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCGCGTAGGCGGCCAGACCAAGTCGAGTGTGAAATTGCAGGGCTTAACTTTGCAGGGTCGCTCGATACTGGTCGGCTAGAGTGTGGAAGAGGGTACTGGAATTCCCGGTGTAGCGGTGAAATGCGTAGATATCGGGAGGAACACCAGCGGCGAAGGCGGGTACCTGGGCCAACACTGACGCTGAGGCGCGAAAGCTAGGGGAGCAAACAG']
}
ndf = pd.DataFrame.from_dict(data)
print(ndf)
</code></pre>
<p>输出:</p>
<pre><code> ESV trunc
0 esv1 TACGTAGGTGGCGAGCGTTATCCGGAATTATTGGGCGTAAAGCGCG...
1 esv2 TACGGAGGGTGCAAGCGTTATCCGGATTCACTGGGTTTAAAGGGTG...
2 esv3 TACGGGGGGGGCAAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCG...
</code></pre>
<p>接下来,以正确的格式编写phylip文件</p>
<pre><code>with open("test.phy", 'w') as f:
f.write("{:10} {}\n".format(ndf.shape[0], ndf.trunc.str.len()[0]))
for row in ndf.iterrows():
f.write("{:10} {}\n".format(*row[1].to_list()))
</code></pre>
<p>输出<code>test.phy</code>:</p>
<pre><code> 3 253
esv1 TACGTAGGTGGCGAGCGTTATCCGGAATTATTGGGCGTAAAGCGCGCGCAGGCGGTTTCTTAAGTCTGATGTGAAAGCCCACGGCTCAACCGTGGAGGGTCATTGGAAACTGGGAAACTTGAGTGCAGAAGAGGAAAGCGGAATTCCACGTGTAGCGGTGAAATGCGTAGAGATGTGGAGGAACACCAGTGGCGAAGGCGGCTTTCTGGTCTGTAACTGACGCTGAGGCGCGAAAGCGTGGGGAGCAAACAGG
esv2 TACGGAGGGTGCAAGCGTTATCCGGATTCACTGGGTTTAAAGGGTGCGTAGGTGGGTTGGTAAGTCAGTGGTGAAATCTCCGAGCTTAACTTGGAAACTGCCATTGATACTATTAATCTTGAATATTGTGGAGGTTAGCGGAATATGTCATGTAGCGGTGAAATGCTTAGAGATGACATAGAACACCAATTGCGAAGGCAGCTGGCTACACATATATTGACACTGAGGCACGAAAGCGTGGGGATCAAACAGG
esv3 TACGGGGGGGGCAAGCGTTGTTCGGAATTACTGGGCGTAAAGGGCGCGTAGGCGGCCAGACCAAGTCGAGTGTGAAATTGCAGGGCTTAACTTTGCAGGGTCGCTCGATACTGGTCGGCTAGAGTGTGGAAGAGGGTACTGGAATTCCCGGTGTAGCGGTGAAATGCGTAGATATCGGGAGGAACACCAGCGGCGAAGGCGGGTACCTGGGCCAACACTGACGCTGAGGCGCGAAAGCTAGGGGAGCAAACAG
</code></pre>
<p>现在我们可以开始创建我们的系统发育树</p>
<pre><code># Read the sequences and align
aln = AlignIO.read('test.phy', 'phylip')
print(aln)
</code></pre>
<p>输出:</p>
<pre><code>SingleLetterAlphabet() alignment with 3 rows and 253 columns
TACGTAGGTGGCGAGCGTTATCCGGAATTATTGGGCGTAAAGCG...AGG esv1
TACGGAGGGTGCAAGCGTTATCCGGATTCACTGGGTTTAAAGGG...AGG esv2
TACGGGGGGGGCAAGCGTTGTTCGGAATTACTGGGCGTAAAGGG...CAG esv3
</code></pre>
<p>计算距离矩阵:</p>
<pre><code>calculator = DistanceCalculator('identity')
dm = calculator.get_distance(aln)
print(dm)
</code></pre>
<p>输出:</p>
<pre><code>esv1 0
esv2 0.3003952569169961 0
esv3 0.6086956521739131 0.6245059288537549 0
</code></pre>
<p>使用UPGMA算法构建系统发育树,并以ascii格式绘制树</p>
<pre><code>constructor = DistanceTreeConstructor()
tree = constructor.upgma(dm)
Phylo.draw_ascii(tree)
</code></pre>
<p>输出:</p>
<pre><code> ________________________________________________________________________ esv3
_|
| ___________________________________ esv2
|____________________________________|
|___________________________________ esv1
</code></pre>
<p>或者画一幅漂亮的树图:</p>
<pre><code>Phylo.draw(tree)
</code></pre>
<p>输出:</p>
<p><a href="https://i.stack.imgur.com/Xn4dC.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/Xn4dC.png" alt="enter image description here"/></a></p>