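<p>You don't need a master file at all. I'd just generate the final table on the fly. Assuming the input file names are passed to the Python script as command-line arguments:</p>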
<pre><code>import sys
from collections import defaultdict

data = defaultdict(dict)  # { taxon: { filename: count } }
for filename in sys.argv[1:]:
    with open(filename) as infile:
        for line in infile:
            # each line is "count,taxon"; split once so the taxon name stays whole
            count, taxon = line.rstrip().split(',', 1)
            data[taxon][filename] = count
</code></pre>
<p>Now that you have <code>data</code>, that is everything the output file needs. You can then print it like this:</p>
<pre><code>taxa = sorted(data)  # fix the column order
print("Sample,{}".format(','.join(taxa)))
for filename in sys.argv[1:]:
    # one row per input file; a taxon missing from that file counts as "0"
    counts = [data[taxon].get(filename, "0") for taxon in taxa]
    print(','.join([filename] + counts))
</code></pre>
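<p>To try the idea without creating any files, here is a minimal self-contained sketch of the same two steps. The function names <code>build_counts</code> and <code>render_table</code>, and the sample names <code>a.csv</code> and <code>b.csv</code>, are made up for illustration. It assumes every non-blank line has the form <code>count,taxon</code>:</p>

```python
import io
from collections import defaultdict


def build_counts(sources):
    """Collect (name, lines) pairs into {taxon: {name: count}}."""
    data = defaultdict(dict)
    for name, lines in sources:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            count, taxon = line.split(',', 1)
            data[taxon][name] = count
    return data


def render_table(data, names):
    """Return the merged table as CSV text, one row per sample."""
    taxa = sorted(data)
    rows = ["Sample," + ",".join(taxa)]
    for name in names:
        # a taxon that never appeared in this sample is reported as "0"
        counts = [data[taxon].get(name, "0") for taxon in taxa]
        rows.append(",".join([name] + counts))
    return "\n".join(rows)


# Hypothetical in-memory data standing in for two input files.
samples = [
    ("a.csv", io.StringIO("3,Bacteria\n5,Archaea\n")),
    ("b.csv", io.StringIO("7,Bacteria\n")),
]
table = render_table(build_counts(samples), [name for name, _ in samples])
print(table)
```

<p>Because <code>b.csv</code> has no <code>Archaea</code> line, that cell comes out as <code>0</code>. If taxon names may themselves contain commas or quotes, writing the rows with the standard <code>csv</code> module would be the safer choice.</p>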