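<p>Since you also asked for a bash version, here is one using <code>awk</code><sup>1</sup>. It is commented and the output is "nicely" formatted, so the code is a bit long (about 20 lines without the comments).</p>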
<pre><code>awk '# First record line:
# Remember the name of the first column and register it in element[],
# so that it also gets a column slot in the output
NR == 1 {firstcol=$1;element[$1]++}
# Every line from the second one on is data
# Occurrences are counted with an indexed array
# count[x][y] contains the count of Element y for the Gene x
# (arrays of arrays require GNU awk 4.0 or later)
NR > 1 {element[$2]++;count[$1][$2]++}
# Done, time for displaying the results
END {
# Let us display the first line, column names
## Left-justify the first col, because it is text
printf "%-10s ", firstcol
## Other are counts, so we right-justify
for (i in element) if (i != firstcol) printf "%10s ", i
printf "\n"
# Now a horizontal bar
for (i in element) {
c = 0
while (c++ < 10) { printf "-"}
printf " ";
}
printf "\n"
# Now, loop through the count records
for (i in count) {
# Left justification for the column name
printf "%-10s ", i ;
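for (j in element)
# For each counted element (i.e. every one except the first column name),
# print its count right-justified, or 0 if this pair never occurred,
# so that the columns stay aligned with the header
if (j != firstcol) printf "%10s ", ((j in count[i]) ? count[i][j] : 0)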
printf "\n"
}
}' tab-separated-input.txt
</code></pre>
<p>Result:</p>
<pre><code>Gene            G-box        MYC
---------- ---------- ----------
STBZIP10            4          3
STBZIP1             2          3
</code></pre>
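<p>The script above relies on arrays of arrays (<code>count[x][y]</code>), which are a GNU awk extension. If only a POSIX <code>awk</code> is available, the same counting can be sketched with the classic <code>count[x, y]</code> subscript form. This is a minimal variant, not the script above: the sample rows, the <code>Element</code> header and the names <code>gene_order</code>, <code>elem_order</code>, <code>seen_gene</code> and <code>seen_elem</code> are assumptions made up for illustration, and the horizontal bar is left out for brevity.</p>

```shell
# Hypothetical tab-separated sample input; the real input file is not shown in the answer.
input=$(mktemp)
printf 'Gene\tElement\n' > "$input"
printf 'STBZIP1\tG-box\nSTBZIP1\tMYC\nSTBZIP1\tG-box\n' >> "$input"
printf 'STBZIP10\tMYC\n' >> "$input"

result=$(awk -F '\t' '
# Header line: keep the name of the first column
NR == 1 { firstcol = $1; next }
# Data lines: remember genes and elements in first-seen order,
# and count each (gene, element) pair with a POSIX-style subscript
{
    if (!($1 in seen_gene)) { seen_gene[$1] = 1; gene_order[++ng] = $1 }
    if (!($2 in seen_elem)) { seen_elem[$2] = 1; elem_order[++ne] = $2 }
    count[$1, $2]++
}
END {
    # Header row: left-justified label, right-justified element names
    printf "%-10s", firstcol
    for (e = 1; e <= ne; e++) printf " %10s", elem_order[e]
    printf "\n"
    # One row per gene; a pair that never occurred prints as 0
    for (g = 1; g <= ng; g++) {
        printf "%-10s", gene_order[g]
        for (e = 1; e <= ne; e++)
            printf " %10d", count[gene_order[g], elem_order[e]]
        printf "\n"
    }
}' "$input")
rm -f "$input"
printf '%s\n' "$result"
```

<p>Keeping the first-seen order in <code>gene_order</code> and <code>elem_order</code> also makes the row and column order predictable, whereas <code>for (i in array)</code> visits keys in an unspecified order. With the sample rows above, STBZIP1 gets a count of 2 for G-box and STBZIP10 gets 0.</p>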
<hr/>
<p><sup>1</sup><em>Because of <a href="https://stackoverflow.com/users/1745001/ed-morton">Ed Morton</a></em></p>