<p>My Python is a bit rusty, so please bear with me. I hope I have inferred the required output correctly; if not, please leave a comment.</p>
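<p>This assumes that the samples from the sequencing experiment are always separated by 3 lines of arbitrary content, and that each sample spans exactly 22 lines.</p>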
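<p>To make that layout assumption concrete, here is a small sketch of the line arithmetic it implies. It additionally assumes, as the code in this answer does, that the file starts directly with a sample, that the sequence ID is the first line of each sample, and that the prediction is the last. The names <code>SAMPLE_LINES</code>, <code>SEPARATOR_LINES</code> and <code>sample_line_indexes</code> are made up for illustration.</p>

```python
SAMPLE_LINES = 22     # lines per sample (assumption stated above)
SEPARATOR_LINES = 3   # lines of arbitrary content between samples
BLOCK = SAMPLE_LINES + SEPARATOR_LINES  # distance from one sample start to the next


def sample_line_indexes(sample_index):
    """Return 0-based (sequence ID line, prediction line) indexes for a sample.

    Hypothetical helper: assumes the ID is the first line of a sample
    and the prediction is the last one.
    """
    start = sample_index * BLOCK
    return start, start + SAMPLE_LINES - 1


for i in range(3):
    print(i, sample_line_indexes(i))
```

<p>If your file has a header before the first sample, or a different number of separator lines, these constants would need adjusting.</p>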
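<pre><code>import re

def extract_data(filename):
    numLinesToSkip = 3         # lines of arbitrary content between samples
    offset = 22                # lines per sample
    seqIdLineNumber = 0        # index of the sequence ID line within a sample
    predictionLineNumber = 21  # index of the prediction line within a sample
    with open(filename, "r") as f:
        output = []
        while True:
            # Read one full sample; stop when the file runs out.
            try:
                head = [next(f) for _ in range(offset)]
            except StopIteration:
                break
            # Collapse runs of whitespace in the prediction line.
            line21 = re.split(r'\s+', head[predictionLineNumber].strip())
            sample = head[seqIdLineNumber].rstrip() + "\t" + " ".join(line21)
            output.append(sample)
            # Skip the separator lines before the next sample.
            try:
                for _ in range(numLinesToSkip):
                    next(f)
            except StopIteration:
                break
    print("\n".join(output))

if __name__ == "__main__":
    extract_data("test.txt")
</code></pre>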
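<p>If you would rather get the rows back instead of printing them, the same idea can be sketched as a generator built on <code>itertools.islice</code>. This is only one possible variant under the same layout assumptions, not a drop-in replacement. The names <code>iter_samples</code> and <code>make_sample</code>, and the synthetic input, are invented here so that the example runs on its own.</p>

```python
import io
from itertools import islice


def iter_samples(lines, sample_lines=22, separator_lines=3):
    """Yield 'sequence ID<TAB>prediction' for each complete sample in lines."""
    lines = iter(lines)
    while True:
        sample = list(islice(lines, sample_lines))
        if len(sample) < sample_lines:
            return  # end of input, or a truncated trailing sample
        seq_id = sample[0].rstrip()
        prediction = " ".join(sample[-1].split())  # collapse whitespace runs
        yield seq_id + "\t" + prediction
        # Consume the separator lines; islice stops quietly at end of input.
        for _ in islice(lines, separator_lines):
            pass


def make_sample(name):
    """Build one fake 22-line sample (made-up data, for the demo only)."""
    return [name + "\n"] + ["filler\n"] * 20 + ["pred   A  B\n"]


# Synthetic input: two samples separated by 3 lines of arbitrary content.
demo = io.StringIO("".join(make_sample(">seq1") + ["x\n"] * 3 + make_sample(">seq2")))
rows = list(iter_samples(demo))
print("\n".join(rows))
```

<p>For a real file, you could pass the open file object instead of the <code>StringIO</code> demo, for example <code>with open("test.txt") as f: rows = list(iter_samples(f))</code>.</p>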