
Posted 2024-09-30 16:42:31


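I have a large BLAST file in tabular format. The number of target sequences was not limited, so parsing it takes a long time. I would like to reduce the hits for each query sequence to the first 10. My Python is basic, but this is what I have so far: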

import sys

blastfile = open(sys.argv[1],"r")

column1list = []

for line in blastfile:
    b = line.split()[0]
    column1list.append(b)

uniqcolumn1 = list(set(column1list))

counter = 0

for val in uniqcolumn1:
    #print val
    for line in blastfile:
        #print line
        while counter <= 10:
            if line.startswith(val):
                print line
                counter =+ 1



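I think the first part of the code works, up to 'uniqcolumn1 = list(set(column1list))', but after that I cannot get it to print the first ten lines that start with each string in the list.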



import sys
NUM_TO_PRINT = 10     # good practice - use names rather than raw numbers

blastfile = open(sys.argv[1], "r")

# An empty dictionary. This will map titles to counts of how many times
# a line with that title has been printed.
titles = {}

for line in blastfile:
    title = line.split()[0]    # assuming the title is whitespace-delimited, and that the line is not empty
    num_printed = titles.get(title, 0)     # 0 is the default
    if num_printed < NUM_TO_PRINT:
        print line,    # comma because _line_ already has a newline - without the comma, you get a blank line after every printed line
        num_printed += 1
        titles[title] = num_printed     # save where we are
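
The same single-pass counting idea can be wrapped in a small generator so it is easy to try on a few lines before running it on the whole file. This is a minimal sketch in Python 3 syntax (the snippets in this thread are Python 2); the names first_hits and limit are made up for illustration, and the query ID is assumed to be the first whitespace-separated column.

```python
from collections import defaultdict


def first_hits(lines, limit=10):
    """Yield at most `limit` lines per query ID, keeping the input order.

    Assumption: the query ID is the first whitespace-separated column.
    """
    seen = defaultdict(int)  # query ID -> number of lines yielded so far
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines instead of raising IndexError
            continue
        query = fields[0]
        if seen[query] < limit:
            seen[query] += 1
            yield line


# Small demonstration with in-memory lines instead of a real BLAST file.
demo = ["q1\thitA\n", "q1\thitB\n", "q2\thitC\n", "q1\thitD\n"]
for kept in first_hits(demo, limit=2):
    print(kept, end="")
```

Because the generator accepts any iterable of lines, an open file object can be passed in directly, and the file is read only once no matter how many distinct query IDs it contains.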




Suppose file_to_read.txt contains the following text:

Hello! My name is Bob and I can't think of anything to
put in this file so I'm blabbering on about nonsense
in hopes that you won't realise that this text is not
important but the code in the actual file, though I
think that you wouldn't mind reading this long file.



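If you loop over this file twice with the same file object (and get no error about the directory), file_to_read.txt is printed only once, because the first loop leaves the file position at the end of the file. To fix this, add f.seek(0, 0) between the two reads. For example: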

f = open("file_to_read.txt", "r")
for line in f: print line
f.seek(0, 0)
for line in f: print line
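
The effect of the rewind can be checked without any particular input file. The sketch below is in Python 3 syntax and writes its own throwaway file; read_twice and demo_path are made-up names. It counts how many lines each of two passes over the same file object sees, with and without the seek.

```python
import os
import tempfile


def read_twice(path, rewind):
    """Iterate over one file object twice; return the line count of each pass."""
    with open(path) as f:
        first = sum(1 for _ in f)
        if rewind:
            f.seek(0)  # same effect as f.seek(0, 0): back to the start
        second = sum(1 for _ in f)
    return first, second


# Write a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line one\nline two\nline three\n")
    demo_path = tmp.name

without_seek = read_twice(demo_path, rewind=False)
with_seek = read_twice(demo_path, rewind=True)
os.remove(demo_path)

print(without_seek)  # (3, 0): the second pass sees nothing
print(with_seek)     # (3, 3): the rewind makes the second pass work
```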


import sys
# Here is your reading of file
blastfile = open(sys.argv[1],"r")
column1list = []
# Here is the first time you read the file
for line in blastfile:
    b = line.split()[0]
    column1list.append(b)
# Add a line to move back to the start before the
# next reading
blastfile.seek(0, 0)

uniqcolumn1 = list(set(column1list))

for val in uniqcolumn1:
    # Move the counter inside to refresh it after every iteration
    counter = 0
    # Here is the second time you read your file
    for line in blastfile:
        # Caution: this while loop never ends for a line that does not
        # start with val, because counter only changes inside the if
        while counter <= 10:
            if line.startswith(val):
                print line
                counter += 1
    # Since you are going to read the file the next iteration,
    # .seek the file
    blastfile.seek(0, 0)



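for val in uniqcolumn1:
    # Move the counter in
    counter = 0
    # Move the while loop out
    # Caution: the inner for loop still prints every matching line, and if
    # fewer than 11 lines match, the while loop repeats forever over an
    # already exhausted file
    while counter <= 10:
        for line in blastfile:
            if line.startswith(val):
                print line,
                counter += 1
    blastfile.seek(0, 0)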



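for val in uniqcolumn1:
    # Move counter in
    counter = 0
    # Remove while statement
    for line in blastfile:
        # Add additional condition to if statement
        # (counter < 10 so that exactly ten lines are printed per value)
        if line.startswith(val) and counter < 10:
            print line,
            counter += 1
        elif counter >= 10:
            # Ten lines printed for this value: stop reading early
            break
    blastfile.seek(0, 0)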
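
Two caveats apply to the re-reading approach: it scans the file once per query ID, and startswith also matches IDs that share a prefix (a query named q1 would also pick up lines for q10). If the hits for each query sit on adjacent lines, which is an assumption about how the file was written and not something the question states, a single pass with itertools.groupby may be enough. This is a different technique from the seek-based loops above, sketched here in Python 3 syntax; query_id and first_hits_grouped are made-up names.

```python
from itertools import groupby, islice


def query_id(line):
    """Return the first whitespace-separated column of a line."""
    return line.split()[0]


def first_hits_grouped(lines, limit=10):
    """Yield the first `limit` lines of each run of lines sharing a query ID.

    Assumption: all hits for one query are on consecutive lines.
    """
    non_blank = (line for line in lines if line.strip())
    for _, hits in groupby(non_blank, key=query_id):
        # islice stops after `limit` lines; groupby skips the rest of the run
        yield from islice(hits, limit)


# Small demonstration: q1 and q10 are kept apart by the exact-column key.
demo = ["q1 a\n", "q1 b\n", "q1 c\n", "q10 d\n", "q2 e\n"]
for kept in first_hits_grouped(demo, limit=2):
    print(kept, end="")
```

If the same query ID can reappear later in the file, each run is treated separately, so a dictionary of counts is the safer choice for unsorted input.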

