Filtering with a delimiter in Python

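2 answers

User

#1 · edited 2024-10-03 19:27:08

You have to do two kinds of processing inside the loop: one compares the "length", the other stores the CGTA sequence lines when needed. I wrote you an example that reads these into dictionaries: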

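myDict = {}        # name -> split header fields of the longest entry so far
myValueDict = {}   # name -> longest length seen so far
geneDict = {}      # name -> sequence belonging to that longest entry
action = 'remember'

with open("file.txt", "r") as file:
    for line in file:
        if line.startswith(">"):
            line = line.rstrip().split("|")
            line_name = line[0]
            line_number = int(line[-1])
            if line_name in myValueDict and myValueDict[line_name] >= line_number:
                # An entry at least this long is already stored: skip this one.
                action = 'forget'
            else:
                # First entry for this name, or longer than the stored one.
                action = 'remember'
                myValueDict[line_name] = line_number
                myDict[line_name] = line
                geneDict[line_name] = ""  # discard the sequence of any shorter entry
        elif action == 'remember':
            # Append, so a sequence that spans several lines is kept whole.
            geneDict[line_name] += line.rstrip()


for key in myDict:
    print(myDict[key])

for key in geneDict:
    print(geneDict[key])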

This ignores the entries with the shorter length. You can now store these dicts any way you like.
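To try the idea without a file on disk, the same single-pass approach can be sketched as a function over any iterable of lines. The function name `longest_records` and the sample records are made up for illustration; the header layout (name first, integer length last, fields separated by `|`) is assumed from the code above.

```python
def longest_records(lines):
    """Keep, for each name, the header and sequence of its longest entry."""
    best = {}       # name -> (length, header, list of sequence lines)
    current = None  # list being filled, or None while skipping an entry
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            fields = line.split("|")
            name, length = fields[0][1:], int(fields[-1])
            if name not in best or best[name][0] < length:
                current = []
                best[name] = (length, line, current)
            else:
                current = None  # a longer or equal entry is already stored
        elif current is not None and line:
            current.append(line)
    return {name: (header, "".join(seq))
            for name, (_, header, seq) in best.items()}


# Hypothetical sample data, not taken from the original question.
sample = [
    ">geneA|x|4\n",
    "ACGT\n",
    ">geneA|y|8\n",
    "ACGT\n",
    "TTGA\n",
    ">geneB|z|3\n",
    "CGT\n",
]
result = longest_records(sample)
for header, sequence in result.values():
    print(header)
    print(sequence)
```

Passing an open file object instead of `sample` should work the same way, since both yield one line per iteration.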

User

#2 · edited 2024-10-03 19:27:08

You need to read the file twice. On the first pass, track the largest size seen for each entry:

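# inputfile is the path to the input file, defined elsewhere.
largest = {}  # name -> largest length seen for that name
with open(inputfile) as f:
    for line in f:
        if line.startswith('>'):
            parts = line.split('|')
            # Drop the leading '>' from the name; the length is the last field.
            name, length = parts[0][1:], int(parts[-1])
            largest[name] = max(length, largest.get(name, -1))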

Then, on the second pass, write out a copy, but only of the sections whose name and length match the largest length found on the first pass:

with open(inputfile) as f, open(outputfile, 'w') as out:
    copying = False
    for line in f:
        if line.startswith('>'):
            parts = line.split('|')
            name, length = parts[0][1:], int(parts[-1])
            # Copy this header and its sequence lines only if it is the longest.
            copying = largest[name] == length
        if copying:
            out.write(line)
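One thing to be aware of: if two entries with the same name share the largest length, the second pass above copies both of them. If only one should survive, a set of already-written names can break the tie. Here is a minimal sketch, assuming the first such entry is the one to keep; the function name `copy_longest` and the sample data are hypothetical, and `io.StringIO` stands in for real files.

```python
import io


def copy_longest(src, dst):
    """Two-pass filter that writes only the first longest entry per name."""
    # First pass: largest length per name.
    largest = {}
    for line in src:
        if line.startswith(">"):
            parts = line.split("|")
            name, length = parts[0][1:], int(parts[-1])
            largest[name] = max(length, largest.get(name, -1))

    src.seek(0)  # rewind for the second pass instead of reopening

    # Second pass: copy matching entries, at most one per name.
    written = set()
    copying = False
    for line in src:
        if line.startswith(">"):
            parts = line.split("|")
            name, length = parts[0][1:], int(parts[-1])
            copying = largest[name] == length and name not in written
            if copying:
                written.add(name)
        if copying:
            dst.write(line)


# Hypothetical sample: two geneA entries tie at length 8.
src = io.StringIO(
    ">geneA|x|8\nACGTACGT\n"
    ">geneA|y|8\nTTTTTTTT\n"
    ">geneB|z|3\nCGT\n"
)
dst = io.StringIO()
copy_longest(src, dst)
print(dst.getvalue(), end="")
```

With real files, `src` and `dst` would be the objects returned by `open(inputfile)` and `open(outputfile, 'w')`.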
