Controlling the position of values appended to a list

Posted 2024-06-01 14:17:11


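I have files containing various fields that I want to extract into a list. While filling the list, which is then stored in a dictionary, I need to retrieve specific pieces of information and put each one at a fixed position in the list. Any help with Python would be greatly appreciated.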

Example assembly report:

Assembly name:  Pav631_1.0 
Organism name: Pseudomonas avellanae BPIC 631 (g-proteobacteria) 
Infraspecific name:  strain=BPIC 631 
Taxid:          11547 
BioSample:      SAMN02471966 
BioProject:     PRJNA84293 
Submitter:      University of Toronto Centre for the Analysis of Genome Evolution and Function
Date:         2012-10-10
Assembly type:  n/a 
Release type:   major 
Assembly level: Scaffold 
Genome representation: full 
WGS project:    AKBS01 
Assembly method: CLC 

Here is the code I tried:

import os

report_dict = {}
for root, dirs, reports in os.walk(assembly_report_dir):
    for report in reports:
        # accession = first two underscore-separated parts of the file name
        accession = '_'.join(report.strip().split('/')[-1].replace('_assembly_report.txt', '').split('_')[0:2])

        # path = full path to the report file (root is the directory os.walk is currently in)
        path = os.path.join(root, report)

        with open(path, 'r') as inputfile:
            lines = inputfile.readlines()
            description = []
            for line in lines:

                if line.startswith('Organism name:  '):
                    organism = line.strip().split(':  ')[-1].split(' (', 1)[0]
                    species = ' '.join(organism.split(' ')[0:2])
                    description.append(species)

                elif line.startswith('Infraspecific name:  strain='):
                    strain = line.strip().replace(' ','').split('strain=')[-1]
                    description.append(strain)

                elif line.startswith('Assembly name:  '):
                    assembly = line.strip().split(':  ')[-1]
                    description.append(assembly)

        report_dict[accession] = description

print(report_dict)

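The problem is that the last value I add to the list (the assembly) ends up in the first position of the list instead of the last.

The result I get is: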

description = ["assembly", "species", "strain"]

This is the list I want:

description = ["species", "strain", "assembly"]
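
The behaviour can be reproduced in a few lines. This is only a minimal sketch, assuming the header lines appear in the same order as in the example report above; the placeholder strings stand in for the parsed values.

```python
# Header lines in the order they appear in the example report.
lines = [
    "Assembly name:  Pav631_1.0",
    "Organism name: Pseudomonas avellanae BPIC 631 (g-proteobacteria)",
    "Infraspecific name:  strain=BPIC 631",
]

description = []
for line in lines:
    # append() adds each value at the end, in the order the lines are read
    if line.startswith("Organism name:"):
        description.append("species")
    elif line.startswith("Infraspecific name:"):
        description.append("strain")
    elif line.startswith("Assembly name:"):
        description.append("assembly")

print(description)  # ['assembly', 'species', 'strain']
```

Because "Assembly name" is the first line of the report, its value is appended first, whatever the order of the if/elif branches.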

1 Answer
User
#1 · Posted 2024-06-01 14:17:11

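A quick and dirty way to do it. append adds values in the order the lines are read, and in the report "Assembly name" comes first, so the assembly lands at index 0. Since your list has a fixed length, you can pre-fill it and assign each value to its own index, and this code should work without problems: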

import os

report_dict = {}
for root, dirs, reports in os.walk(assembly_report_dir):
    for report in reports:
        # accession = first two underscore-separated parts of the file name
        accession = '_'.join(report.strip().split('/')[-1].replace('_assembly_report.txt', '').split('_')[0:2])

        # path = full path to the report file (root is the directory os.walk is currently in)
        path = os.path.join(root, report)

        with open(path, 'r') as inputfile:
            lines = inputfile.readlines()
            MAX_LENGTH = 3
            # Pre-fill the list so each value can be assigned to a fixed index
            description = ['null' for x in range(MAX_LENGTH)]
            for line in lines:

                if line.startswith('Organism name:  '):
                    organism = line.strip().split(':  ')[-1].split(' (', 1)[0]
                    species = ' '.join(organism.split(' ')[0:2])
                    description[0] = str(species)

                elif line.startswith('Infraspecific name:  strain='):
                    strain = line.strip().replace(' ','').split('strain=')[-1]
                    description[1] = str(strain)

                elif line.startswith('Assembly name:  '):
                    assembly = line.strip().split(':  ')[-1]
                    description[2] = str(assembly)

        report_dict[accession] = description

print(report_dict)
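
As an alternative to fixed indices, one could collect the header fields into a dictionary first and build the list in the desired order afterwards. The following is only a sketch under assumptions: the helper name parse_description is hypothetical, and it assumes every header line has the form "key: value" as in the example report.

```python
def parse_description(lines):
    """Return [species, strain, assembly], whatever the line order in the report.

    parse_description is a hypothetical helper, not part of the original code.
    """
    fields = {}
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" header line
        fields[key.strip()] = value.strip()

    # Species: first two words of the organism name, without the "(...)" suffix
    organism = fields.get("Organism name", "").split(" (", 1)[0]
    species = " ".join(organism.split()[:2])
    # Strain: text after "strain=", with spaces removed as in the original code
    strain = fields.get("Infraspecific name", "").split("strain=")[-1].replace(" ", "")
    assembly = fields.get("Assembly name", "")
    return [species, strain, assembly]


report_lines = [
    "Assembly name:  Pav631_1.0",
    "Organism name: Pseudomonas avellanae BPIC 631 (g-proteobacteria)",
    "Infraspecific name:  strain=BPIC 631",
]
print(parse_description(report_lines))
# ['Pseudomonas avellanae', 'BPIC631', 'Pav631_1.0']
```

The order of the result is set by the return statement alone, so it does not depend on where each line sits in the file. A missing field yields an empty string rather than 'null'.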
