Opening and editing multiple files in a folder with Python

>YP_009208724.1 hypothetical protein ADP65_00072 [Achromobacter phage phiAxp-3]
MSNVLLKQ...
>YP_009220341.1 terminase large subunit [Achromobacter phage phiAxp-1]
MRTPSKSE...
>YP_009226430.1 DNA packaging protein [Achromobacter phage phiAxp-2]
MMNSDAVI...

with open('Achromobacter.fasta', 'r') as fasta_file:
    out_file = open('./fastas3/Achromobacter.fasta', 'w')
    for line in fasta_file:
        line = line.rstrip()
        if '[' in line:
            line = line.split('[')[-1]
            out_file.write('>' + line[:-1] + "\n")
        else:
            out_file.write(str(line) + "\n")

import glob
for fasta_file in glob.glob('*.fasta'):
    outfile = open('./fastas3/'+fasta_file, 'w')
    with open(fasta_file, 'r'):
        for line in fasta_file:
            line = line.rstrip()
            if '[' in line:
                line2 = line.split('[')[-1]
                outfile.write('>' + line2[:-1] + "\n")
            else:
                outfile.write(str(line) + "\n")

2 Answers

User

#1 · edited 2024-10-01 13:38:03

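Since you can already change the contents of a single file, what remains is to automate the process. First, we turn the single-file code into a function and drop the second file handle, so that the file is opened only once: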

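def file_changer(filename):
    data_to_put = ''
    with open(filename, 'r+') as fasta_file:
        for line in fasta_file.readlines():
            line = line.rstrip()
            if '[' in line:
                # Header line: keep only the name inside the last brackets
                line = line.split('[')[-1]
                data_to_put += '>' + str(line[:-1]) + "\n"
            else:
                data_to_put += str(line) + "\n"
        # After reading, the position is at the end of the file, so rewind
        # and truncate; otherwise the new text is appended to the old one.
        fasta_file.seek(0)
        fasta_file.write(data_to_put)
        fasta_file.truncate()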

Now we need to loop over all of your files, so let's use the glob module.
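The loop over the files might look like the sketch below. It assumes each file should be rewritten in place, and it repeats a compact `file_changer` so that the block runs on its own. `change_all` and its `directory` and `pattern` parameters are hypothetical names introduced here for illustration, not part of the original answer.

```python
import glob
import os


def file_changer(filename):
    """Rewrite one FASTA file in place, keeping only the bracketed name in headers."""
    new_lines = []
    with open(filename, 'r+') as fasta_file:
        for line in fasta_file.readlines():
            line = line.rstrip()
            if '[' in line:
                # Header line: keep the text after the last '[' and drop the closing ']'.
                new_lines.append('>' + line.split('[')[-1][:-1])
            else:
                new_lines.append(line)
        # Rewind and truncate so the new text replaces the old content.
        fasta_file.seek(0)
        fasta_file.write(''.join(text + '\n' for text in new_lines))
        fasta_file.truncate()


def change_all(directory='.', pattern='*.fasta'):
    """Apply file_changer to every matching file and return the sorted paths (hypothetical helper)."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    for path in paths:
        file_changer(path)
    return paths
```

Calling `change_all()` with no arguments would process every `.fasta` file in the current directory. Because the files are overwritten, it is worth trying this on copies first.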

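User

#2 · edited 2024-10-01 13:38:03

You are iterating over the file name, which gives you each character of the name rather than the lines of the file. Here is a corrected version of the code: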

import glob

for fasta_file_name in glob.glob('*.fasta'):
    with open(fasta_file_name, 'r') as fasta_file, \
            open('./fastas3/' + fasta_file_name, 'w') as outfile:
        for line in fasta_file:
            line = line.rstrip()
            if '[' in line:
                line2 = line.split('[')[-1]
                outfile.write('>' + line2[:-1] + "\n")
            else:
                outfile.write(str(line) + "\n")
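To see why iterating over the name goes wrong, the small sketch below compares looping over a string with looping over a file-like object. `io.StringIO` stands in for an opened file here, and the sample text is made up for illustration.

```python
import io

file_name = 'Achromobacter.fasta'

# A str is a sequence of characters, so iterating over it yields one character at a time.
print(list(file_name)[:3])  # ['A', 'c', 'h']

# A file object yields whole lines; io.StringIO stands in for an opened file.
fake_file = io.StringIO('>header [name]\nMSNV\n')
print(list(fake_file))  # ['>header [name]\n', 'MSNV\n']
```

With the file name, no element of the loop is a full header line, so every character falls through the `'[' in line` test on its own and is written out on a separate line.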

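As an alternative to a Python script, you can simply use sed from the command line to edit the files in place.

This will modify all of the files, so consider copying them first.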
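One way to make those copies from Python is sketched below, using `shutil.copy2` from the standard library. The function name `backup_fasta_files` and the `fasta_backup` directory are assumptions chosen for this example.

```python
import glob
import os
import shutil


def backup_fasta_files(directory='.', backup_dir='fasta_backup', pattern='*.fasta'):
    """Copy every matching file into backup_dir and return the paths of the copies."""
    target = os.path.join(directory, backup_dir)
    os.makedirs(target, exist_ok=True)  # create the backup folder if it is missing
    copies = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        # copy2 also tries to preserve file metadata such as modification times
        copies.append(shutil.copy2(path, target))
    return copies
```

Running `backup_fasta_files()` before any in-place edit leaves an untouched copy of each file in the backup folder.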
