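I am going through a file. If a line starts with "SeqID", I want to look at the 21st line after it. If that line starts with anything other than "Cytoplasmic", I want to write both the SeqID line and that later line to an output file.
So far I have: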
import sys
import argparse
import operator
import re
import itertools

def main(argv):
    parser = argparse.ArgumentParser(description='find a location')
    parser.add_argument('infile', help='file to process')
    parser.add_argument('outfile', help='file to produce')
    args = parser.parse_args()
    tag = "SeqID:"
    tag2 = "Cytoplasmic"
    with open(args.infile, "r") as f, open(args.outfile, "w+") as of:
        file_in = f.readlines()
        for line in file_in:
            if line.startswith(tag) and line[21:] != "Cytoplasmic":
                of.write(line)

if __name__ == "__main__":
    main(sys.argv)
Here is a sample of the input file:
SeqID: YP_008914846.1 opacity protein [Neisseria gonorrhoeae FA 1090]
Analysis Report:
CMSVM- Unknown [No details]
CytoSVM- Unknown [No details]
ECSVM- Unknown [No details]
ModHMM- Unknown [No internal helices found]
Motif- Unknown [No motifs found]
OMPMotif- Unknown [No motifs found]
OMSVM- OuterMembrane [No details]
PPSVM- Unknown [No details]
Profile- Unknown [No matches to profiles found]
SCL-BLAST- OuterMembrane [matched 60392864: Opacity protein opA54 precursor]
SCL-BLASTe- Unknown [No matches against database]
Signal- Unknown [No signal peptide detected]
Localisation Scores:
OuterMembrane 10.00
Extracellular 0.00
Periplasmic 0.00
Cytoplasmic 0.00
CytoplasmicMembrane 0.00
Final Prediction:
OuterMembrane 10.00
-------------------------------------------------------------------------------
SeqID: YP_008914847.1 hypothetical protein NGO0146a [Neisseria gonorrhoeae FA 1090]
Analysis Report:
CMSVM- Unknown [No details]
CytoSVM- Unknown [No details]
ECSVM- Unknown [No details]
ModHMM- Unknown [No internal helices found]
Motif- Unknown [No motifs found]
OMPMotif- Unknown [No motifs found]
OMSVM- Unknown [No details]
PPSVM- Unknown [No details]
Profile- Unknown [No matches to profiles found]
SCL-BLAST- Unknown [No matches against database]
SCL-BLASTe- Unknown [No matches against database]
Signal- Unknown [No signal peptide detected]
Localization Scores:
CytoplasmicMembrane 2.00
Cytoplasmic 2.00
OuterMembrane 2.00
Periplasmic 2.00
Extracellular 2.00
Final Prediction:
Unknown
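My Python is a bit rusty, so please bear with me. I hope I have inferred the desired output correctly; if not, please leave a comment.
This assumes that the samples from the sequencing experiment are always separated by an offset of 3 lines of arbitrary content, and that each sample has 22 lines.
You could try something like this: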
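The answer's original code is not preserved here, so what follows is only a minimal sketch of the idea, not the answerer's actual solution. It assumes the layout of the sample as shown above, where the final prediction sits exactly 21 lines below each "SeqID:" line; adjust the offset if the real file contains blank lines. The names select_records, filter_file, and OFFSET are hypothetical. The key difference from the attempt in the question is that line[21:] slices characters within the same line, whereas the lookup here indexes into the list of lines.

```python
SEQ_TAG = "SeqID:"
SKIP_PREFIX = "Cytoplasmic"
OFFSET = 21  # assumption: the prediction is 21 lines below each SeqID line


def select_records(lines, offset=OFFSET):
    """Return (SeqID line, prediction line) pairs whose prediction
    does not start with SKIP_PREFIX."""
    selected = []
    for i, line in enumerate(lines):
        if not line.startswith(SEQ_TAG):
            continue
        target = i + offset
        if target >= len(lines):
            continue  # truncated record: nothing at the expected offset
        prediction = lines[target]
        # Note: this prefix test also excludes "CytoplasmicMembrane".
        if not prediction.startswith(SKIP_PREFIX):
            selected.append((line, prediction))
    return selected


def filter_file(in_path, out_path):
    """Read in_path and write the selected line pairs to out_path."""
    with open(in_path) as f:
        lines = f.readlines()
    with open(out_path, "w") as out:
        for seq_line, prediction in select_records(lines):
            out.write(seq_line)
            out.write(prediction)
```

To use it with the argparse setup from the question, call filter_file(args.infile, args.outfile) inside main. If the record length varies, a sturdier variant would search for the "Final Prediction:" header after each SeqID line and take the line that follows it, rather than relying on a fixed offset.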