Python: complicated file iteration?

Posted 2024-06-01 07:58:46

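I am trying to iterate over an infile and create an outfile that, at this point, is identical to the infile. Then I want to iterate over the reference file and, where needed, add TER records to the correct lines of the outfile. The problem is that the outfile contains many lines with the same residue number, and the TER record has to go after the last of them. (In real life, the TER records in the reference file have different numbers at the end, which I need to keep.)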

import sys
import argparse

def main(argv):
    parser = argparse.ArgumentParser(description='Add TER records')
    parser.add_argument('infile', help='input file (PDB format)')
    parser.add_argument('outfile', help='output file (PDB format)')
    parser.add_argument('reference', help =' ref')
    args = parser.parse_args()

    residues = []
    x = 0

    with open(args.infile, "r") as f, open(args.outfile, "w+") as of, open(args.reference, "r") as rf:
        for line in f:
            of.write(line)
            for line in rf:
                if line[0:3] == "TER":
                    resnum = line[22:27]
                    resnum_1 = int(resnum)
                    residues.append(resnum_1)
        of.seek(0)
        for line in of:
            if line[0:4] == "ATOM":
                res = line[22:27]
                res_2 = int(res)
                for x in residues:
                    if x == res_2 and res_2+1 == x+1:
                        of.write("TER\n")
                        x = 0
                    else:
                        continue


if __name__ == "__main__":
    main(sys.argv)

My infile:

ATOM      1  N   GLU D 384      51.765  39.857  23.514  1.00  0.00           N  
ATOM      2  H1  GLU D 384      50.823  39.839  23.150  1.00  0.00           H  
ATOM      3  H2  GLU D 384      51.956  39.044  24.081  1.00  0.00           H  
ATOM      4  H3  GLU D 384      52.469  39.840  22.790  1.00  0.00           H  
ATOM      5  CA  GLU D 384      51.934  41.135  24.345  1.00  0.00           C  
ATOM      6  HA  GLU D 384      53.002  41.062  24.550  1.00  0.00           H  
ATOM      7  CB  GLU D 384      51.712  42.439  23.503  1.00  0.00           C  
ATOM      8  HB2 GLU D 384      52.307  42.297  22.600  1.00  0.00           H  
ATOM      9  HB3 GLU D 384      50.640  42.356  23.323  1.00  0.00           H  
ATOM     10  CG  GLU D 384      52.024  43.786  24.125  1.00  0.00           C  
ATOM     11  HG2 GLU D 384      52.138  44.557  23.363  1.00  0.00           H  
ATOM     12  HG3 GLU D 384      51.201  44.086  24.773  1.00  0.00           H  
ATOM     13  CD  GLU D 384      53.381  43.828  24.935  1.00  0.00           C  
ATOM     14  OE1 GLU D 384      53.634  43.069  25.869  1.00  0.00           O  
ATOM     15  OE2 GLU D 384      54.142  44.711  24.602  1.00  0.00           O  
ATOM     16  C   GLU D 384      51.078  41.093  25.570  1.00  0.00           C  
ATOM     17  O   GLU D 384      49.819  41.006  25.499  1.00  0.00           O  
ATOM     18  N   THR D 385      51.596  41.332  26.746  1.00  0.00           N  
ATOM     19  H   THR D 385      52.606  41.355  26.728  1.00  0.00           H  
ATOM     20  CA  THR D 385      50.815  41.615  27.982  1.00  0.00           C  
ATOM     21  HA  THR D 385      49.834  41.932  27.628  1.00  0.00           H  
ATOM     22  CB  THR D 385      50.763  40.416  28.984  1.00  0.00           C  
ATOM     23  HB  THR D 385      50.235  40.697  29.895  1.00  0.00           H  
ATOM     24  CG2 THR D 385      50.195  39.154  28.283  1.00  0.00           C  
ATOM     25 HG21 THR D 385      49.322  39.526  27.747  1.00  0.00           H  
ATOM     26 HG22 THR D 385      50.838  38.704  27.527  1.00  0.00           H  
ATOM     27 HG23 THR D 385      49.834  38.399  28.981  1.00  0.00           H  
ATOM     28  OG1 THR D 385      52.133  40.236  29.373  1.00  0.00           O  
ATOM     29  HG1 THR D 385      52.186  40.766  30.172  1.00  0.00   

My reference (and what the TER records in the outfile should look like):

ATOM      1  N   GLU D 384      51.765  39.857  23.514  1.00  0.00           N  
ATOM      2  H1  GLU D 384      50.823  39.839  23.150  1.00  0.00           H  
ATOM      3  H2  GLU D 384      51.956  39.044  24.081  1.00  0.00           H  
ATOM      4  H3  GLU D 384      52.469  39.840  22.790  1.00  0.00           H  
ATOM      5  CA  GLU D 384      51.934  41.135  24.345  1.00  0.00           C  
ATOM      6  HA  GLU D 384      53.002  41.062  24.550  1.00  0.00           H  
ATOM      7  CB  GLU D 384      51.712  42.439  23.503  1.00  0.00           C  
ATOM      8  HB2 GLU D 384      52.307  42.297  22.600  1.00  0.00           H  
ATOM      9  HB3 GLU D 384      50.640  42.356  23.323  1.00  0.00           H  
ATOM     10  CG  GLU D 384      52.024  43.786  24.125  1.00  0.00           C  
ATOM     11  HG2 GLU D 384      52.138  44.557  23.363  1.00  0.00           H  
ATOM     12  HG3 GLU D 384      51.201  44.086  24.773  1.00  0.00           H  
ATOM     13  CD  GLU D 384      53.381  43.828  24.935  1.00  0.00           C  
ATOM     14  OE1 GLU D 384      53.634  43.069  25.869  1.00  0.00           O  
ATOM     15  OE2 GLU D 384      54.142  44.711  24.602  1.00  0.00           O  
ATOM     16  C   GLU D 384      51.078  41.093  25.570  1.00  0.00           C  
ATOM     17  O   GLU D 384      49.819  41.006  25.499  1.00  0.00           O  
TER

ATOM     18  N   THR D 385      51.596  41.332  26.746  1.00  0.00           N  

But what I get is:

ATOM     96  HA  SER D 391      45.358  52.899  33.158  1.00  0.00           H  
ATOM     97  CB  SER D 391      45.963  51.960  31.364  1.00  0.00           C  
TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

TER

 1.00  0.00           H  
ATOM    100  OG  SER D 391      45.205  52.828  30.574  1.00  0.00           O  

Can anyone help? As you can see, I am new to programming :)
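
One possible approach, offered as a sketch rather than a definitive fix. In the code above, the condition `x == res_2 and res_2+1 == x+1` tests the same thing twice, so it matches every ATOM line of a listed residue instead of only the last one, and writing to `of` while iterating over it is likely what scatters the TER lines through the output. The version below reads the residue numbers from the reference first, then streams the infile and emits a TER line only when the residue number changes or the file ends. The function names `residue_number`, `ter_residues`, `add_ter_records` and `write_with_ter` are hypothetical. It also assumes that the residue number sits in PDB columns 23-26, that each number occurs in one contiguous run (a single chain), and that the infile contains no TER records yet.

```python
def residue_number(line):
    """Return the residue number from PDB columns 23-26, or None if absent."""
    try:
        return int(line[22:26])
    except ValueError:
        return None


def ter_residues(reference_lines):
    """Collect the residue numbers that are followed by a TER record."""
    residues = set()
    last_seen = None
    for line in reference_lines:
        if line.startswith(("ATOM", "HETATM")):
            last_seen = residue_number(line)
        elif line.startswith("TER"):
            # Prefer the number on the TER record itself; a bare "TER"
            # falls back to the residue of the preceding atom line.
            number = residue_number(line)
            if number is None:
                number = last_seen
            if number is not None:
                residues.add(number)
    return residues


def add_ter_records(in_lines, residues):
    """Yield in_lines, adding "TER" after the last atom of each listed residue."""
    previous = None
    for line in in_lines:
        if not line.endswith("\n"):
            line += "\n"  # keep a following TER on its own line
        is_atom = line.startswith(("ATOM", "HETATM"))
        current = residue_number(line) if is_atom else None
        # The residue just ended if the number changed (or a non-atom line follows).
        if previous in residues and current != previous:
            yield "TER\n"
        previous = current
        yield line
    if previous in residues:
        yield "TER\n"


def write_with_ter(infile, outfile, reference):
    """Copy infile to outfile, inserting TER records as found in reference."""
    with open(reference) as rf:
        residues = ter_residues(rf)
    with open(infile) as f, open(outfile, "w") as out:
        out.writelines(add_ter_records(f, residues))
```

With the argparse setup from the question, this could be called as `write_with_ter(args.infile, args.outfile, args.reference)`. Because the outfile is only ever written, never read back, the output cannot be overwritten partway through. This sketch writes a bare `TER`; keeping the numbers from the reference's TER records would need them stored alongside each residue number.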

