Stripping a line in Python while keeping its indentation

Published 2024-09-30 22:16:01


Input:

$target: ENSG00000097007|ABL1
length: 3075
miRNA : hsa-miR-203
length: 22

mfe: -30.5 kcal/mol
p-value: 0.606919

position:  2745
target 5'   C             G     C 3'
             GUGGUCCUGGACA   CAC    
             CACCAGGAUUUGU   GUG    
miRNA  3' GAU             AAA     5'

I need to pull out the last two lines, assign them to two arrays, then read each character and produce the output shown below.

After stripping, the lines should look like this:

   CACCAGGAUUUGU   GUG    
GAU             AAA     

A character read from line 1 should be printed in lowercase, and a character read from line 2 should be printed in uppercase.

The final output of the program should be "Gaucachagauuguaaagug".
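The rule above can be sketched as a walk over the two stripped lines column by column. This is a minimal Python 3 sketch, not the asker's code: the function name `merge_aligned` is hypothetical, and it assumes that a column never holds a base in both lines (if it does, the character from line 2 wins).

```python
from itertools import zip_longest

def merge_aligned(lower_row, upper_row):
    """Merge two column-aligned rows, skipping columns that are blank in both."""
    merged = []
    # zip_longest pads the shorter row, so missing trailing spaces do not matter.
    for low, up in zip_longest(lower_row, upper_row, fillvalue=" "):
        if up != " ":
            merged.append(up.upper())    # character read from line 2
        elif low != " ":
            merged.append(low.lower())   # character read from line 1
    return "".join(merged)

line1 = "   CACCAGGAUUUGU   GUG"
line2 = "GAU" + " " * 13 + "AAA"   # 13 spaces, spelled out to keep the columns exact
print(merge_aligned(line1, line2))
```

Applied literally to the two stripped lines, the rule yields GAUcaccaggauuuguAAAgug, which differs in case and in a few letters from the string quoted in the question.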

The code we wrote to read the file does not keep the lines aligned exactly as they appear in the input.

Here is the code we used:

 import fileinput
 import sys
 from sys import argv
 script, filename = argv
 file = open(filename)
 og1 = "AGUUCCUUUGUUUUGGUGACUG"
 pattern = "              "
 pattern1 = "miRNA  3'"
 file = open(filename)
 for line in file:
    if line.startswith(pattern):
        n = file.next()
        # print n[9:],#  bound mirna
        for i in range(0, len(og1)):
            print og1[i],
        print "\n" 
        for j in range(0,len(n)):
             print n[j],
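One likely reason the alignment is lost: in Python 2, `print x,` puts a space after every item, so printing a row one character at a time spreads it out, and `n` still carries the `miRNA  3'` label. A minimal Python 3 sketch of stripping the two rows while keeping their columns follows. The name `strip_pair` and the assumption that the label column is always 10 characters wide are mine, not the asker's.

```python
LABEL = "miRNA  3' "   # assumed fixed-width label column (10 characters)

def strip_pair(record):
    """Return the last two non-blank rows of a record with the label columns cut off."""
    rows = [row for row in record.splitlines() if row.strip()]
    paired_row, mirna_row = rows[-2], rows[-1]
    # Pad both rows to the same width first, then cut the same slice from each.
    width = max(len(paired_row), len(mirna_row))
    paired_row = paired_row.ljust(width)[len(LABEL):]
    mirna_row = mirna_row.ljust(width)[len(LABEL):]
    # Blank out the trailing 5' marker without shifting any column.
    trimmed = mirna_row.rstrip()
    if trimmed.endswith("5'"):
        mirna_row = trimmed[:-2].ljust(len(mirna_row))
    return paired_row, mirna_row

record = "\n".join([
    "position:  2745",
    " " * 13 + "CACCAGGAUUUGU   GUG    ",
    "miRNA  3' GAU" + " " * 13 + "AAA     5'",
])
for row in strip_pair(record):
    print(repr(row))
```

Slicing by a fixed offset, rather than calling `strip()` or `split()`, is what preserves the leading spaces that encode the alignment.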

Further input for this problem:

target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-125b-5p
length: 22

mfe: -23.9 kcal/mol
p-value: 0.610132

position  168
target 5' C     C               A 3'
           CGCAG   GGGGU   AGGGA    
           GUGUU   UCCCA   UCCCU    
miRNA  3' A     CAA     GAG       5'


target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-149-3p
length: 21

mfe: -36.6 kcal/mol
p-value: 0.598318

position  798
target 5' C   UGUC     AGG        G 3'
           CGC    GCCCC   CCCUCCCU    
           GUG    CGGGG   GGGAGGGA    
miRNA  3' C   U        GCA          5'


target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-185-5p
length: 22

mfe: -27.8 kcal/mol
p-value: 0.606550

position  733
target 5' C       CUCCC   CAGAUGA        C 3'
           CGGGAGC     CCU       UCUCUCCA    
           GUCCUUG     GGA       AGAGAGGU    
miRNA  3' A       AC      A                5'


target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-199a-3p
length: 22

mfe: -21.9 kcal/mol
p-value: 0.611970

position  357
target 5' C      CC   CCU      U    C 3'
           AGCCAG   GC   GGGCUG CUGU    
           UUGGUU   CG   UCUGAU GACA    
miRNA  3' A      ACA                  5'


target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-451a
length: 21

mfe: -21.2 kcal/mol
p-value: 0.612523

position  416
target 5' C      UCAACC       A     3'
           CUCAGU      UGGUGGC        
           GAGUCA      ACCAUUG        
miRNA  3' U      UU           CCAAA 5'

2 answers

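I would split the text into lines, grab the last two, then iterate over the two lines zipped together and keep whichever character in each column is not " ".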

def combinebases(base_data):
    lines = base_data.splitlines()[-2:]
    output = list()
    lines[0] = lines[0].lower()
    for ch1, ch2 in zip(*lines):
        output.append(max(ch1, ch2))
    return ''.join(output[10:-4])

A possibly safer approach is to return the following instead (this requires import re):

    return re.search("(?<=miRNA  3' )[augc]+", ''.join(output), re.I).group()

However, if every one of your base rows has the same length, the regular expression is overkill.
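For reference, here is how the regex variant might look as a complete Python 3 function. This is a sketch rather than the answerer's exact code: the name `combine_bases_regex` is hypothetical, and `zip_longest` is swapped in for `zip` so that a row whose trailing spaces were trimmed does not cut the result short. Like the original, it relies on `max()` picking the non-space character, which assumes no column holds a base in both rows.

```python
import re
from itertools import zip_longest

def combine_bases_regex(base_data):
    """Merge the last two rows, then pull out the bases that follow the miRNA label."""
    lines = base_data.splitlines()[-2:]
    lines[0] = lines[0].lower()   # the paired row is reported in lowercase
    # A space sorts before every letter, so max() keeps the non-blank character.
    merged = "".join(
        max(ch1, ch2) for ch1, ch2 in zip_longest(*lines, fillvalue=" ")
    )
    match = re.search(r"(?<=miRNA  3' )[augc]+", merged, re.I)
    return match.group() if match else ""

sample = "\n".join([
    " " * 13 + "CACCAGGAUUUGU   GUG",          # trailing spaces deliberately missing
    "miRNA  3' GAU" + " " * 13 + "AAA     5'",
])
print(combine_bases_regex(sample))
```

The regex stops at the first space after the bases, so the result does not depend on how many columns the trailing `5'` marker occupies, which is what makes it safer than a fixed `[10:-4]` slice.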

Result:

>>> txt = """$target: ENSG00000097007|ABL1
length: 3075
miRNA : hsa-miR-203
length: 22

mfe: -30.5 kcal/mol
p-value: 0.606919

position:  2745
target 5'   C             G     C 3'
             GUGGUCCUGGACA   CAC    
             CACCAGGAUUUGU   GUG    
miRNA  3' GAU             AAA     5'"""

>>> combinebases(txt)
'GAUcaccaggauuuguAAAgug'

file = open(filename)
for segment in file.read().split("\n\ntarget"):
    interested_lines = segment.split('\n')[-3:-1]  # fetch the last two lines
    split1 = interested_lines[0].split()           # paired bases
    split2 = interested_lines[1].split()[2:-1]     # miRNA bases, without the "miRNA", "3'" and "5'" tokens
    if len(split1) < len(split2):
        split1.append("")                          # pad so the loop below can index split1

    req = ""
    for i in range(0,len(split2)):
        req += split2[i]+split1[i].lower()
    for j in range(i+1,len(split1)):
        req += split1[j]
    print req

Output:

AguguuCAAucccaGAGucccu
CgugUcggggGCAgggaggga
AguccuugACggaAagagaggu
AuugguuACAcgUCUGAUGACA
UgagucaUUaccauugCCAAA
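The fourth output line ends in uppercase UCUGAUGACA even though those bases come from the paired row, because the token-based loop appends leftover tokens without lowercasing them and assumes the two rows strictly alternate. A column-based alternative is sketched below in Python 3. The names `merge_rows` and `merge_all` are hypothetical, and the sketch assumes every record keeps the 10-character `miRNA  3' ` label and places the paired row directly above it.

```python
from itertools import zip_longest

MIRNA_LABEL = "miRNA  3' "   # assumed label; the "miRNA : name" header line does not match it

def merge_rows(paired_line, mirna_line):
    """Merge one paired row and one miRNA row column by column."""
    paired_row = paired_line[len(MIRNA_LABEL):]
    mirna_row = mirna_line[len(MIRNA_LABEL):].rstrip()
    if mirna_row.endswith("5'"):
        mirna_row = mirna_row[:-2]           # drop the trailing 5' marker
    merged = []
    for paired, unpaired in zip_longest(paired_row, mirna_row, fillvalue=" "):
        if unpaired != " ":
            merged.append(unpaired.upper())  # miRNA row: uppercase
        elif paired != " ":
            merged.append(paired.lower())    # paired row: lowercase
    return "".join(merged)

def merge_all(text):
    """Return one merged string per record found in text."""
    results = []
    previous = ""
    for line in text.splitlines():
        if line.startswith(MIRNA_LABEL):
            results.append(merge_rows(previous, line))
        previous = line
    return results

# Two records from the question, with long space runs spelled out explicitly.
sample = "\n".join([
    "miRNA : hsa-miR-125b-5p",
    " " * 11 + "GUGUU   UCCCA   UCCCU    ",
    "miRNA  3' A     CAA     GAG       5'",
    "",
    "miRNA : hsa-miR-199a-3p",
    " " * 11 + "UUGGUU   CG   UCUGAU GACA    ",
    "miRNA  3' A" + " " * 6 + "ACA" + " " * 18 + "5'",
])
for merged in merge_all(sample):
    print(merged)
```

For these two records the sketch prints AguguuCAAucccaGAGucccu and AuugguuACAcgucugaugaca. The first matches the output listed above; the second differs only in case, following the question's rule that characters from the paired row are lowercase.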
