Input:
target: ENSG00000097007|ABL1
length: 3075
miRNA : hsa-miR-203
length: 22
mfe: -30.5 kcal/mol
p-value: 0.606919
position: 2745
target 5' C G C 3'
GUGGUCCUGGACA CAC
CACCAGGAUUUGU GUG
miRNA 3' GAU AAA 5'
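I have to extract the last two lines, assign them to two arrays, then read each character and produce the output shown below.
After stripping the labels, the lines should have this format: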
CACCAGGAUUUGU GUG
GAU AAA
If a character is read from line 1, it should be printed in lowercase; if it is read from line 2, it should be printed in uppercase.
The final output of the program should be "GAUcaccaggauuuguAAAgug"
The code we tried for reading the file does not keep the lines aligned exactly as they appear in the input.
Here is the code we used:
import sys

# Usage: python script.py <filename>
script, filename = sys.argv

og1 = "AGUUCCUUUGUUUUGGUGACUG"
pattern = " "
pattern1 = "miRNA 3'"

file = open(filename)
for line in file:
    if line.startswith(pattern):
        # Read the line that follows the matching one
        n = next(file)
        # print(n[9:], end=" ")  # bound miRNA
        for i in range(0, len(og1)):
            print(og1[i], end=" ")
        print("\n")
        for j in range(0, len(n)):
            print(n[j], end=" ")
Further input for this question:
target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-125b-5p
length: 22
mfe: -23.9 kcal/mol
p-value: 0.610132
position 168
target 5' C C A 3'
CGCAG GGGGU AGGGA
GUGUU UCCCA UCCCU
miRNA 3' A CAA GAG 5'
target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-149-3p
length: 21
mfe: -36.6 kcal/mol
p-value: 0.598318
position 798
target 5' C UGUC AGG G 3'
CGC GCCCC CCCUCCCU
GUG CGGGG GGGAGGGA
miRNA 3' C U GCA 5'
target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-185-5p
length: 22
mfe: -27.8 kcal/mol
p-value: 0.606550
position 733
target 5' C CUCCC CAGAUGA C 3'
CGGGAGC CCU UCUCUCCA
GUCCUUG GGA AGAGAGGU
miRNA 3' A AC A 5'
target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-199a-3p
length: 22
mfe: -21.9 kcal/mol
p-value: 0.611970
position 357
target 5' C CC CCU U C 3'
AGCCAG GC GGGCUG CUGU
UUGGUU CG UCUGAU GACA
miRNA 3' A ACA 5'
target: ENSG00000142208|ENST00000349310|AKT1
length: 992
miRNA : hsa-miR-451a
length: 21
mfe: -21.2 kcal/mol
p-value: 0.612523
position 416
target 5' C UCAACC A 3'
CUCAGU UGGUGGC
GAGUCA ACCAUUG
miRNA 3' U UU CCAAA 5'
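I would split the text into lines, take the last two, then iterate over the two lines zipped together and keep whichever character is not " ". A possibly safer version returns the result.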
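The zip-based approach described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `merge_rows` is hypothetical, and the column spacing of the two sample rows is an assumption, because the exact alignment is not preserved in the samples shown here. Following the question, characters taken from the first row are lowercased and characters taken from the second row are uppercased.

```python
from itertools import zip_longest


def merge_rows(paired, unpaired):
    """Merge two column-aligned rows into a single sequence.

    Characters from `paired` (the first row) come out lowercase,
    characters from `unpaired` (the second row) come out uppercase.
    Columns that are blank in both rows are skipped.
    """
    merged = []
    # zip_longest pads the shorter row with spaces, in case trailing
    # whitespace was stripped from one of the lines.
    for p, u in zip_longest(paired, unpaired, fillvalue=" "):
        if p != " ":
            merged.append(p.lower())
        elif u != " ":
            merged.append(u.upper())
    return "".join(merged)


# Assumed alignment: the spacing below is a guess at the original layout.
paired = " " * 3 + "CACCAGGAUUUGU" + " " * 3 + "GUG"
unpaired = "GAU" + " " * 13 + "AAA"

print(merge_rows(paired, unpaired))  # prints GAUcaccaggauuuguAAAgug
```

Returning the merged string instead of printing character by character avoids the extra spaces that appear when each character is printed separately.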
However, if every one of your bases has the same length, a regular expression is overkill.
Result:
Output
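If the label column really has a fixed width, plain slicing is enough to pull the two sequence rows out of a record without any regular expression. The sketch below assumes a 10-character label column (`LABEL_WIDTH`) and a trailing `5'` marker on the last line. Both are assumptions about the layout rather than facts taken from the input above, and `last_two_rows` is a hypothetical helper name.

```python
LABEL_WIDTH = 10  # assumed width of the "miRNA  3' " label column


def last_two_rows(block, label_width=LABEL_WIDTH):
    """Return the last two non-empty lines of a record with labels removed."""
    lines = [ln for ln in block.splitlines() if ln.strip()]
    paired, unpaired = lines[-2], lines[-1]
    # Drop the trailing strand marker from the final line, if present.
    unpaired = unpaired.rstrip()
    if unpaired.endswith("5'"):
        unpaired = unpaired[:-2]
    # Slice off the fixed-width label column; keep leading spaces so the
    # two rows stay aligned column by column.
    return paired[label_width:].rstrip(), unpaired[label_width:].rstrip()


# Sample record with an assumed column layout.
block = "\n".join([
    "miRNA : hsa-miR-203",
    "position: 2745",
    " " * 10 + " " * 3 + "CACCAGGAUUUGU" + " " * 3 + "GUG",
    "miRNA  3' " + "GAU" + " " * 13 + "AAA" + " " * 3 + " 5'",
])

rows = last_two_rows(block)
print(repr(rows[0]))
print(repr(rows[1]))
```

Only trailing whitespace is stripped here, because the leading spaces carry the column alignment that a later character-by-character merge relies on.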