Converting letter strings in a list to numbers

Posted 2024-10-05 10:08:48


I have read a file into a list like this:

['>chr1_sliding:1-1000\n', 'TCATGGCTATTTTCATAAAAAATGGGGGTTGTGTGGCCATTTATCATCGACTAGAGGCTCATAAACCTCACCCCACATATGTTTCCTTGCCATAGATTACATTCTTGGATTTCTGGTGGAAACCAT\n', '\n', '>chr1_sliding:901-1900\n', 'TCATGGCTATTTTCATAAAAAATGGGGGTTGTGTGGCCATTTAT....]

I want to convert the letters to digits according to this dictionary:

dict = {"A": 0, "T": 1,"G": 2, "C": 3}

This is what I have tried:

with open("/Users/Downloads/test") as file_in:
    lines = []
    for line in file_in:
        lines.append(line)

for line in lines:
    try:
        print(dict[line])
    except KeyError:
        print("header")

However, "header" is printed for every line:

Output

header
header 
header
header

Expected output:

header
13012...
header
13012...

3 answers

First, define a conversion function that transforms a given line according to the mapping:

def transformData(line):
    transform_dict = {"A": 0, "T": 1, "G": 2, "C": 3}

    for char, val in transform_dict.items():
        line = line.replace(char, str(val))

    return line

Then iterate over the lines and check whether each one is a valid sequence line to convert. If it is, pass it to transformData and store the result:

data = ['>chr1_sliding:1-1000\n', 'TCATGGCTATTTTCATAAAAAATGGGGGTTGTGTGGCCATTTATCATCGACTAGAGGCTCATAAACCTCACCCCACATATGTTTCCTTGCCATAGATTACATTCTTGGATTTCTGGTGGAAACCAT\n', '\n', '>chr1_sliding:901-1900\n', 'TCATGGCTATTTTCATAAAAAATGGGGGTTGTGTGGCCATTTAT....\n']

headers = []    # For storing the final transformed data

for line in data:
    if not line.startswith('>') and line.strip():    # Check if a given line is valid
        headers.append(transformData(line))          # Transform the line and store it

Finally, print the results in whatever format you like:

for line in headers:
    print('header', line, sep='\n')

Output


header
13012...
header
13012...
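As an alternative to the chained replace calls in transformData, the same character-for-character mapping can be done in one pass with str.translate. This is only a sketch: the names TABLE and encode_sequence are illustrative and do not come from the answer, and the sample line is shortened.

```python
# Translation table built from the question's mapping (values as strings).
TABLE = str.maketrans({"A": "0", "T": "1", "G": "2", "C": "3"})


def encode_sequence(line):
    """Strip surrounding whitespace and map A/T/G/C to 0/1/2/3."""
    return line.strip().translate(TABLE)


encoded = encode_sequence("TCATGGCTAT\n")
print(encoded)  # 1301223101
```

Characters that are not in the table are left untouched by translate, so whether that is acceptable for other symbols in the file is a decision for the caller.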

Your dictionary is keyed by single characters, not by whole lines, so look up each character instead:

for line in lines:
    for char in line:
        # Characters missing from the dict (header text, newlines) are printed unchanged
        print(dict.get(char, char), end="")
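To make the point concrete, here is a small sketch showing that a whole line is never a key in the mapping, while its individual characters are. The name mapping and the shortened sample line are illustrative and not taken from the question.

```python
mapping = {"A": 0, "T": 1, "G": 2, "C": 3}
line = "TCATGG\n"

# The whole line (including its newline) is not a key, which is why
# the question's dict[line] always raised KeyError.
line_is_key = line in mapping
print(line_is_key)  # False

# Looking up one character at a time works.
encoded = "".join(str(mapping[ch]) for ch in line.strip())
print(encoded)  # 130122
```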

How about looping over the list, checking whether each string is all uppercase, and converting it through the dict if it is?

Here is how:

lines = [
    '>chr1_sliding:1-1000\n',
    'TCATGGCTATTTTCATAAAAAATGGGGGTTGTGTGGCCATTTATCATCGACTAGAGGCTCATAAACCTCACCCCACATATGTTTCCTTGCCATAGATTACATTCTTGGATTTCTGGTGGAAACCAT\n',
    '\n',
    '>chr1_sliding:901-1900\n',
    'TCATGGCTATTTTCATAAAAAATGGGGGTTGTGTGGCCATTTAT',
]
d = {"A": 0, "T": 1, "G": 2, "C": 3}

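for line in lines:
    line = line.strip()
    if not line:
        continue  # Skip blank separator lines
    if line.isupper():
        # Sequence line: map every letter to its digit
        print("".join(str(d[ch]) for ch in line))
    else:
        print(line)  # Header line: print unchanged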

Output:

>chr1_sliding:1-1000
130122310111130100000012222211212122330111013013203102022313010003313033330301012111331123301020110301131122011131221220003301
>chr1_sliding:901-1900
13012231011113010000001222221121212233011101
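The approaches above can be combined into one reusable function. The following is a minimal sketch under a few assumptions: the names MAPPING and encode_lines are hypothetical, header lines are taken to be those starting with ">", and characters outside the mapping (for example "N") are passed through unchanged rather than raising an error.

```python
MAPPING = {"A": "0", "T": "1", "G": "2", "C": "3"}


def encode_lines(lines):
    """Return header lines unchanged and sequence lines as digit strings."""
    result = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank separator lines
        if line.startswith(">"):
            result.append(line)  # header line, kept as-is
        else:
            # Assumption: unknown characters are left as they are.
            result.append("".join(MAPPING.get(ch, ch) for ch in line))
    return result


sample = [
    ">chr1_sliding:1-1000\n",
    "TCATGGCTAT\n",
    "\n",
    ">chr1_sliding:901-1900\n",
    "TCATNG",
]
for out in encode_lines(sample):
    print(out)
```

Because a file object is itself an iterable of lines, the same function should also accept the result of open(...) directly, without first copying the lines into a list.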
