Splitting a string into a list of integers with Python

    unimportant information--------
    unimportant information--------
    -blank line
    1 F -1 2 -3 4 5 6 7 (more columns of ints)
    2 L 3 -1 3 4 0 -2 1 (more columns of ints)
    3 A 3 -1 3 6 0 -2 5 (more columns of ints)
    -blank line
    unimportant information--------
    unimportant information--------

    def pssmMatrix(self,ipFileName,directory):
        dir = directory
        filename = ipFileName
        my_lst = []
        #takes every file in fasta folder and put in files list
        for f in os.listdir(dir):
            #splits the file name into file name and its extension
            file, file_ext = os.path.splitext(f)
            if file == ipFileName:
                with open(os.path.join(dir,f)) as file_object:
                    for _ in range(3):
                        next(file_object)
                    for line in file_object:
                        my_lst.append(' '.join(line.strip().split()))
        return my_lst

2 Answers

User

#1 · Edited 2024-10-03 04:30:41

Try this approach:

    import re

    # Match the run of integers that follows a "<row number> <letter> " prefix.
    reg = re.compile(r'(?<=[0-9]\s[A-Z]\s)[0-9\-\s]+')

    text = """
    unimportant information

    unimportant information
    -blank line

    1 F -1 2 -3 4 5 6 7 (more columns of ints)

    2 L 3 -1 3 4 0 -2 1 (more columns of ints)

    3 A 3 -1 3 6 0 -2 5 (more columns of ints)"""

    ignore_start = 5  # skip the header lines (indexes 0-4)
    expected_array = []
    for index, line in enumerate(text.splitlines()):
        if index >= ignore_start:
            match = reg.search(line)
            if match:
                # Split on whitespace and convert each token to int
                expected_array.append([int(tok) for tok in match.group(0).split()])

    print(expected_array)
    # Result:
    # [[-1, 2, -3, 4, 5, 6, 7],
    #  [3, -1, 3, 4, 0, -2, 1],
    #  [3, -1, 3, 6, 0, -2, 5]]
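
For comparison, the same rows can probably be parsed without a regular expression. This is only a sketch: the helper name `parse_row` is made up for illustration. It assumes every data row starts with a row number followed by a single uppercase letter, and that the integer columns end at the first token that is not an integer (such as the "(more columns of ints)" placeholder in the sample).

```python
def parse_row(line):
    """Return the integer columns of a data row, or None for any other line.

    Hypothetical helper: assumes rows look like "<row number> <LETTER> <ints...>".
    """
    tokens = line.split()
    # A data row needs at least the row number and the letter.
    if len(tokens) < 2 or not tokens[0].isdigit():
        return None
    if not (len(tokens[1]) == 1 and tokens[1].isalpha() and tokens[1].isupper()):
        return None
    values = []
    for tok in tokens[2:]:
        try:
            values.append(int(tok))
        except ValueError:
            break  # stop at the first non-integer token
    return values


rows = [
    "unimportant information",
    "",
    "1 F -1 2 -3 4 5 6 7",
    "2 L 3 -1 3 4 0 -2 1",
]
parsed = [r for r in (parse_row(line) for line in rows) if r is not None]
print(parsed)  # [[-1, 2, -3, 4, 5, 6, 7], [3, -1, 3, 4, 0, -2, 1]]
```

Splitting on whitespace first means multi-digit row numbers and irregular spacing need no special handling, at the cost of a slightly longer function.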

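User

#2 · Edited 2024-10-03 04:30:41

OK, as I understand it, you have a file that contains the lines you want, and those lines always start with a number followed by a letter. So we can apply a regular expression that matches only the lines with that pattern and captures only the numbers that come after it.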

The expression looks like this: (?<=[0-9]\s[A-Z]\s)[0-9\-\s]+
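
To see what each part of the expression does, here is the same pattern written with `re.VERBOSE` and applied to one sample line. The sample line with a two-digit row number is an assumption added for illustration; the pattern itself is unchanged.

```python
import re

reg = re.compile(r"""
    (?<=[0-9]\s[A-Z]\s)   # lookbehind: a digit, whitespace, an uppercase letter, whitespace
    [0-9\-\s]+            # then a run of digits, minus signs and whitespace
""", re.VERBOSE)

line = "12 F -1 2 -3 4 5 6 7 (more columns of ints)"
match = reg.search(line)
print(repr(match.group(0)))  # '-1 2 -3 4 5 6 7 '
print(reg.search("unimportant information") is None)  # True
```

The lookbehind only checks the last digit of the row number, so multi-digit row numbers still match. The captured text keeps its trailing whitespace, so it needs a `strip()` or `split()` before use.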

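import re

reg = re.compile(r'(?<=[0-9]\s[A-Z]\s)[0-9\-\s]+')

for line in file_object:  # the open file handle from the question
    match = reg.search(line)
    if match:
        # Use the result: split on whitespace and convert each number to int
        my_lst.append([int(tok) for tok in match.group(0).split()])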
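
Combined with the file handling from the question, a minimal sketch might look like the following. The function name `read_int_rows` and the `skip` parameter are hypothetical. The three skipped header lines come from the question's `for _ in range(3): next(file_object)` loop, and the code assumes the file is plain text in the layout shown in the question.

```python
import re
from itertools import islice

ROW = re.compile(r'(?<=[0-9]\s[A-Z]\s)[0-9\-\s]+')


def read_int_rows(path, skip=3):
    """Hypothetical helper: return one list of ints per data row in the file."""
    rows = []
    with open(path) as file_object:
        # Skip the leading header lines, then scan the rest.
        for line in islice(file_object, skip, None):
            match = ROW.search(line)
            if match:
                rows.append([int(tok) for tok in match.group(0).split()])
    return rows
```

Using `islice` instead of repeated `next()` calls avoids a `StopIteration` error when a file has fewer lines than the header count. A stray token such as a lone "-" would still make `int()` raise `ValueError`, so real input may need extra validation.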

Hope this helps.
