Positive lookbehind: split the words on \t, up to \n

#position 4450 4452 4455 4465 4476 4496 D110 D111 D112 D114 D116 D118 D23 D24 D27 D29 D30 D56 D59 D69 D85 D88 D90 D91 JW1 JW10 JW15 JW22 JW28 JW3 JW35 JW39 JW43 JW45 JW47 JW49 JW5 JW52 JW54 JW56 JW57 JW59 JW66 JW7 JW70 JW75 JW77 JW9 REF_OR74A

import re

file = open('src.txt','r')
f = list(file)
file.close()

pattern = '(?<=#position).*'
regex = re.compile(pattern)
regex.findall(''.join(f))
['\t4450\t4452\t4455\t4465\t4476\t4496\tD110\tD111\tD112\tD114\tD116\tD118\tD23\tD24\tD27\tD29\tD30\tD56\tD59\tD69\tD85\tD88\tD90\tD91\tJW1\tJW10\tJW15\tJW22\tJW28\tJW3\tJW35\tJW39\tJW43\tJW45\tJW47\tJW49\tJW5\tJW52\tJW54\tJW56\tJW57\tJW59\tJW66\tJW7\tJW70\tJW75\tJW77\tJW9\tREF_OR74A']

1 Answer

User

#1 · Posted on 2024-10-01 17:34:46

Do you need a regular expression at all? List slicing and string methods don't look nearly as messy as you make them sound.

For example:

fields = []
with open('src.txt', 'r') as f:  # the file is closed automatically on exit
    for line in f:
        if line.startswith("#position"):
            fields = line.split()  # with no arguments it splits on all whitespace characters
            fields = fields[1:]    # get rid of the "#position" tag
            break

Manipulate the list further from there.
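If a regex is still preferred, the lookbehind from the question can be combined with a plain split on tabs. The following is a minimal sketch, assuming the fields are separated by single tab characters as in the question's output; the function name `position_fields` and the shortened sample text are made up for illustration.

```python
import re

def position_fields(text):
    """Return the tab-separated fields that follow '#position' on its line.

    Hypothetical helper for illustration; returns [] if no such line exists.
    """
    # (?<=#position\t) is a fixed-width positive lookbehind. By default '.'
    # does not match '\n', so '.*' stops at the end of that line.
    match = re.search(r"(?<=#position\t).*", text)
    return match.group(0).split("\t") if match else []

# Shortened sample modeled on the line in the question.
sample = "#position\t4450\t4452\tD110\tJW1\tREF_OR74A\nsome\tother\tline\n"
print(position_fields(sample))
```

Putting the tab inside the lookbehind keeps the leading `\t` out of the match, so the split does not produce an empty first element.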
