Python regex: refer back to the first line on every match until a new group starts

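2 answers

User

Answer 1 · edited 2024-06-26 12:57:45

IIUC, you are trying to combine Header(A|B) with the integers in the lines below it. For the given output, a simple split() is probably easier than using re.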

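# `text` is the multi-line input string from the question
for group in text.split('This is ')[1:]:  # [1:] skips the empty piece before the first header
    header, *lines = group.splitlines()
    # append the last whitespace-separated token of each line to the header
    print(*[header+line.split()[-1] for line in lines])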

Output:

HeaderA1 HeaderA2 HeaderA3 HeaderA4 HeaderA5
HeaderB1 HeaderB2
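
The snippet above relies on a `text` variable that it does not define. As a minimal self-contained sketch of the same split() idea, the version below wraps it in a function and returns lists instead of printing. The function name `combine_headers`, the `marker` parameter, and the `sample` string are illustrative assumptions, not part of the original answer.

```python
def combine_headers(text, marker="This is "):
    """Return one list per header, e.g. ['HeaderA1', 'HeaderA2', ...].

    Hypothetical helper: assumes every group starts with `marker`
    and every following line ends with the token to append.
    """
    result = []
    # The first piece is whatever precedes the first marker, so skip it.
    for group in text.split(marker)[1:]:
        header, *lines = group.splitlines()
        # Ignore blank lines; take the last whitespace-separated token.
        result.append([header + line.split()[-1] for line in lines if line.strip()])
    return result


sample = """This is HeaderA
 Line 1
 Line 2
 Line 3
 Line 4
 Line 5
This is HeaderB
 Line 1
 Line 2"""

for row in combine_headers(sample):
    print(*row)
```

Returning a list of lists keeps the grouping available for further processing; printing each row with `print(*row)` gives the same space-separated lines as the original snippet.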

User

Answer 2 · edited 2024-06-26 12:57:45

A mix of regex substitution and str.format.

This assumes there is always a line below each header.

import re
text = """This is HeaderA
 Line 1
 Line 2
 Line 3
 Line 4
 Line 5
This is HeaderB
 Line 1
 Line 2"""

ordered_matches = [] # global

def custom_match(m, all_matches=ordered_matches):
    p = m.group(0)
    if p.isdigit():
        all_matches[-1] += [p]
    else:
        all_matches += [[p]]
    return '' # doesn't matter

r = re.sub(r'([A-Z0-9]+)$', custom_match, text, flags=re.M)

for m in ordered_matches:
    print(('Header{}{{}} '.format(m[0]) * (len(m)-1)).format(*m[1:]))

Output:

HeaderA1 HeaderA2 HeaderA3 HeaderA4 HeaderA5 
HeaderB1 HeaderB2
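
If the side effect inside the re.sub callback feels indirect, one possible alternative is to scan the text line by line and remember the most recent header. This is only a sketch under the assumption that headers look like "This is <name>" and that every other relevant line ends in an integer; the names `HEADER`, `NUMBER`, `group_by_header`, and `sample` are made up for illustration.

```python
import re

# Assumed input shape: a header line "This is <name>", then lines ending in digits.
HEADER = re.compile(r"^This is (\w+)$")
NUMBER = re.compile(r"(\d+)$")


def group_by_header(text):
    """Pair each trailing integer with the most recent header seen."""
    groups = []  # list of (header_name, [numbers]) in input order
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            # A new header starts a new group.
            groups.append((header.group(1), []))
            continue
        number = NUMBER.search(line)
        if number and groups:
            # Attach the integer to the current (last) header.
            groups[-1][1].append(number.group(1))
    return [[name + n for n in numbers] for name, numbers in groups]


sample = """This is HeaderA
 Line 1
 Line 2
 Line 3
 Line 4
 Line 5
This is HeaderB
 Line 1
 Line 2"""

for row in group_by_header(sample):
    print(" ".join(row))
```

Lines that appear before any header, or that do not end in digits, are silently skipped here; whether that is the right behaviour depends on the real input.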
