Regular expressions and formatting in Python

INPUT = [
    'ABCD , D.O.B: - Jun/14/1999.',
    'EFGH , DOB; - Jan/10/1998,',
    'IJKL , D-O-B - Jul/15/1985..',
    'MNOP , (DOB)* - Dec/21/1999,',
    'QRST , *DOB* - Apr/01/2000.',
    'UVWX , D O B, - Feb/11/2001 '
]

import re

INPUT = [
    'ABCD , D.O.B: - Jun/14/1999.',
    'EFGH , DOB; - Jan/10/1998,',
    'IJKL , D-O-B - Jul/15/1985..',
    'MNOP , (DOB)* - Dec/21/1999,',
    'QRST , *DOB* - Apr/01/2000.',
    'UVWX , D O B, - Feb/11/2001 '
]

def formatted_def(input):
    for n in input:
        t = re.sub('[^a-zA-Z0-9 ]+', '', n).split('DOB')
        print(t)

formatted_def(INPUT)

3 Answers

User

#1 · Edited 2024-06-24 11:45:32

In addition to the other answers, you can also use re.sub:

import re

INPUT = [
    'ABCD , D.O.B: - Jun/14/1999.',
    'EFGH , DOB; - Jan/10/1998,',
    'IJKL , D-O-B - Jul/15/1985..',
    'MNOP , (DOB)* - Dec/21/1999,',
    'QRST , *DOB* - Apr/01/2000.',
    'UVWX , D O B, - Feb/11/2001 '
]

pattern = r'(?i)^([a-z]+).*([a-z]{3}/\d{2}/\d{4}).*$'

OUTPUT = [re.sub(pattern, r'\1, \2', x) for x in INPUT]

# OUTPUT:

[
    'ABCD, Jun/14/1999',
    'EFGH, Jan/10/1998',
    'IJKL, Jul/15/1985',
    'MNOP, Dec/21/1999',
    'QRST, Apr/01/2000',
    'UVWX, Feb/11/2001'
]

User

#2 · Edited 2024-06-24 11:45:32

You can use re.findall:

import re
l = ['ABCD , D.O.B: - Jun/14/1999.', 'EFGH , DOB; - Jan/10/1998,', 'IJKL , D-O-B - Jul/15/1985..', 'MNOP , (DOB)* - Dec/21/1999,', 'QRST , *DOB* - Apr/01/2000.', 'UVWX , D O B, - Feb/11/2001 ']
final_data = [', '.join(re.findall(r'^\w+|[a-zA-Z]+/\d+/\d+(?=\W)', i)) for i in l]

Output:

['ABCD, Jun/14/1999', 'EFGH, Jan/10/1998', 'IJKL, Jul/15/1985', 'MNOP, Dec/21/1999', 'QRST, Apr/01/2000', 'UVWX, Feb/11/2001']

User

#3 · Edited 2024-06-24 11:45:32

import re
re.findall(r'(\w+)\s+,.*?-\s+([^., ]*)', ' '.join(INPUT))
# [('ABCD', 'Jun/14/1999'), ('EFGH', 'Jan/10/1998'), ('IJKL', 'Jul/15/1985'), ('MNOP', 'Dec/21/1999'), ('QRST', 'Apr/01/2000'), ('UVWX', 'Feb/11/2001')]
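
One thing the answers above leave implicit is what happens when a line does not match: re.sub returns the line unchanged and re.findall returns an empty list. Below is a minimal sketch, not taken from any of the answers, that makes the no-match case explicit by returning None. The names RECORD and extract are hypothetical, and the pattern assumes the same shape as the first answer (a leading alphabetic name, and a date written as three letters, two digits, four digits).

```python
import re

# Hypothetical names for illustration. The pattern assumes a leading
# alphabetic name and a date shaped like Mon/DD/YYYY somewhere after it.
RECORD = re.compile(r'^\s*([A-Za-z]+).*?([A-Za-z]{3}/\d{2}/\d{4})')

def extract(line):
    """Return 'NAME, Mon/DD/YYYY', or None when the line does not match."""
    m = RECORD.search(line)
    return f'{m.group(1)}, {m.group(2)}' if m else None

INPUT = [
    'ABCD , D.O.B: - Jun/14/1999.',
    'EFGH , DOB; - Jan/10/1998,',
    'IJKL , D-O-B - Jul/15/1985..',
    'MNOP , (DOB)* - Dec/21/1999,',
    'QRST , *DOB* - Apr/01/2000.',
    'UVWX , D O B, - Feb/11/2001 '
]

OUTPUT = [extract(x) for x in INPUT]
```

Compiling the pattern once keeps the list comprehension short, and filtering out None afterwards is enough to drop malformed lines.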
