Regex fails to exclude matches that contain a newline

all_no_numb_newline = r'(?:[^\n\d]*\n)'   # I include an extra line just to get more context
all_no_numb = r'(?:[^\n\d]*)'             # I do not want any numbers on the same line except the ID
x1 = r'(?!(1-888-555|\(888\)))'           # I am excluding a specific common phone number
x2 = r'(?![\n\/])\W{0,2}'                 # I am excluding line breaks and date formats
id_re = rf'({x1}\d(?:{x2}\d){{16}}\d)'    # An ID number 18 digits long with some symbols in between

INPUT:
'Mortgage\nID 756953480812037780'
')\n*DT756953480812037780'
'\nq75695348081 0233 240'
')\n*DT756953480812037780'
'\nq03313375233 0233 329'
'ID 676170114397739293'
'ID NUMBER 676170114397739293'
'ID\n676170114397739293'
'ID676170114397739293'

OUTPUT:
'756953480812037780'
'756953480812037780'
'75695348081 0233 240'
'756953480812037780'
'03313375233 0233 329'
'676170114397739293'
'676170114397739293'
'676170114397739293'
'676170114397739293'
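One possible reason the pattern above can still let a newline through, offered as a sketch rather than a diagnosis of the asker's real data: the lookahead in x2 only inspects the first character that \W{0,2} is about to consume, so a space followed by \n passes the check. The sample string and the x2_strict variant below are assumptions made up for illustration; they do not come from the original question.

```python
import re

x1 = r'(?!(1-888-555|\(888\)))'
x2 = r'(?![\n\/])\W{0,2}'              # lookahead guards only the first separator character
x2_strict = r'(?:(?![\n\/])\W){0,2}'   # hypothetical variant: guards every separator character

id_re = rf'({x1}\d(?:{x2}\d){{16}}\d)'
id_re_strict = rf'({x1}\d(?:{x2_strict}\d){{16}}\d)'

# Made-up sample: 18 digits split by a space followed by a newline.
sample = '12345678 \n9012345678'

loose = re.search(id_re, sample)
strict = re.search(id_re_strict, sample)

# The original x2 consumes ' \n' as a two-character separator.
print(repr(loose.group(1)) if loose else None)
# The stricter variant refuses to step over the newline.
print(strict)
```

With the lookahead moved inside the repeated group, each separator character is checked individually, so inputs such as '\nq75695348081 0233 240' still match while the newline-split sample does not.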

2 Answers

User
#1 · edited 2024-10-03 00:27:52

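I don't think Ryszard's answer excludes \n. I went with a hackier approach:

YY = r'(?!-888-)'
XX = r'[^A-Za-z\d\n\\\/\)\(]{0,2}'
id_re = rf'({YY}\d(?:{XX}\d|\d{XX}){{16}}\d)'

YY is meant to rule out the common phone number. XX allows up to two separator characters that are not letters, digits, or \n. No matter how many lookaheads or lookbehinds I tried, the other approaches still let \n through. So I took a simpler, more manual route: exclude all alphanumerics and \n outright, plus the extra symbols (slashes and parentheses) that would confuse an ID with a date or a phone number.
This regex worked very well; I got almost 99% of the matches.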
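A minimal sketch of how this answer's pattern behaves, assuming Python's re module. The helper name extract_id and the last sample (digits split by a bare newline) are made up for illustration; the other samples come from the question.

```python
import re

YY = r'(?!-888-)'
XX = r'[^A-Za-z\d\n\\\/\)\(]{0,2}'
id_re = rf'({YY}\d(?:{XX}\d|\d{XX}){{16}}\d)'


def extract_id(text):
    """Return the first ID-like match in text, or None if there is none."""
    m = re.search(id_re, text)
    return m.group(1) if m else None


samples = [
    'Mortgage\nID 756953480812037780',
    '\nq75695348081 0233 240',
    'ID\n676170114397739293',
    '12345678\n9012345678',  # made-up: 18 digits split by a newline
]
for s in samples:
    print(repr(s), '->', repr(extract_id(s)))
```

Because \n is listed inside the negated character class, a newline can never be consumed as a separator, so the last sample yields None while the space-separated ID is still found.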

User
#2 · edited 2024-10-03 00:27:52

Use
(?<!\d)\d(?:\s*\d){16}\d(?!\d)
See proof.
Explanation
(?<!        look behind to see if there is not:
  \d          digits (0-9)
)           end of look-behind
\d          digits (0-9)
(?:         group, but do not capture (16 times):
  \s*         whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
  \d          digits (0-9)
){16}       end of grouping
\d          digits (0-9)
(?!         look ahead to see if there is not:
  \d          digits (0-9)
)           end of look-ahead
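A short sketch of this pattern in Python, assuming the standard re module. The helper name find_id and the last two samples are made up for illustration. Note that \s also matches \n, so a run of 18 digits broken by a newline is still matched as one ID.

```python
import re

id_re = re.compile(r'(?<!\d)\d(?:\s*\d){16}\d(?!\d)')


def find_id(text):
    """Return the first 18-digit run (whitespace allowed between digits), or None."""
    m = id_re.search(text)
    return m.group(0) if m else None


samples = [
    'ID 676170114397739293',
    '\nq75695348081 0233 240',
    '12345678\n9012345678',   # made-up: 18 digits split by a newline
    '1234567890123456789',    # made-up: 19 digits, rejected by the lookarounds
]
for s in samples:
    print(repr(s), '->', repr(find_id(s)))
```

If matches spanning a line break are unwanted, one option (an assumption, not part of the answer) is to replace \s* with a class that leaves out \n, for example [ \t]*.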
