How do I write a generic/flexible regular expression in Python?

Posted 2024-09-29 19:34:08


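I'm learning regex. People may or may not have a middle name, and I want to write one flexible regular expression that I can compile and reuse for both cases, but I haven't managed to. Any suggestions or help would be appreciated. Here is my regular expression for a name without a middle name: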

import re
p = re.compile(r"\W+\s+(?P<firstname>\w+)\s+(?P<lastname>\w+)")
name = "John Drell"
m = p.search(name)

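Names without a middle name give me no trouble. However, I can't write a correct, flexible pattern for names that may or may not have a middle name. Here is one of my attempts: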

import re
p = re.compile(r"\W+\s+(?P<firstname>\w+)\s+(?:P<middlename>[A-Z]*)(?P<lastname>\w+)")
name = "John M. Drell"
m = p.search(name)

This script only works for names that have a middle name; otherwise I get the error message: 'NoneType' object has no attribute 'groups'

I'd be grateful if you could point out my mistake.


1 answer
User
#1 · Posted 2024-09-29 19:34:08

Use split():

names = ["John M. Drell", "John Drell"]
for name in names:
    firstname, *middlenames, lastname = name.split()
    print(f'First name: {firstname}, Middle name(s): {" ".join(middlenames)}, Last name: {lastname}')


With regex, learn to use optional groups and \S, which matches any non-whitespace character:

^(?P<firstname>\S+)(?:\s+(?P<middlename>\S+(?: +\S+)*))?\s+(?P<lastname>\S+)$
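
As a rough sketch of how the pattern above could be used from Python: the names NAME_RE and parse_name are hypothetical, not from the question. The check for None matters because search() returns None when nothing matches, which is where the "'NoneType' object has no attribute 'groups'" error comes from.

```python
import re

# Pattern copied from the answer; NAME_RE and parse_name are illustrative names.
NAME_RE = re.compile(
    r"^(?P<firstname>\S+)(?:\s+(?P<middlename>\S+(?: +\S+)*))?\s+(?P<lastname>\S+)$"
)

def parse_name(name):
    """Return a dict of the named groups, or None if the pattern does not match."""
    m = NAME_RE.search(name)
    if m is None:
        # No match: return None instead of calling .groups() on None.
        return None
    # groupdict() maps each named group to its text; an unmatched
    # optional group (here, middlename) maps to None.
    return m.groupdict()

for name in ["John M. Drell", "John Drell", "John"]:
    print(name, "->", parse_name(name))
```

A single-word input such as "John" does not match, because the pattern requires at least a first and a last name separated by whitespace.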


Explanation

                                        
  ^                        the beginning of the string
                                        
  (?P<firstname>           group and capture to "firstname":
                                        
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
                                        
  )                        end of "firstname"
                                        
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
                                        
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
                                        
    (?P<middlename>          group and capture to "middlename":
                                        
      \S+                      non-whitespace (all but \n, \r, \t,
                               \f, and " ") (1 or more times
                               (matching the most amount possible))
                                        
      (?:                      group, but do not capture (0 or more
                               times (matching the most amount
                               possible)):
                                        
         +                       ' ' (1 or more times (matching the
                                 most amount possible))
                                        
        \S+                      non-whitespace (all but \n, \r, \t,
                                 \f, and " ") (1 or more times
                                 (matching the most amount possible))
                                        
      )*                       end of grouping
                                        
    )                        end of "middlename"
                                        
  )?                       end of grouping
                                        
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
                                        
  (?P<lastname>            group and capture to "lastname":
                                        
    \S+                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (1 or more times (matching the
                             most amount possible))
                                        
  )                        end of "lastname"
                                        
  $                        before an optional \n, and the end of the
                           string
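
The breakdown above can also be kept next to the pattern itself by compiling it with re.VERBOSE. This is only a sketch, and VERBOSE_NAME_RE is a hypothetical name. One assumption worth noting: in verbose mode unescaped whitespace in the pattern is ignored, so the literal space from the original is written as [ ] here.

```python
import re

# Same pattern as in the answer, laid out with re.VERBOSE so that each
# piece carries its own comment. VERBOSE_NAME_RE is an illustrative name.
VERBOSE_NAME_RE = re.compile(
    r"""
    ^                            # beginning of the string
    (?P<firstname> \S+ )         # first name: one or more non-whitespace chars
    (?:                          # optional, non-capturing middle part
        \s+                      # whitespace after the first name
        (?P<middlename>
            \S+ (?: [ ]+ \S+ )*  # one or more middle names, space separated
        )
    )?
    \s+                          # whitespace before the last name
    (?P<lastname> \S+ )          # last name: one or more non-whitespace chars
    $                            # end of the string (or before a final newline)
    """,
    re.VERBOSE,
)

m = VERBOSE_NAME_RE.search("John M. Drell")
if m is not None:
    print(m.group("firstname"), m.group("middlename"), m.group("lastname"))
```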
