Python regex to reorder parts of a filename: remove duplicates from the name and split into groups

import os
import re

PNoRegex = re.compile(r"""^(.*?)                # all text before the part number
    (PNo\s\d{4}|PNo\d{4}|Pno\s\d{4}|Pno\d{4})   # part number details
    \s*                                         # remove white space after PNo string
    (.*)$                                       # all text after Part No
    """, re.VERBOSE)

for originalFile in os.listdir('.'):
    fileNameText = PNoRegex.search(originalFile)
    # Skip files without a Regex match
    if fileNameText is None:
        continue
    # separate the groups
    beforePNo = fileNameText.group(1)
    PNo = fileNameText.group(2)
    afterPNo = fileNameText.group(3)
    # Form the reordered filename.
    newFileName = PNo + ' - ' + beforePNo + afterPNo
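A minimal sketch of how the loop above could be tried without touching any files: it applies the question's pattern to a list of names and returns (old, new) pairs. The helper name `plan_renames` and the sample filenames are hypothetical, made up for illustration; they are not the asker's real data.

```python
import re

# Same pattern as in the question, written in verbose mode.
PNO_REGEX = re.compile(r"""^(.*?)               # all text before the part number
    (PNo\s\d{4}|PNo\d{4}|Pno\s\d{4}|Pno\d{4})   # part number details
    \s*                                         # white space after the PNo string
    (.*)$                                       # all text after the part number
    """, re.VERBOSE)


def plan_renames(names):
    """Return (old_name, new_name) pairs for names that contain a part number.

    Hypothetical helper: it only computes names, it does not rename anything.
    """
    plan = []
    for name in names:
        match = PNO_REGEX.search(name)
        if match is None:
            continue  # skip names without a part number
        before, pno, after = match.groups()
        plan.append((name, pno + ' - ' + before + after))
    return plan


# Hypothetical sample names.
for old, new in plan_renames(["Bracket PNo 1234 left.txt", "notes.txt"]):
    print(old, "->", new)
```

Once the printed pairs look right, each pair could be passed to `os.rename` to perform the actual rename.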

2 answers

User

#1 · edited 2024-09-30 14:23:38

Use re.sub:

re.sub(r'(?i)^(.*?)\s*(PNo)\s*(\d{4})\s*(?:-\s*)?(.*)$', r'\2 \3 - \1 \4', string)

See proof
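As a quick check, the substitution can be run on a few names. This is only a sketch: the pattern and replacement are copied from the line above, while the wrapper function `reorder` and the sample filenames are hypothetical.

```python
import re

# Pattern and replacement copied from the answer.
PATTERN = r'(?i)^(.*?)\s*(PNo)\s*(\d{4})\s*(?:-\s*)?(.*)$'
REPLACEMENT = r'\2 \3 - \1 \4'


def reorder(name):
    """Move the part number to the front; names that do not match come back unchanged."""
    return re.sub(PATTERN, REPLACEMENT, name)


# Hypothetical sample names.
samples = [
    "Bracket PNo 1234 - left.txt",
    "Bracket pno1234 left.txt",
    "notes.txt",
]
for sample in samples:
    print(sample, "->", reorder(sample))
```

Because re.sub returns the input string untouched when nothing matches, no separate "skip" branch is needed here; comparing the result with the original name tells you whether a rename is required.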

Explanation:

NODE                     EXPLANATION
                                        
  (?i)                     set flags for this block (case-
                           insensitive) (with ^ and $ matching
                           normally) (with . not matching \n)
                           (matching whitespace and # normally)
                                        
  ^                        the beginning of the string
                                        
  (                        group and capture to \1:
                                        
    .*?                      any character except \n (0 or more times
                             (matching the least amount possible))
                                        
  )                        end of \1
                                        
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
                                        
  (                        group and capture to \2:
                                        
    PNo                      'PNo'
                                        
  )                        end of \2
                                        
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
                                        
  (                        group and capture to \3:
                                        
    \d{4}                    digits (0-9) (4 times)
                                        
  )                        end of \3
                                        
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
                                        
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
                                        
    -                        '-'
                                        
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
                                        
  )?                       end of grouping
                                        
  (                        group and capture to \4:
                                        
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
                                        
  )                        end of \4
                                        
  $                        before an optional \n, and the end of the
                           string
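The breakdown above can also be written directly into the pattern with re.VERBOSE, which makes it easy to inspect what each group captures. A sketch under assumptions: the comments paraphrase the table, and `split_name` and the sample filename are hypothetical.

```python
import re

# The same pattern as in the answer, spread out with one comment per node.
PNO_VERBOSE = re.compile(r"""
    ^(.*?)       # group 1: text before the part number, as short as possible
    \s*          # optional whitespace
    (PNo)        # group 2: the letters PNo (any case, because of IGNORECASE)
    \s*          # optional whitespace
    (\d{4})      # group 3: exactly four digits
    \s*          # optional whitespace
    (?:-\s*)?    # optional hyphen separator, not captured
    (.*)$        # group 4: everything that follows
    """, re.IGNORECASE | re.VERBOSE)


def split_name(name):
    """Return the four captured groups as a tuple, or None if there is no match."""
    match = PNO_VERBOSE.match(name)
    return match.groups() if match else None


# Hypothetical sample name.
print(split_name("Bracket PNo 1234 - left.txt"))
```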

User

#2 · edited 2024-09-30 14:23:38

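You can shorten the alternation to (P[nN]o)\s?(\d{4}) by using a character class and matching an optional whitespace character.

If there can be a space between PNo and the digits, you can use 2 capture groups instead of 1.

To match the optional hyphen, you can extend the pattern with the character class [-\s]*, which matches any run of whitespace characters or hyphens.

This yields separate groups for each part of the names in the current example data: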

^(.*?)(P[nN]o)\s?(\d{4})[-\s]*(.*)$

Regex demo
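A short sketch of how the groups from this pattern come out. The pattern is copied from the answer; the function `split_parts` and the sample filenames are hypothetical. Note that group 1 keeps its trailing space here, and that a lowercase "pno" does not match because only the second letter is covered by the character class.

```python
import re

# Pattern copied from the answer.
PATTERN = re.compile(r'^(.*?)(P[nN]o)\s?(\d{4})[-\s]*(.*)$')


def split_parts(name):
    """Return (before, prefix, digits, after), or None if the name does not match."""
    match = PATTERN.match(name)
    return match.groups() if match else None


# Hypothetical sample names.
for sample in ["Bracket PNo 1234 - left.txt", "Bracket Pno1234-left.txt", "Bracket pno1234.txt"]:
    print(repr(sample), "->", split_parts(sample))

# One possible way to assemble the new name from the groups (an assumption,
# the answer itself does not show this step).
before, prefix, digits, after = split_parts("Bracket PNo 1234 - left.txt")
print(f"{prefix} {digits} - {before.strip()} {after}")
```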
