Using regex to extract a substring from the file names given by os.walk

Posted 2024-10-08 18:31:00


I'm basically trying to get 3 pieces of information out of this os.walk:

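  1. Is there a folder with the name unit in it? If so, I want to know the contents of the folder.
  2. Within those contents, is there a folder name with the format: \d\d\d\d\d\d_DAY\d\d? If so, I want to extract the first (\d\d\d\d\d\d) and save it as date.
  3. Within that folder tree, is there an MXF file? If so, move the contents of the previous folder to: 'Users/davealterman/Desktop/Volumes/HOW_TO_OCM/RAID OCM/FS4/' + 'DATE'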

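I'm new to coding and this has been a headache. Any help would be greatly appreciated. I know this code doesn't make sense, but I'm a bit frustrated.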


import os, glob, re, shutil 
from pathlib import Path

FS5_path = 'Users/davealterman/Desktop/Volumes/HOW_TO_OCM/RAID OCM/FS4'

home_path = '/Users/davealterman/Desktop/Volumes/HOW_TO_OCM/_FROM PRODUCTION'

os.chdir(home_path)

subList = []
i = -1
for dirs, subs, files in os.walk(home_path):

    for sub in subs:
        print(sub)
        subList.append(sub)
        i + 1
        formatRegex = re.compile(r'(\d{6})(_DAY)(\d{2})')
        mo = formatRegex.search(sub)
        mo.group()


1 answer
User
#1 · Posted 2024-10-08 18:31:00

Try this:

import os
import re
import shutil

FS5_path = 'Users/davealterman/Desktop/Volumes/HOW_TO_OCM/RAID OCM/FS4'

home_path = '/Users/davealterman/Desktop/Volumes/HOW_TO_OCM/_FROM PRODUCTION'

os.chdir(home_path)

subList = []
i = -1
for dirs, subs, files in os.walk(home_path):
    # Is there a folder with the name unit in it? If so, I want to know the contents of the folder.
    
    # filter folders containing `unit`
    searching_for = 'unit'
    matched_folders = filter(lambda folder_name: searching_for in folder_name, subs)    
    for folder in matched_folders:
        print(
            os.listdir(
                os.path.join(dirs, folder)  # names in subs are relative to the directory being walked
            )
        )
    
    # Within those contents, is there a folder name with the format: \d\d\d\d\d\d_DAY\d\d? If so, I want to extract the first (\d\d\d\d\d\d) and save it as date.
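    date_regex = re.compile(r'(\d{6})_DAY\d{2}')

    # MXF files sit somewhere below the dated folder, so look for the date
    # in the path of the directory currently being walked.
    date_match = date_regex.search(dirs)
    if date_match is None:
        continue
    date = date_match[1]  # group 1 is the six-digit date; group 0 would be the whole match

    # Within that tree, is there an MXF file? If so, move it to FS5_path/<date>.
    mxf_regex = re.compile(r'.*\.mxf', re.IGNORECASE)
    mxf_files = filter(lambda file: mxf_regex.fullmatch(file), files)
    for file in mxf_files:
        dest_dir = os.path.join(FS5_path, date)
        os.makedirs(dest_dir, exist_ok=True)
        # files holds bare names, so build the full source path from dirs
        shutil.move(os.path.join(dirs, file), dest_dir)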
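
If it helps, here is the same idea as a small self-contained function using pathlib, which is easier to try out on a throwaway folder first. Treat it as a sketch under some assumptions rather than a drop-in solution: the function name `move_mxf_by_date` and its return value are made up for illustration, it only moves the MXF files themselves (not everything else in the folder), and it does not check for a `unit` folder.

```python
import re
import shutil
from pathlib import Path

# Captures the six-digit date from folder names such as "241008_DAY01".
DATE_RE = re.compile(r"(\d{6})_DAY\d{2}")


def move_mxf_by_date(source_root, dest_root):
    """Move each .mxf file found under a dated folder into dest_root/<date>.

    Hypothetical helper for illustration; returns the list of new paths.
    """
    source_root = Path(source_root)
    dest_root = Path(dest_root)
    moved = []
    # sorted() builds the full listing up front, so moving files
    # does not disturb the traversal.
    for path in sorted(source_root.rglob("*")):
        if not path.is_file() or path.suffix.lower() != ".mxf":
            continue
        # Look for a dated folder among the file's ancestors, nearest first.
        date = None
        for parent in path.relative_to(source_root).parents:
            match = DATE_RE.fullmatch(parent.name)
            if match:
                date = match.group(1)
                break
        if date is None:
            continue  # not inside a dated folder, so leave the file alone
        target_dir = dest_root / date
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

You would call it as `move_mxf_by_date(home_path, FS5_path)`. Two files with the same name under the same date would end up at the same destination path, so add a check for that if it can happen in your footage.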
