Recursively parsing folders and subfolders

Posted 2024-10-03 23:27:16

Suppose I have the following file/folder structure:

Start Folder
+-Sub1
| +-SubSub1
| +-File1 (100 bytes)
+-Sub2
| +-File2 (200 bytes)
+-Sub3
| +-File3 (300 bytes)
| +-File4 (400 bytes)
+-Sub4
  +-File5 (500 bytes)
  +-SubSub2
    +-File6 (600 bytes)

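I need to convert this into a file in a specific format, made up of "blocks" with the following requirements. Each block starts with a 20-byte header, and the header stores the block's size (header + data). The data of a folder block consists of the file blocks and subfolder blocks that the folder contains.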

For example:

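File6 Block = 620 bytes (20 header + 600 data)
SubSub2 Block = 640 bytes (20 header + 620 File6 Block)
File5 Block = 520 bytes (20 header + 500 data)
Sub4 Block = 1180 bytes (20 header + 520 File5 Block + 640 SubSub2 Block)
Sub3 Block = 760 bytes (20 header + 320 File3 Block + 420 File4 Block)
Sub2 Block = 240 bytes (20 header + 220 File2 Block)
Sub1 Block = 160 bytes (20 header + 20 SubSub1 Block + 120 File1 Block)
Start Folder Block = 2360 bytes (20 header + 160 + 240 + 760 + 1180)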
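
The arithmetic above can be checked with a small sketch. It models the example tree as a nested dict (a hypothetical in-memory stand-in for the real folders, where a dict is a folder and an int is a file's data size) and applies the "20-byte header + contents" rule recursively. The names `tree` and `block_size` are illustrative only.

```python
HEADER_SIZE = 20  # bytes per block header, as stated in the question

# Hypothetical in-memory model of the example tree:
# a dict is a folder, an int is a file's data size in bytes.
tree = {
    "Sub1": {"SubSub1": {}, "File1": 100},
    "Sub2": {"File2": 200},
    "Sub3": {"File3": 300, "File4": 400},
    "Sub4": {"File5": 500, "SubSub2": {"File6": 600}},
}


def block_size(node):
    """Return the total block size (header + data) of a file or folder."""
    if isinstance(node, int):
        # File block: header plus raw file data.
        return HEADER_SIZE + node
    # Folder block: header plus the blocks of everything it contains.
    return HEADER_SIZE + sum(block_size(child) for child in node.values())


print(block_size(tree["Sub4"]))  # 1180
print(block_size(tree))          # 2360
```

Note that an empty folder such as SubSub1 still costs 20 bytes, which is why Sub1 comes out at 160 rather than 140.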

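I think recursion and os.walk are the key to creating the blocks and computing their sizes, but I am struggling to get it working. The final file should begin with the "Start Folder" block.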

Thanks for your help.

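This is roughly my code. The problem is that the folder block sizes are wrong. I think it would be better to "push" the data into each folder block as part of the recursion (which I am not doing at the moment).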

I created an array (lifblocks[]) into which I push all the blocks, and at the end I write every block in the array to a file.

for dirpath, dirnames, filenames in os.walk(path):

    # ignore hidden files and folders (starting with a dot .)
    filenames = [f for f in filenames if not f[0] == '.']
    dirnames[:] = [d for d in dirnames if not d[0] == '.']

    dirsize = 0

    # Folder Block (Block Type 3)
    lifblocks.append(LIFBlock(blocktype=3, data=''))
    # 0xDEADC0DE - just a placeholder to verify later that the folder size gets set correctly
    lifblocks[i].setSize(3735929054)
    current_block_index = i  # save the index to adjust the size later
    i += 1

    for f in filenames:
        fp = os.path.join(dirpath, f)
        size = os.path.getsize(fp)
        size = size + 20  # add header size

        # File Block (Block Type 4)
        with open(fp, "rb") as fa:
            file_data_array = bytearray(fa.read())

        lifblocks.append(LIFBlock(blocktype=4, data=file_data_array))
        i += 1

        dirsize += size
        total_size += size

    dirsize = dirsize + 20  # add header size
    lifblocks[current_block_index].setSize(dirsize)

    print("\t", dirsize, dirpath)
print("{0} bytes".format(total_size))

.
.
.
for item in lifblocks:
    test_file.write(item.string())

test_file.close()
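
In the loop above, dirsize only sums the files directly inside dirpath, so a folder block never includes the blocks of its subfolders. One possible way around this, sketched here under the question's 20-byte header rule, is to walk bottom-up with topdown=False and remember each folder's size in a dict. The names `folder_block_sizes` and `sizes` are hypothetical and not part of the original code.

```python
import os

HEADER_SIZE = 20  # bytes per block header, as stated in the question


def folder_block_sizes(path):
    """Map every folder under `path` to its block size (header + data)."""
    sizes = {}
    # topdown=False yields subfolders before their parent, so the child
    # sizes are already known when the parent is reached.
    for dirpath, dirnames, filenames in os.walk(path, topdown=False):
        total = HEADER_SIZE
        for name in filenames:
            if name.startswith("."):
                continue  # skip hidden files
            total += HEADER_SIZE + os.path.getsize(os.path.join(dirpath, name))
        for name in dirnames:
            if name.startswith("."):
                continue  # skip hidden folders
            # Folders that os.walk did not descend into (for example
            # symlinked folders) are missing from `sizes` and add nothing.
            total += sizes.get(os.path.join(dirpath, name), 0)
        sizes[dirpath] = total
    return sizes
```

With the example tree, `folder_block_sizes(path)[path]` would be the size of the Start Folder block, and the other entries could be used to fill in the placeholder sizes of the folder blocks.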


1 Answer
User
#1 · Posted 2024-10-03 23:27:16

I think I figured it out. This seems to do what I want (using character counts rather than bytes):

import os

def walkDir(walk_dir):
    #print('walk_dir (absolute) = ' + os.path.abspath(walk_dir))
    fo_dict = {}  # maps each folder path to its finished block string

    # topdown=False visits subfolders before their parent, so every child
    # block is already in fo_dict when the parent is assembled
    for root, subdirs, files in os.walk(walk_dir, topdown=False):
        # ignore hidden files and folders (starting with a dot .)
        files = [f for f in files if not f[0] == '.']
        subdirs[:] = [d for d in subdirs if not d[0] == '.']

        print('\nroot = ' + root)

        files_content_str = ''
        for filename in files:
            file_path = os.path.join(root, filename)
            print('\t- file %s (full path: %s)' % (filename, file_path))

            with open(file_path, 'rb') as f:
                # latin-1 maps every byte to exactly one character, so the
                # character count equals the file size in bytes
                file_content = f.read().decode('latin-1')

            # FS: = file size (fixed to 3 digits) including 20 for the header, FN: = file name (fixed to 8 chars), FC: = file content | header size = 20
            # FS:041FN:abc.txt FC:the file content.....
            #  3  3  3    8    3 = 20
            # note: sizes above 999 need more than 3 digits, which makes the header longer than 20
            fi_content_str = 'FS:{0:03d}FN:{1:8.8}FC:{2}'.format(len(file_content) + 20, filename, file_content)  # content of a single file

            files_content_str = files_content_str + fi_content_str  # content of all files in the current folder

        current_dir = os.path.basename(os.path.normpath(root))  # name of the current dir

        if files_content_str != '':  # found at least one file
            print('\tFound files: ' + files_content_str)
        else:  # found a folder without files
            print('\tempty directory (without files)')

        subfolders_content_str = ''
        for subdir in subdirs:
            subfolders_content_str = subfolders_content_str + fo_dict[os.path.join(root, subdir)]
            print('\t- subdirectory ' + os.path.join(root, subdir) + '\t' + subfolders_content_str)

        # DS: = dir size, DN: = dir name, DC: = dir content (files and subdirs)
        fo_dict[root] = 'DS:{0:03d}DN:{1:8.8}DC:{2}'.format(len(files_content_str) + len(subfolders_content_str) + 20, current_dir, files_content_str + subfolders_content_str)
        #print('\nFOCS: ' + fo_dict[root] + '\n')

    # with topdown=False the start folder is visited last, so root is the start folder here
    print('\nResult:\n{0}'.format(fo_dict[root]))
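
The answer above counts characters. For a byte-accurate output file, a plain recursive function is one alternative to os.walk; the following is only a sketch. The header layout is an assumption, since the question only says that the header is 20 bytes long and stores the size: here it is a 4-byte big-endian size, a 1-byte block type (3 for folders and 4 for files, as in the question's code) and a 15-byte name field. `make_block` and `build_block` are hypothetical helper names.

```python
import os
import struct

# Assumed header layout (not given in the question): 4-byte big-endian size,
# 1-byte block type, 15-byte name field. Total: 20 bytes.
HEADER = struct.Struct(">IB15s")
FOLDER_BLOCK, FILE_BLOCK = 3, 4  # block types taken from the question's code


def make_block(block_type, name, data):
    """Prefix `data` with a header that stores the header + data size."""
    # The 15s field pads short names with zero bytes and truncates long ones.
    header = HEADER.pack(HEADER.size + len(data), block_type,
                         name.encode("utf-8"))
    return header + data


def build_block(path):
    """Recursively build the block for a file or folder as bytes."""
    name = os.path.basename(os.path.normpath(path))
    if os.path.isdir(path):
        # Skip hidden entries; sort for a reproducible block order.
        children = sorted(n for n in os.listdir(path) if not n.startswith("."))
        data = b"".join(build_block(os.path.join(path, n)) for n in children)
        return make_block(FOLDER_BLOCK, name, data)
    with open(path, "rb") as fh:
        return make_block(FILE_BLOCK, name, fh.read())
```

Because each folder block is assembled from the finished blocks of its children, its size is correct by construction. Writing the result of `build_block` for the start folder to a file opened in "wb" mode puts the Start Folder block at the beginning of that file. The 4-byte size field in this sketch limits a block to just under 4 GiB.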
