How do I decompress an MSZIP block?

#!/bin/env python
import os


def to_bin(in_item):
    # Render one byte as an 8-character binary string (MSB first).
    b = ord(chr(in_item))
    retval = bin(b).replace("0b", "")
    if len(retval) < 8:
        lx = 8 - len(retval)
        q = "0" * lx
        retval = q + retval
    return retval


def swapbytes(in_bit_str):
    # Swap the two 4-bit halves of the byte.
    p = in_bit_str[:4]
    q = in_bit_str[4:]
    return q + p


pth = os.path.join("comp", "test_0.dat")
fp = open(pth, 'rb')
fpd = fp.read(10)
fp.close()

dx = ""
for item in fpd:
    cx = to_bin(item)
    scx = swapbytes(cx)
    dx += scx
    print(cx)
    print("---> %s" % scx)

print("Final: %s" % dx)

bfinal = dx[0]
btype = dx[1:3]
print("Bfinal: %s" % bfinal)
print("Btype: %s" % btype)

# Slices sized to match the 5-bit HDIST and 4-bit HCLEN shown in the output.
rest = dx[3:]
hlit = rest[:5]
hdist = rest[5:10]
hclen = rest[10:14]
hlitd = int(hlit, 2)
hdistd = int(hdist, 2)
hclend = int(hclen, 2)
print("HLIT: %s [%d]" % (hlit, hlitd))
print("HDIST: %s [%d]" % (hdist, hdistd))
print("HCLEN: %s [%d]" % (hclen, hclend))

$ python tstdcmp.py
11101101
---> 11011110
10011101
---> 11011001
01111001
---> 10010111
01110000
---> 00000111
00011100
---> 11000001
11000111
---> 01111100
01110101
---> 01010111
10000111
---> 01111000
00000111
---> 01110000
11100000
---> 00001110
Final: 11011110110110011001011100000111110000010111110001010111011110000111000000001110
Bfinal: 1
Btype: 10
HLIT: 11110 [30]
HDIST: 11011 [27]
HCLEN: 0011 [3]

1 Answer

User

#1 · Posted 2024-10-01 07:39:47

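No, what you think you got from RFC 1951 as the "basic concept" is entirely wrong. First, the bits in each byte are read from least significant to most significant. Second, the bytes in the stream are not reversed. The first eight bits are read from the first byte, the next eight bits from the second byte, and so on. (The lengths in a stored block are stored little-endian, but that is beside the point: it is simply how the 16-bit lengths are serialized into the byte stream.)
Once you read the bits correctly, values such as HLIT are stored exactly as the RFC says. The five bits of HLIT, which follow the three header bits BFINAL and BTYPE, hold the number of literal/length codes minus 257. So you take the value of those five bits, which gives a number in 0..31, and add 257, giving a number in 257..288. The allowed range is actually 257..286, as stated on the same line of the RFC, so the last two possible five-bit values, 30 and 31, should not appear in a valid deflate stream.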
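
To make the bit order concrete, here is a minimal sketch that reads the block header from the first three bytes printed in the question (0xED 0x9D 0x79). The `BitReader` class and `read_dynamic_header` function are illustrative names, not part of any library, and the sketch assumes the block is a dynamic block (BTYPE 2).

```python
# First three bytes of the stream shown in the question's output.
DATA = bytes([0xED, 0x9D, 0x79])


class BitReader:
    """Hypothetical helper: bits LSB-first within a byte, bytes in stream order."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position in the stream

    def read(self, nbits):
        value = 0
        for i in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i  # the first bit read is the least significant
            self.pos += 1
        return value


def read_dynamic_header(data):
    br = BitReader(data)
    bfinal = br.read(1)
    btype = br.read(2)
    hlit = br.read(5) + 257   # number of literal/length codes
    hdist = br.read(5) + 1    # number of distance codes
    hclen = br.read(4) + 4    # number of code length codes
    if hlit > 286:
        raise ValueError("HLIT outside the allowed range 257..286")
    return bfinal, btype, hlit, hdist, hclen


print(read_dynamic_header(DATA))  # (1, 2, 286, 30, 16)
```

The three counts agree with the `count 286 30 16` line that infgen reports for the same stream.
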
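RFC 1951 is not confusing at all. It is a clear and complete description of the format. However, you need enough background in compression, and in Huffman codes in particular, to understand it. The RFC is not a textbook on compression, nor a textbook on how integers are encoded in bits.
It is clear that it will take you some time to work all of this out. Fortunately, you do not need to write your own inflator. You can use zlib instead. Read the documentation for all of the inflate functions in zlib.h.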
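
As a rough illustration of that advice, the sketch below uses Python's standard `zlib` module instead of the C API in zlib.h, since the question is written in Python. It assumes the input is a raw deflate stream with no zlib or gzip wrapper, which is what a negative `wbits` value selects. The `inflate_raw` name and the sample data are made up for the example.

```python
import zlib


def inflate_raw(deflate_bytes):
    """Inflate a raw deflate stream (no zlib or gzip header)."""
    d = zlib.decompressobj(wbits=-15)  # negative wbits: raw deflate, 32K window
    out = d.decompress(deflate_bytes) + d.flush()
    if not d.eof:
        raise ValueError("stream ended before the final deflate block")
    return out


# Sample input built here only so the sketch can be run on its own.
original = b"an example payload, an example payload, an example payload"
c = zlib.compressobj(level=9, wbits=-15)
raw = c.compress(original) + c.flush()

print(inflate_raw(raw) == original)
```
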
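In a CAB file, MSZIP CFDATA blocks use the history of the preceding CFDATA blocks until a folder boundary is reached. Even though each block is a properly terminated deflate stream, the next block can refer back to uncompressed data from the previous block. To process the CFDATA blocks after the first one, use zlib's inflateResetKeep() function to restart inflation while keeping the dictionary from the previous inflate operation.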

For reference, here is the decoding by infgen of the initial bytes of the deflate stream you provided:

! infgen 2.5 output
!
last            ! 1
dynamic         ! 10
count 286 30 16     ! 1100 11101 11101
code 16 4       ! 100
code 17 7       ! 111
code 0 4        ! 100 000
code 8 3        ! 011
code 7 4        ! 100
code 9 3        ! 011
code 6 4        ! 100
code 10 3       ! 011
code 5 4        ! 100
code 11 3       ! 011
code 4 5        ! 101
code 12 3       ! 011
code 3 7        ! 111
code 2 6        ! 110 000
lens 6          ! 1100
lens 8          ! 000
lens 8          ! 000
lens 9          ! 001
repeat 6        ! 11 1110
lens 9          ! 001
lens 8          ! 000
lens 10         ! 010
lens 9          ! 001
lens 9          ! 001
lens 9          ! 001
lens 8          ! 000
repeat 6        ! 11 1110
repeat 4        ! 01 1110
lens 9          ! 001
lens 9          ! 001
lens 10         ! 010
lens 10         ! 010
lens 10         ! 010
lens 8          ! 000
lens 9          ! 001
repeat 3        ! 00 1110
lens 10         ! 010
lens 9          ! 001
lens 10         ! 010
lens 10         ! 010
lens 9          ! 001
lens 10         ! 010
lens 9          ! 001
lens 10         ! 010
lens 9          ! 001
lens 8          ! 000
lens 9          ! 001
lens 8          ! 000
lens 8          ! 000
lens 7          ! 1101
lens 8          ! 000
lens 8          ! 000
lens 9          ! 001
lens 8          ! 000
lens 10         ! 010
lens 7          ! 1101
lens 9          ! 001
lens 8          ! 000
lens 12         ! 100
lens 9          ! 001
lens 10         ! 010
lens 10         ! 010
lens 10         ! 010
lens 8          ! 000
lens 7          ! 1101
repeat 4        ! 01 1110
lens 8          ! 000
lens 8          ! 000
lens 8          ! 000
lens 7          ! 1101
lens 9          ! 001
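
The `code` lines at the top of the listing above can be reproduced from the ten bytes printed in the question. This is a minimal sketch, assuming a dynamic block and using the code length order given in RFC 1951; `read_bits` and `code_length_code_lengths` are illustrative names.

```python
# The ten bytes printed by the question's script.
DATA = bytes([0xED, 0x9D, 0x79, 0x70, 0x1C, 0xC7, 0x75, 0x87, 0x07, 0xE0])

# Order in which code length code lengths are stored (RFC 1951, section 3.2.7).
ORDER = [16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15]


def read_bits(data, pos, nbits):
    """Read nbits starting at bit position pos, LSB-first within each byte."""
    value = 0
    for i in range(nbits):
        bit = (data[(pos + i) >> 3] >> ((pos + i) & 7)) & 1
        value |= bit << i
    return value, pos + nbits


def code_length_code_lengths(data):
    pos = 1 + 2 + 5 + 5  # skip BFINAL, BTYPE, HLIT, HDIST
    hclen, pos = read_bits(data, pos, 4)
    lengths = {}
    for symbol in ORDER[:hclen + 4]:
        lengths[symbol], pos = read_bits(data, pos, 3)
    return lengths


for symbol, length in code_length_code_lengths(DATA).items():
    if length:  # infgen lists only the nonzero lengths
        print("code %d %d" % (symbol, length))
```

The loop prints the same fourteen `code` lines as the listing, from `code 16 4` through `code 2 6`.
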
