How to read UTF-16BE encoded bytes with a length header

2 Answers

User

#1 · Edited 2024-06-26 10:52:19

This is how I currently do it, but somehow I keep hearing Raymond Hettinger saying "There must be a better way!"

import io
import functools
from typing import ByteString
from typing import Iterable

# Decoders
int_BE = functools.partial(int.from_bytes, byteorder="big")
utf16_BE = functools.partial(bytes.decode, encoding="utf_16_be")

encoded_strings = b"\x00\x05\x00H\x00e\x00l\x00l\x00o\x00\x06\x00W\x00o\x00r\x00l\x00d\x00!"
header_length = 2

def decode_strings(byte_string: ByteString) -> Iterable[str]:
    stream = io.BytesIO(byte_string)
    while True:
        length = int_BE(stream.read(header_length))
        if length:
            text = utf16_BE(stream.read(length * 2))
            yield text
        else:
            break
    stream.close()


if __name__ == "__main__":
    for text in decode_strings(encoded_strings):
        print(text)

Thanks for any suggestions.
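
For reference, the sample data implies this layout: each record is a 2-byte big-endian count of UTF-16 code units, followed by that many 2-byte code units in UTF-16BE. A minimal sketch of the inverse operation, assuming exactly that layout (`encode_strings` is a hypothetical helper, not part of the original code), is handy for producing test input:

```python
def encode_strings(texts):
    """Hypothetical helper: pack each string as a 2-byte big-endian count
    of UTF-16 code units followed by the UTF-16BE payload."""
    out = bytearray()
    for text in texts:
        payload = text.encode("utf_16_be")
        # Two bytes per code unit, so the header value is half the byte count.
        out += (len(payload) // 2).to_bytes(2, byteorder="big")
        out += payload
    return bytes(out)


sample = encode_strings(["Hello", "World!"])
print(sample == b"\x00\x05\x00H\x00e\x00l\x00l\x00o\x00\x06\x00W\x00o\x00r\x00l\x00d\x00!")
```

Strings longer than 65535 code units do not fit in the 2-byte header; `to_bytes` raises `OverflowError` in that case.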

User

#2 · Edited 2024-06-26 10:52:19

It's not a big improvement, but your code can be simplified a little:

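import io
from typing import ByteString, Iterator

def decode_strings(byte_string: ByteString) -> Iterator[str]:
    with io.BytesIO(byte_string) as stream:
        # Keep reading 2-byte headers until the stream is exhausted.
        while (s := stream.read(2)):
            length = int.from_bytes(s, byteorder="big")
            # The header counts UTF-16 code units, so read 2 bytes per unit.
            yield bytes.decode(stream.read(length * 2), encoding="utf_16_be")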
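
Another option is to skip the stream and walk the buffer by offset with `struct.unpack_from`. The following is only a sketch, assuming the header counts UTF-16 code units as in the question's sample data; `decode_strings_strict` is a made-up name, and the truncation checks are an addition that the original code does not have:

```python
import struct
from typing import Iterator


def decode_strings_strict(data: bytes) -> Iterator[str]:
    """Hypothetical variant: decode length-prefixed UTF-16BE strings,
    raising ValueError if the input ends in the middle of a record."""
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated length header")
        # ">H" is an unsigned 16-bit big-endian integer.
        (length,) = struct.unpack_from(">H", data, offset)
        offset += 2
        end = offset + length * 2  # two bytes per UTF-16 code unit
        if end > len(data):
            raise ValueError("truncated string payload")
        yield data[offset:end].decode("utf_16_be")
        offset = end


data = b"\x00\x05\x00H\x00e\x00l\x00l\x00o\x00\x06\x00W\x00o\x00r\x00l\x00d\x00!"
for text in decode_strings_strict(data):
    print(text)
```

Note that this sketch treats a zero length as an empty string and keeps going, whereas the code in the question stops at the first zero length.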
