Automatically choosing the best block size for the disk/system

Posted 2024-10-01 11:19:52


I have the following code:

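import hashlib

def GetSHA256(filename, size = 2 ** 10):
    h = hashlib.sha256()
    # Each read asks for size * h.block_size bytes (1024 * 64 = 65536 by default).
    with open(filename, 'rb') as f:
        # iter() with a sentinel stops once read() returns b"" at end of file.
        for byte_block in iter(lambda: f.read(size * h.block_size), b""):
            h.update(byte_block)
    return h.hexdigest()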

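I would like to choose an optimal block size. As far as I can tell, though, people tend to tune it by hand, for example here and here. Is there a better way to do this, or is there a library that already takes it into account?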


1 answer
User
#1 · Posted 2024-10-01 11:19:52

Have you looked into io.DEFAULT_BUFFER_SIZE?

According to the documentation for open():

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:

Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device’s “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.
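The two inputs to that heuristic can be inspected directly. This is a minimal sketch, not what open() does internally: it assumes os.stat() exposes st_blksize (it is absent on Windows, hence the getattr fallback), and the helper name preferred_block_size is hypothetical.

```python
import io
import os


def preferred_block_size(path):
    """Return the filesystem's preferred I/O block size for path, in bytes.

    Falls back to io.DEFAULT_BUFFER_SIZE when st_blksize is unavailable
    (for example on Windows) or is reported as 0.
    """
    blksize = getattr(os.stat(path), "st_blksize", 0)
    return blksize if blksize > 0 else io.DEFAULT_BUFFER_SIZE


# Compare the interpreter-wide fallback with what the current directory reports.
print("io.DEFAULT_BUFFER_SIZE:", io.DEFAULT_BUFFER_SIZE)
print("preferred block size here:", preferred_block_size(os.curdir))
```

The printed values depend on the Python version, operating system, and filesystem, so they are worth checking on the target machine rather than assuming.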

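The default is buffering=-1, so open() already picks a buffer size for you, and it will most likely read the file through a buffer of 4096 or 8192 bytes.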
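As for a library that has already thought about this: hashlib.file_digest(), added in Python 3.11, reads the file in chunks of its own choosing. The sketch below uses it when available and otherwise falls back to a manual loop with io.DEFAULT_BUFFER_SIZE as the chunk size. The function name sha256_of_file and the fallback chunk size are illustrative assumptions, not a measured optimum.

```python
import hashlib
import io


def sha256_of_file(filename, chunk_size=io.DEFAULT_BUFFER_SIZE):
    """Return the hex SHA-256 digest of the file at filename."""
    with open(filename, "rb") as f:
        # Python 3.11+: let hashlib choose its own internal read size.
        if hasattr(hashlib, "file_digest"):
            return hashlib.file_digest(f, "sha256").hexdigest()
        # Older versions: read fixed-size chunks until read() returns b"".
        h = hashlib.sha256()
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
        return h.hexdigest()
```

Usage is the same as for the function in the question, for example sha256_of_file("data.bin"). If hashing speed really matters, timing a few chunk sizes on the actual hardware is still the only reliable check.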
