Does pickle fail randomly on large files?

Posted 2024-09-21 05:44:50

Problem statement

I am using Python 3 and trying to pickle a dictionary of IntervalTrees that is roughly 2 to 3 GB in size. This is my console output:

10:39:25 - project: INFO - Checking if motifs file was generated by pickle...
10:39:25 - project: INFO -   - Motifs file does not seem to have been generated by pickle, proceeding to parse...
10:39:38 - project: INFO -   - Parse complete, constructing IntervalTrees...
11:04:05 - project: INFO -   - IntervalTree construction complete, saving pickle file for next time.
Traceback (most recent call last):
  File "/Users/alex/Documents/project/src/project.py", line 522, in dict_of_IntervalTree_from_motifs_file
    save_as_pickled_object(motifs, output_dir + 'motifs_IntervalTree_dictionary.pickle')
  File "/Users/alex/Documents/project/src/project.py", line 269, in save_as_pickled_object
    def save_as_pickled_object(object, filepath): return pickle.dump(object, open(filepath, "wb"))
OSError: [Errno 22] Invalid argument

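The line that performs the save (as shown in the traceback above) is:

def save_as_pickled_object(object, filepath): return pickle.dump(object, open(filepath, "wb"))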

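The error appears about 15 minutes after save_as_pickled_object is called (at around 11:20).

I tried this with a much smaller subsection of the motifs file and it ran fine with exactly the same code, so this must be a problem of scale. Is there a known bug in pickle in Python 3.6 related to the size of the object being pickled? Are there known problems with pickling large files in general? Is there a known way around this?

Thanks!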

Update: this question may be a duplicate of "Python 3 - Can pickle handle byte objects larger than 4GB?"

Solution

Here is the code I used.

import os
import pickle


def save_as_pickled_object(obj, filepath):
    """
    This is a defensive way to write pickle.dump, allowing for very large files on all platforms
    """
    max_bytes = 2**31 - 1
    bytes_out = pickle.dumps(obj)
    # len() gives the payload length; sys.getsizeof() would also count object overhead
    n_bytes = len(bytes_out)
    with open(filepath, 'wb') as f_out:
        # Write in chunks below 2**31 bytes so that no single write call is too large
        for idx in range(0, n_bytes, max_bytes):
            f_out.write(bytes_out[idx:idx+max_bytes])


def try_to_load_as_pickled_object_or_None(filepath):
    """
    This is a defensive way to write pickle.load, allowing for very large files on all platforms
    """
    max_bytes = 2**31 - 1
    try:
        input_size = os.path.getsize(filepath)
        bytes_in = bytearray(0)
        with open(filepath, 'rb') as f_in:
            # Read in chunks below 2**31 bytes and reassemble before unpickling
            for _ in range(0, input_size, max_bytes):
                bytes_in += f_in.read(max_bytes)
        obj = pickle.loads(bytes_in)
    except Exception:
        # Missing, unreadable, or non-pickle file: signal failure with None
        return None
    return obj
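
The chunking logic can be checked without creating a multi-gigabyte file by making the chunk size a parameter and running a round trip with a deliberately small value. The following is a minimal sketch of that idea, not the code from the post: the names `dump_in_chunks` and `load_in_chunks`, the `chunk_size` parameter, and the sample data are assumptions introduced for illustration.

```python
import os
import pickle
import tempfile


def dump_in_chunks(obj, filepath, chunk_size=2**31 - 1):
    """Pickle obj to filepath, passing at most chunk_size bytes to each write call."""
    # memoryview slices do not copy the underlying pickle bytes
    data = memoryview(pickle.dumps(obj))
    with open(filepath, "wb") as f_out:
        for start in range(0, len(data), chunk_size):
            f_out.write(data[start:start + chunk_size])


def load_in_chunks(filepath, chunk_size=2**31 - 1):
    """Read filepath in pieces of at most chunk_size bytes, then unpickle."""
    buffer = bytearray()
    with open(filepath, "rb") as f_in:
        while True:
            piece = f_in.read(chunk_size)
            if not piece:  # empty read means end of file
                break
            buffer += piece
    return pickle.loads(buffer)


# Round trip with a tiny chunk size (hypothetical value) so several writes occur.
sample = {"chr1": list(range(1000)), "chr2": ["motif"] * 50}
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.pickle")
    dump_in_chunks(sample, path, chunk_size=64)
    restored = load_in_chunks(path, chunk_size=64)
    file_size = os.path.getsize(path)
```

With the default `chunk_size` the behaviour matches the intent of the functions above. The whole pickle is still held in memory once, so this addresses the oversized write call, not memory use.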

1 answer

#1 · Posted 2024-09-21 05:44:50

Alex, if I'm not mistaken, this bug report describes your problem perfectly.

http://bugs.python.org/issue24658

As a workaround, I think you can use pickle.dumps instead of pickle.dump, and then write the result to the file in chunks smaller than 2**31 bytes.
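
A variation on this workaround is to do the chunking at the file-object level, so that `pickle.dump` and `pickle.load` can still be called directly. The sketch below assumes a hypothetical wrapper class named `ChunkedFile`; it is not part of `pickle` or the standard library, and the small `max_chunk` value and the in-memory `io.BytesIO` stream are used only to make the example quick to run.

```python
import io
import pickle


class ChunkedFile:
    """Hypothetical wrapper that splits large reads and writes into smaller calls."""

    def __init__(self, raw_file, max_chunk=2**31 - 1):
        self._file = raw_file
        self._max_chunk = max_chunk

    def write(self, data):
        # Forward the data in slices of at most max_chunk bytes
        view = memoryview(data)
        for start in range(0, len(view), self._max_chunk):
            self._file.write(view[start:start + self._max_chunk])
        return len(view)

    def read(self, size=-1):
        if size is None or size < 0:
            return self._file.read()
        # Collect the requested bytes using reads of at most max_chunk bytes
        buffer = bytearray()
        while len(buffer) < size:
            piece = self._file.read(min(self._max_chunk, size - len(buffer)))
            if not piece:  # end of file reached early
                break
            buffer += piece
        return bytes(buffer)

    def readline(self):
        # pickle.load requires readline() on the file object
        return self._file.readline()


# Illustrative round trip through an in-memory stream
sample = {"chr1": [(0, 10, "motifA"), (5, 20, "motifB")], "chr2": list(range(500))}
stream = io.BytesIO()
pickle.dump(sample, ChunkedFile(stream, max_chunk=16))
stream.seek(0)
restored = pickle.load(ChunkedFile(stream, max_chunk=16))
```

For a real file, the wrapper would be placed around the object returned by `open(filepath, "wb")` or `open(filepath, "rb")`. Whether it is needed at all depends on the platform and Python version affected by the bug report above.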
