Backward compatibility problem with pandas and pickle between 0.14.1 and 0.15.2

Posted 2024-10-02 12:22:49


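We use pandas DataFrames as the primary container for our time series data. We pack each DataFrame into a binary blob and store it in a mongoDB document, alongside keys that hold metadata about the time series blob.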

We ran into an error when we upgraded from pandas 0.14.1 to 0.15.2.

Creating the binary blob of a pandas DataFrame (0.14.1):

import lz4   
import cPickle

bd = lz4.compress(cPickle.dumps(df,cPickle.HIGHEST_PROTOCOL))

Error case: reading the blob back from mongoDB with pandas 0.15.2

[code listing not preserved in the source]

Success case: reading the blob back from mongoDB with pandas 0.14.1 works without error.

This looks similar to an older Stack Overflow thread, "Pandas compiled from source: default pickle behavior changed", which has a helpful comment from https://stackoverflow.com/users/644898/jeff:

"The error message you are seeing `TypeError: _reconstruct: First argument must be a sub-type of ndarray` is that the python default unpickler makes sure that the class hierarchy that was pickled is exactly the same what it is recreating. Since Series has changed between versions this is no longer possible with the default unpickler, (this IMHO is a bug in the way pickle works). In any event, pandas will unpickle pre-0.13 pickles that have Series objects."

Is there a workaround or a fix?

To reproduce the error:

Setup in a pandas 0.14.1 environment:

import cPickle
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(10,10))
cPickle.dump(df,open("cp0141.p","wb"))
cPickle.load(open('cp0141.p','rb')) # no error

Triggering the error in a pandas 0.15.2 environment:

cPickle.load(open('cp0141.p','rb'))
TypeError: ('_reconstruct: First argument must be a sub-type of ndarray', <built-in function _reconstruct>, (<class 'pandas.core.index.Int64Index'>, (0,), 'b'))

1 answer

Answer 1 · posted 2024-10-02 12:22:49

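This change was called out explicitly in the pandas 0.15.0 release notes: the Index class no longer subclasses ndarray and is now a pandas object. A pickle written by 0.14.1 therefore describes a class hierarchy that the default unpickler can no longer rebuild.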

You simply need to use pd.read_pickle to read the pickles, instead of calling cPickle.load directly.
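Since the blobs in the question live in memory rather than on disk, here is a minimal sketch of how the read path might be routed through pd.read_pickle. It is written for Python 3, and several things in it are assumptions rather than anything from the original post: the helper names pack_frame and unpack_frame are hypothetical, zlib from the standard library stands in for lz4, and the decompressed bytes are written to a temporary file so that pd.read_pickle can be given a plain path.

```python
import os
import pickle
import tempfile
import zlib

import numpy as np
import pandas as pd


def pack_frame(df):
    """Pickle and compress a DataFrame into a bytes blob.

    zlib is a stand-in for the lz4 call used in the question.
    """
    return zlib.compress(pickle.dumps(df, pickle.HIGHEST_PROTOCOL))


def unpack_frame(blob):
    """Decompress a blob and load it with pd.read_pickle.

    The bytes go through a temporary file so that read_pickle
    receives a path instead of the raw pickle.loads call.
    """
    raw = zlib.decompress(blob)
    fd, path = tempfile.mkstemp(suffix=".pkl")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(raw)
        return pd.read_pickle(path)
    finally:
        os.remove(path)


# Round trip within one environment; the blob would normally come from mongoDB.
df = pd.DataFrame(np.random.randn(10, 10))
restored = unpack_frame(pack_frame(df))
print(restored.equals(df))
```

Only the read side needs to change. Blobs already written with 0.14.1 stay as they are, and the compatibility handling happens inside pd.read_pickle when they are loaded.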
