How do I "save" an IsolationForest model in Python?

2 answers

User

Answer 1 · edited 2024-06-25 23:09:51

https://docs.python.org/2/library/pickle.html

Use the pickle library.

Fit your model.

Save it with pickle.dump(obj, file[, protocol]).

Load it with pickle.load(file).

Classify your outliers.
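
The four steps above can be sketched roughly as follows. This is a minimal illustration rather than the answerer's own code: the synthetic data, the file name `isolation_forest.pkl`, and the parameters `n_estimators=50` and `random_state=0` are assumptions made for the example. Only unpickle files you trust, since loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data (an assumption for illustration): one cluster for training,
# plus a few new points to classify afterwards.
rng = np.random.RandomState(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
X_new = np.array([[0.1, -0.2], [8.0, 8.0], [-9.0, 7.5]])

# 1. Fit the model.
model = IsolationForest(n_estimators=50, random_state=0)
model.fit(X_train)

# 2. Save it with pickle.dump (the file must be opened in binary mode).
model_path = os.path.join(tempfile.mkdtemp(), "isolation_forest.pkl")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# 3. Load it with pickle.load.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

# 4. Classify: predict returns 1 for inliers and -1 for outliers.
labels = restored.predict(X_new)
print(labels)
```

The restored model should give the same predictions as the original as long as it is loaded with the same sklearn version that saved it.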

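User

Answer 2 · edited 2024-06-25 23:09:51

sklearn estimators implement methods that make it easier to save the relevant trained attributes of an estimator. Some estimators implement the __getstate__ method themselves, but others, such as GMM, just use the base implementation, which simply saves the object's internal dictionary: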

def __getstate__(self):
    try:
        state = super(BaseEstimator, self).__getstate__()
    except AttributeError:
        state = self.__dict__.copy()

    if type(self).__module__.startswith('sklearn.'):
        return dict(state.items(), _sklearn_version=__version__)
    else:
        return state
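
To see what this pattern does in practice, here is a small stand-alone sketch that uses only the standard library. `ToyEstimator`, `TOY_VERSION`, `_toy_version`, and `pickled_version` are hypothetical names invented for this illustration; none of them are part of sklearn.

```python
import pickle

TOY_VERSION = "1.0"  # hypothetical library version, for illustration only


class ToyEstimator:
    """Stand-in for an estimator; not an sklearn class."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def fit(self, values):
        # "Training" just stores the mean as a learned attribute.
        self.mean_ = sum(values) / len(values)
        return self

    def __getstate__(self):
        # Save the instance dictionary plus the version that produced it.
        state = self.__dict__.copy()
        state["_toy_version"] = TOY_VERSION
        return state

    def __setstate__(self, state):
        # Pull the version tag out so it can be inspected, then restore attributes.
        self.pickled_version = state.pop("_toy_version", None)
        self.__dict__.update(state)


est = ToyEstimator(threshold=0.9).fit([1.0, 2.0, 3.0])
restored = pickle.loads(pickle.dumps(est))
print(restored.threshold, restored.mean_, restored.pickled_version)
```

Everything in the instance dictionary, including learned attributes such as `mean_`, travels with the pickle, and the recorded version lets the loading side notice a mismatch.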

The recommended way to save your model to disk is to use the pickle module.

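However, you should save additional data so that you can retrain your model in the future; otherwise you risk serious consequences, such as being locked into an old version of sklearn.

From the documentation: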

In order to rebuild a similar model with future versions of scikit-learn, additional metadata should be saved along the pickled model:
The training data, e.g. a reference to a immutable snapshot
The python source code used to generate the model
The versions of scikit-learn and its dependencies
The cross validation score obtained on the training data
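
One way to keep that metadata next to the pickled model is a small JSON sidecar file. The sketch below is only one possible layout, not something the documentation prescribes: `save_with_metadata`, `package_version`, the `.meta.json` suffix, the field names, and every value in the demo call are hypothetical.

```python
import json
import os
import pickle
import platform
import tempfile
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def save_with_metadata(model, model_path, training_data_ref, source_ref, cv_score):
    """Pickle `model` and write a JSON sidecar describing how it was built."""
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    info = {
        "training_data": training_data_ref,  # e.g. path or hash of an immutable snapshot
        "source_code": source_ref,           # e.g. commit id of the training script
        "cv_score": cv_score,
        "versions": {
            "python": platform.python_version(),
            "scikit-learn": package_version("scikit-learn"),
            "numpy": package_version("numpy"),
            "scipy": package_version("scipy"),
            "joblib": package_version("joblib"),
        },
    }
    meta_path = model_path + ".meta.json"
    with open(meta_path, "w", encoding="utf-8") as f:
        json.dump(info, f, indent=2)
    return meta_path


# Demo with a stand-in object; in practice `model` would be your fitted estimator.
demo_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
demo_meta = save_with_metadata({"stand_in": True}, demo_path,
                               "data/train-snapshot.csv", "train.py", 0.5)
print(demo_meta)
```

When a future sklearn release refuses to load the pickle, the sidecar tells you which versions to reinstall or which data and script to retrain from.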

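This matters most where a pickled model is tightly coupled to sklearn's internal implementation, because that coupling is not guaranteed to be stable between versions. It has seen backward-incompatible changes in the past.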

If your model becomes very large and loading it becomes cumbersome, you can also use the more efficient joblib. From the documentation:

In the specific case of the scikit, it may be more interesting to use joblib’s replacement of pickle (joblib.dump & joblib.load), which is more efficient on objects that carry large numpy arrays internally as is often the case for fitted scikit-learn estimators, but can only pickle to the disk and not to a string:
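
A minimal sketch of the joblib variant, assuming joblib is installed (it ships as a dependency of scikit-learn). The synthetic data, the file name `isolation_forest.joblib`, and the `compress=3` setting are illustrative assumptions.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic training data (an assumption for illustration).
rng = np.random.RandomState(42)
X = rng.normal(size=(500, 4))

model = IsolationForest(n_estimators=100, random_state=42).fit(X)

# joblib.dump writes to a file on disk; compress trades speed for file size.
model_path = os.path.join(tempfile.mkdtemp(), "isolation_forest.joblib")
joblib.dump(model, model_path, compress=3)

# joblib.load restores the fitted estimator from that file.
restored = joblib.load(model_path)

# Check that the restored model predicts the same labels as the original.
same = np.array_equal(model.predict(X), restored.predict(X))
print(same)
```

The same caveats as for pickle apply: load only files you trust, and load them with the sklearn version that wrote them.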
