How do I use a validation set with a MultiOutputRegressor wrapping XGBRegressor?

Posted 2024-09-27 04:20:52


I am using the following multi-output regressor:

from xgboost import XGBRegressor
from sklearn.multioutput import MultiOutputRegressor

#Define the estimator
estimator = XGBRegressor(
    objective = 'reg:squarederror'
    )

# Define the model
my_model = MultiOutputRegressor(estimator = estimator, n_jobs = -1).fit(X_train, y_train)

I would like to use a validation set to evaluate the XGBRegressor's performance, but I believe MultiOutputRegressor does not support passing eval_set to its fit function.

How can I use a validation set in this case? Is there a workaround for adapting XGBRegressor to produce multiple outputs?


1 answer
User
#1 · Posted 2024-09-27 04:20:52

You can try overriding the fit method of MultiOutputRegressor in a subclass, as follows:

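# Note: several of these are private scikit-learn helpers whose names and
# locations differ between releases; adjust the imports to your version.
from joblib import Parallel, delayed
from sklearn.base import is_classifier
from sklearn.multioutput import MultiOutputRegressor, _fit_estimator
from sklearn.utils.multiclass import check_classification_targets
from sklearn.utils.validation import _check_fit_params, has_fit_parameter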

class MyMultiOutputRegressor(MultiOutputRegressor):
    
    def fit(self, X, y, sample_weight=None, **fit_params):
        """ Fit the model to data.
        Fit a separate model for each output variable.
        Parameters
        ----------
        X : {array-like, sparse matrix} of shape (n_samples, n_features)
            Data.
        y : {array-like, sparse matrix} of shape (n_samples, n_outputs)
            Multi-output targets. An indicator matrix turns on multilabel
            estimation.
        sample_weight : array-like of shape (n_samples,), default=None
            Sample weights. If None, then samples are equally weighted.
            Only supported if the underlying regressor supports sample
            weights.
        **fit_params : dict of string -> object
            Parameters passed to the ``estimator.fit`` method of each step.
            .. versionadded:: 0.23
        Returns
        -------
        self : object
        """

        if not hasattr(self.estimator, "fit"):
            raise ValueError("The base estimator should implement"
                             " a fit method")

        X, y = self._validate_data(X, y,
                                   force_all_finite=False,
                                   multi_output=True, accept_sparse=True)

        if is_classifier(self):
            check_classification_targets(y)

        if y.ndim == 1:
            raise ValueError("y must have at least two dimensions for "
                             "multi-output regression but has only one.")

        if (sample_weight is not None and
                not has_fit_parameter(self.estimator, 'sample_weight')):
            raise ValueError("Underlying estimator does not support"
                             " sample weights.")

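        fit_params_validated = _check_fit_params(X, fit_params)
        # eval_set is required here and must hold exactly one (X_test, Y_test)
        # pair, with Y_test a 2-D array. It is unpacked so that estimator i
        # is validated against column i of Y_test only.
        [(X_test, Y_test)] = fit_params_validated.pop('eval_set')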
        self.estimators_ = Parallel(n_jobs=self.n_jobs)(
            delayed(_fit_estimator)(
                self.estimator, X, y[:, i], sample_weight,
                **fit_params_validated, eval_set=[(X_test, Y_test[:, i])])
            for i in range(y.shape[1]))
        return self

Then pass eval_set to the fit method:

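# Instantiate the subclass defined above with the same base estimator.
model = MyMultiOutputRegressor(estimator=estimator, n_jobs=-1)

fit_params = dict(
    eval_set=[(X_test, Y_test)],
    # Accepted by XGBRegressor.fit in older XGBoost releases; newer releases
    # expect early_stopping_rounds in the XGBRegressor constructor instead.
    early_stopping_rounds=10,
)
model.fit(X_train, y_train, **fit_params)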
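
If you would rather not depend on private scikit-learn helpers, the same idea (one estimator per target column, each validated against its own column of the validation targets) can be written as a plain loop. The following is only a minimal sketch: fit_per_output, predict_per_output and make_estimator are hypothetical names introduced for illustration, and it assumes each estimator's fit method accepts an eval_set keyword, as XGBRegressor.fit does. It runs sequentially rather than in parallel.

```python
import numpy as np


def fit_per_output(make_estimator, X, Y, X_val, Y_val, **fit_kwargs):
    """Fit one estimator per target column, each with its own validation column.

    make_estimator is a hypothetical zero-argument factory that returns a
    fresh, unfitted estimator whose fit() accepts an eval_set keyword.
    """
    Y = np.asarray(Y)
    Y_val = np.asarray(Y_val)
    if Y.ndim != 2 or Y_val.ndim != 2 or Y.shape[1] != Y_val.shape[1]:
        raise ValueError("Y and Y_val must be 2-D with the same number of columns")

    estimators = []
    for i in range(Y.shape[1]):
        est = make_estimator()
        # Pair column i of the training targets with column i of the
        # validation targets.
        est.fit(X, Y[:, i], eval_set=[(X_val, Y_val[:, i])], **fit_kwargs)
        estimators.append(est)
    return estimators


def predict_per_output(estimators, X):
    """Stack each estimator's 1-D prediction into (n_samples, n_outputs)."""
    return np.column_stack([est.predict(X) for est in estimators])
```

With XGBoost, make_estimator could be something like lambda: XGBRegressor(objective='reg:squarederror'), and any extra keyword arguments are forwarded unchanged to each fit call.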
