How do I specify the variance_weighted multioutput parameter for a multi-output regression problem?

Posted 2024-09-28 03:21:42


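I am building a regression model with RandomForestRegressor. The model has 30 input columns and 5 output columns, and I did a train/test split to measure its performance.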

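from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# X holds the 30 input columns and Y the 5 output columns (names assumed here)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y)

rfreg = RandomForestRegressor()
rfreg.fit(X_train, Y_train)
predict = rfreg.predict(X_test)
rfreg.score(X_test, Y_test)  # this call emits the FutureWarning shown below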

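However, a warning (not an error) appears. It tells me to specify the value manually through metrics.r2_score; otherwise the default used by score will become 'uniform_average'.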

C:\Users\X\anaconda3\lib\site-packages\sklearn\base.py:420: FutureWarning: The default value of multioutput (not exposed in score method) will change from 'variance_weighted' to 'uniform_average' in 0.23 to keep consistent with 'metrics.r2_score'. To specify the default value manually and avoid the warning, please either call 'metrics.r2_score' directly or make a custom scorer with 'metrics.make_scorer' (the built-in scorer 'r2' uses multioutput='uniform_average').
  "multioutput='uniform_average').", FutureWarning)

I looked at the documentation and source code of RandomForestRegressor.score on the scikit-learn website: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html#sklearn.ensemble.RandomForestRegressor.score

        from .metrics import r2_score
        from .metrics._regression import _check_reg_targets
        y_pred = self.predict(X)
        # XXX: Remove the check in 0.23
        y_type, _, _, _ = _check_reg_targets(y, y_pred, None)
        if y_type == 'continuous-multioutput':
            warnings.warn("The default value of multioutput (not exposed in "
                          "score method) will change from 'variance_weighted' "
                          "to 'uniform_average' in 0.23 to keep consistent "
                          "with 'metrics.r2_score'. To specify the default "
                          "value manually and avoid the warning, please "
                          "either call 'metrics.r2_score' directly or make a "
                          "custom scorer with 'metrics.make_scorer' (the "
                          "built-in scorer 'r2' uses "
                          "multioutput='uniform_average').", FutureWarning)
        return r2_score(y, y_pred, sample_weight=sample_weight,
                        multioutput='variance_weighted')

But I still don't know how to specify multioutput='variance_weighted' for rfreg.score(X_test, Y_test) in my case.
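As the warning text itself suggests, score does not expose multioutput, so one option is to call metrics.r2_score directly or to build a scorer with metrics.make_scorer. The following is a minimal sketch of both routes, not a definitive answer. The data is synthetic stand-in data with the same shape as in the question (30 inputs, 5 outputs), and the names Y_pred, r2_vw, r2_ua and vw_scorer are introduced here only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import make_scorer, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 30 input columns, 5 output columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
scales = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = X[:, :5] * scales + rng.normal(scale=0.5, size=(200, 5))

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

rfreg = RandomForestRegressor(n_estimators=20, random_state=0)
rfreg.fit(X_train, Y_train)
Y_pred = rfreg.predict(X_test)

# Route 1: call r2_score directly and choose the averaging explicitly.
r2_vw = r2_score(Y_test, Y_pred, multioutput="variance_weighted")
r2_ua = r2_score(Y_test, Y_pred, multioutput="uniform_average")

# Route 2: wrap r2_score in a scorer, e.g. for cross_val_score or GridSearchCV.
vw_scorer = make_scorer(r2_score, multioutput="variance_weighted")
r2_vw_from_scorer = vw_scorer(rfreg, X_test, Y_test)

print(r2_vw, r2_ua, r2_vw_from_scorer)
```

Both routes score the same fitted model; the scorer object is mainly useful when a scoring argument is expected, such as scoring=vw_scorer in cross-validation helpers.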

Also, if I change the value from multioutput='uniform_average' to multioutput='variance_weighted', will the model's performance improve? And how is the weight of each output column determined? Thanks.
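On the weighting question, a small sketch may help. In metrics.r2_score, the multioutput option only controls how the per-output R² values are averaged into one number: 'uniform_average' takes a plain mean, while 'variance_weighted' weights each output's R² by the variance of that output's true values. It does not change the fitted model or its predictions. The arrays below are synthetic and the names raw, weights, manual_vw and manual_ua are hypothetical, used only to reproduce the two averages by hand.

```python
import numpy as np
from sklearn.metrics import r2_score

# Synthetic targets with very different variances per column (assumption).
rng = np.random.default_rng(0)
y_true = rng.normal(size=(100, 3)) * np.array([1.0, 5.0, 10.0])
y_pred = y_true + rng.normal(size=(100, 3))  # same noise level on every column

# One R^2 value per output column.
raw = r2_score(y_true, y_pred, multioutput="raw_values")

# 'variance_weighted': weights are the variances of the true target columns.
weights = y_true.var(axis=0)
manual_vw = np.average(raw, weights=weights)

# 'uniform_average': every output column counts equally.
manual_ua = raw.mean()

print(raw, manual_vw, manual_ua)
```

Because the weights come from the target data itself, there is nothing to tune by hand; a high-variance output simply dominates the 'variance_weighted' score. To use custom weights instead, r2_score also accepts an array of per-output weights as the multioutput argument.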


