sklearn: using Pipeline and TransformedTargetRegressor to scale x (data) and y (target)

Published 2024-09-28 02:00:13


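I want to use Pipeline and TransformedTargetRegressor to handle all of the scaling (of both the data and the target). Is it possible to combine Pipeline and TransformedTargetRegressor? And how can I get the results from the TransformedTargetRegressor?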

$ cat test_ttr.py
#!/usr/bin/python
# -*- coding: UTF-8 -*-

from sklearn.datasets import make_regression
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn import linear_model
from sklearn.pipeline import Pipeline
from sklearn.compose import TransformedTargetRegressor

def main():
    x, y = make_regression()

    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

    model = linear_model.Ridge(alpha=1)

    pipe = Pipeline([('scale', preprocessing.StandardScaler()), ('model', model)])
    treg = TransformedTargetRegressor(regressor=pipe, transformer=preprocessing.MinMaxScaler())

    treg.fit(x_train, y_train)

    print(pipe.get_params()['model__alpha']) # OK !
    print(treg.get_params()['regressor__model__coef']) # KO ?!

if __name__ == '__main__':
    main()

However, I cannot get the fitted results (for example, the coefficients) from the TransformedTargetRegressor:

1
Traceback (most recent call last):
  File ".\test_ttr.py", line 26, in <module>
    main()
  File ".\test_ttr.py", line 23, in main
    print(treg.get_params()['regressor__model__coef']) # KO ?!
TypeError: 'TransformedTargetRegressor' object is not subscriptable

2 answers

The error occurs on this line:

print(treg.get_params()['regressor__model__coef']) # KO ?!

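because TransformedTargetRegressor has no parameter named 'regressor__model__coef'. get_params() only returns constructor parameters (hyperparameters); learned coefficients are stored as fitted attributes such as coef_, not as parameters.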
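To make the distinction concrete, here is a minimal sketch (the dataset size and random_state are assumptions chosen for illustration, not values from the question). It checks which names appear in get_params() and then reads the coefficients from the fitted clone, regressor_. Note that TransformedTargetRegressor fits a clone, so the original pipe object stays unfitted.

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Illustrative data; sizes and seed are assumptions.
x, y = make_regression(n_samples=100, n_features=5, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge(alpha=1))])
treg = TransformedTargetRegressor(regressor=pipe, transformer=MinMaxScaler())
treg.fit(x, y)

params = treg.get_params()
has_alpha = "regressor__model__alpha" in params  # hyperparameter: listed
has_coef = "regressor__model__coef" in params    # fitted attribute: not listed

# Fitted attributes live on the fitted clone, treg.regressor_.
fitted_model = treg.regressor_.named_steps["model"]
coef = fitted_model.coef_

# The pipeline passed to the constructor was cloned, so it is still unfitted.
original_is_fitted = hasattr(pipe.named_steps["model"], "coef_")

print(has_alpha, has_coef, coef.shape, original_is_fitted)
```
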

You can list all available parameters by calling treg.get_params(), which returns:

{'check_inverse': True,
 'func': None,
 'inverse_func': None,
 'regressor': Pipeline(memory=None,
          steps=[('scale',
                  StandardScaler(copy=True, with_mean=True, with_std=True)),
                 ('model',
                  Ridge(alpha=1, copy_X=True, fit_intercept=True, max_iter=None,
                        normalize=False, random_state=None, solver='auto',
                        tol=0.001))],
          verbose=False),
 'regressor__memory': None,
 'regressor__model': Ridge(alpha=1, copy_X=True, fit_intercept=True, max_iter=None, normalize=False,
       random_state=None, solver='auto', tol=0.001),
 'regressor__model__alpha': 1,
 'regressor__model__copy_X': True,
 'regressor__model__fit_intercept': True,
 'regressor__model__max_iter': None,
 'regressor__model__normalize': False,
 'regressor__model__random_state': None,
 'regressor__model__solver': 'auto',
 'regressor__model__tol': 0.001,
 'regressor__scale': StandardScaler(copy=True, with_mean=True, with_std=True),
 'regressor__scale__copy': True,
 'regressor__scale__with_mean': True,
 'regressor__scale__with_std': True,
 'regressor__steps': [('scale',
   StandardScaler(copy=True, with_mean=True, with_std=True)),
  ('model',
   Ridge(alpha=1, copy_X=True, fit_intercept=True, max_iter=None, normalize=False,
         random_state=None, solver='auto', tol=0.001))],
 'regressor__verbose': False,
 'transformer': MinMaxScaler(copy=True, feature_range=(0, 1)),
 'transformer__copy': True,
 'transformer__feature_range': (0, 1)}
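The double-underscore names in that dictionary are the same names accepted by set_params(), which is what makes the nested estimator tunable as a single object. A small sketch, assuming illustrative new values (alpha=10 and a (-1, 1) range are not from the original post):

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge(alpha=1))])
treg = TransformedTargetRegressor(regressor=pipe, transformer=MinMaxScaler())

# Nested names follow the pattern <attribute>__<step name>__<parameter>.
treg.set_params(regressor__model__alpha=10, transformer__feature_range=(-1, 1))

new_alpha = treg.get_params()["regressor__model__alpha"]
new_range = treg.get_params()["transformer__feature_range"]
print(new_alpha, new_range)
```

The same names can be used as keys of a parameter grid when the whole TransformedTargetRegressor is passed to a search such as GridSearchCV.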

You can get results such as the R2 score with

treg.score(x_test, y_test)

which returns

0.7506837388137267

To predict, you can use

treg.predict(x_test)

The documentation is very helpful and worth reading carefully.
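One point worth checking about score and predict is that they work in the original target units: predict applies the inverse of the target transformer to the inner pipeline's output. The sketch below verifies this by undoing the scaling by hand; the dataset parameters and random_state values are assumptions for illustration.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Illustrative data; sizes, noise and seeds are assumptions.
x, y = make_regression(n_samples=200, n_features=5, noise=1.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge(alpha=1))])
treg = TransformedTargetRegressor(regressor=pipe, transformer=MinMaxScaler())
treg.fit(x_train, y_train)

# Predictions from the wrapper are already in the original units of y.
y_pred = treg.predict(x_test)

# The inner fitted pipeline predicts in the scaled target space ...
scaled_pred = treg.regressor_.predict(x_test)
# ... and the fitted target transformer maps it back (it expects 2D input).
manual = treg.transformer_.inverse_transform(scaled_pred.reshape(-1, 1)).ravel()

same = np.allclose(y_pred, manual)
r2 = treg.score(x_test, y_test)  # R2 computed against the unscaled y_test
print(same)
```
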

The best solution I have found (I am not sure whether accessing the members directly is good practice):

$ cat test_ttr.py
#!/usr/bin/python
# -*- coding: UTF-8 -*-

from sklearn.datasets import make_regression
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn import linear_model
from sklearn.pipeline import Pipeline
from sklearn.compose import TransformedTargetRegressor

def main():
    x, y = make_regression()

    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

    model = linear_model.Ridge(alpha=1)

    pipe = Pipeline([('scale', preprocessing.StandardScaler()), ('model', model)])
    treg = TransformedTargetRegressor(regressor=pipe, transformer=preprocessing.MinMaxScaler())

    treg.fit(x_train, y_train)

    print(treg.regressor_['model'].coef_)
    print(treg.regressor_['model'].alpha)

if __name__ == '__main__':
    main()


$ python test_ttr.py
[-1.13077347e-02  4.44189754e-03  2.39262548e-03  1.72868998e-02
  9.98554629e-03  4.66877821e-02 -4.25349208e-03  1.94027088e-03
  5.64007062e-05  3.08491096e-03 -3.50818087e-05 -1.11165790e-02
 -6.67893402e-03 -3.01372675e-03  3.70455557e-03  5.05148384e-03
  9.39056280e-03  5.63774373e-03 -4.07545049e-03 -5.98363493e-03
 -8.21146459e-03  1.20560099e-02  5.79147139e-03 -3.87135045e-03
  3.62289162e-03 -5.32527728e-03  1.05227189e-02 -3.32636550e-03
  2.24062002e-02  5.36611024e-03  4.42517510e-03  2.98492436e-04
 -3.48722166e-03 -8.16323005e-03 -1.74921354e-03 -2.47793718e-03
  2.00056722e-02  9.02842425e-03 -4.22978758e-03  2.37737450e-03
 -7.93388529e-03  1.22910175e-02  1.34225568e-03 -3.51697078e-03
  4.20992326e-03  4.35675123e-03 -8.07619773e-04  1.13628592e-02
  4.12219590e-03  6.92190818e-03 -2.44482599e-03 -3.12429604e-03
 -5.43930166e-03  3.27253280e-02  4.11909724e-03  3.83302056e-03
  1.34754164e-02 -8.62591922e-04 -4.14770516e-03 -7.02794996e-03
 -2.04141679e-03 -8.93807591e-04 -1.50736158e-03  3.51801088e-03
 -1.26757035e-02 -8.46096567e-04  6.70465585e-02 -1.12191639e-02
  6.08120935e-03 -9.07017386e-03 -2.13280853e-03 -2.24764380e-03
  6.98012623e-03 -9.26042982e-03 -2.93708218e-03  5.74605237e-04
 -1.41308272e-03  5.24419314e-03  3.41054848e-02  7.80090716e-03
  7.33259527e-02 -4.78241365e-03  2.38806342e-04  3.84449219e-04
  5.49127586e-02 -6.91505707e-04 -4.14642042e-04  3.43961614e-03
  5.20966922e-04 -5.47828158e-03 -7.04740862e-04  4.68760531e-02
  4.12140344e-03 -5.16221700e-03 -7.35235898e-03  7.68674585e-03
 -4.39094201e-03  5.05034775e-03  5.75523532e-03 -6.17177294e-03]
1

Feel free to improve this answer if possible.
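One possible refinement, offered as a sketch rather than a definitive answer: the coefficients printed above are expressed in the scaled space (standardized x, min-max-scaled y). Assuming the same StandardScaler + Ridge + MinMaxScaler setup, they can be mapped back to the original units using the fitted scalers' attributes (mean_, scale_, min_). The variable names coef_orig and intercept_orig are introduced here for illustration, and the dataset parameters are assumptions.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Illustrative data; sizes, noise and seed are assumptions.
x, y = make_regression(n_samples=200, n_features=5, noise=1.0, random_state=0)

pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge(alpha=1))])
treg = TransformedTargetRegressor(regressor=pipe, transformer=MinMaxScaler())
treg.fit(x, y)

x_scaler = treg.regressor_.named_steps["scale"]  # z = (x - mean_) / scale_
model = treg.regressor_.named_steps["model"]     # y_scaled = z @ coef_ + intercept_
y_scaler = treg.transformer_                     # y_scaled = y * scale_ + min_

y_scale = y_scaler.scale_[0]
y_min = y_scaler.min_[0]

# Substitute both scalings into the linear model and solve for y.
coef_orig = model.coef_ / x_scaler.scale_ / y_scale
intercept_orig = (
    model.intercept_
    - y_min
    - np.sum(model.coef_ * x_scaler.mean_ / x_scaler.scale_)
) / y_scale

# The back-transformed linear model should reproduce treg.predict on raw x.
matches = np.allclose(x @ coef_orig + intercept_orig, treg.predict(x))
print(matches)
```

This algebra only holds for linear scalers and a linear model; with a different transformer or regressor, the back-transformation would need to be derived again.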
