Grid search cross-validation problem with a decision tree regressor

alphas = np.arange(0,2,0.1)
pipe_tree = Pipeline(steps=[('scaler', scaler), ('pca', pca), ('tree', tree)], memory = 'tmp')
treeCV = GridSearchCV(pipe_tree, dict( pca__n_components=n_components, tree__ccp_alpha=alphas ),
                      cv=5, scoring ='r2', n_jobs=-1)
start_time = time.time()
treeCV.fit(X_train, y_train)

ValueError: Invalid parameter ccp_alpha for estimator Pipeline(memory='tmp', steps=[('scaler', StandardScaler()), ('pca', PCA()), ('tree', MultiOutputRegressor(estimator=DecisionTreeRegressor(random_state=0)))]). Check the list of available parameters with `estimator.get_params().keys()`.

1 Answer

User

#1 · Posted 2024-05-17 03:19:11

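I'm not sure exactly what your tree object is, but judging by the error message it is a MultiOutputRegressor wrapped around your decision tree. Set up correctly, the grid search should work. First, we define the parameter values to search over: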

from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
import numpy as np

alphas = np.arange(0,2,0.1)
n_components = [3,4,5]

Then set up the pipeline steps:

scaler = StandardScaler()
pca = PCA()
from sklearn.multioutput import MultiOutputRegressor
tree = MultiOutputRegressor(DecisionTreeClassifier())

Toy data:

X_train = np.random.normal(0,1,(100,10))
y_train = np.random.binomial(1,0.5,(100,2))

The pipeline, followed by the parameters that tree exposes:

pipe_tree = Pipeline(steps=[('scaler', scaler), ('pca', pca), ('tree', tree)])
tree.get_params()

{'estimator__ccp_alpha': 0.0,
 'estimator__class_weight': None,
 'estimator__criterion': 'gini',
 'estimator__max_depth': None,
 'estimator__max_features': None,
 'estimator__max_leaf_nodes': None,
 'estimator__min_impurity_decrease': 0.0,
 'estimator__min_impurity_split': None,
 'estimator__min_samples_leaf': 1,
 'estimator__min_samples_split': 2,
 'estimator__min_weight_fraction_leaf': 0.0,
 'estimator__presort': 'deprecated',
 'estimator__random_state': None,
 'estimator__splitter': 'best',
 'estimator': DecisionTreeClassifier(),
 'n_jobs': None}

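The parameter name is estimator__ccp_alpha, because MultiOutputRegressor exposes the wrapped tree's parameters under the estimator__ prefix. Prefixing that with the pipeline step name tree gives tree__estimator__ccp_alpha, so passing tree__estimator__ccp_alpha=alphas works: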

treeCV = GridSearchCV(pipe_tree, dict( pca__n_components=n_components, tree__estimator__ccp_alpha=alphas ),
                      cv=5, scoring ='r2', n_jobs=-1)
treeCV.fit(X_train, y_train)
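
The full parameter name can also be looked up on the pipeline itself instead of being assembled by hand, which is what the error message hints at. A minimal sketch, assuming the same three steps as in the question (the names pipe and matching are only illustrative, and the tree here is the DecisionTreeRegressor shown in the error message):

```python
from sklearn.decomposition import PCA
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Same step layout as in the question; `pipe` is an illustrative name.
pipe = Pipeline(steps=[
    ('scaler', StandardScaler()),
    ('pca', PCA()),
    ('tree', MultiOutputRegressor(DecisionTreeRegressor(random_state=0))),
])

# Every key returned by get_params() is a valid name for a GridSearchCV grid,
# so filter the keys for the parameter we want to tune.
matching = [name for name in pipe.get_params() if name.endswith('ccp_alpha')]
print(matching)  # ['tree__estimator__ccp_alpha']
```

Each double underscore marks one level of nesting: the step name tree, then the estimator attribute of MultiOutputRegressor, then the parameter of the wrapped tree.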

If I use your version instead:

treeCV = GridSearchCV(pipe_tree, dict( pca__n_components=n_components, tree__ccp_alpha=alphas ),
                      cv=5, scoring ='r2', n_jobs=-1)

I get the same error.
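
For a version closer to the original question, the sketch below swaps in a DecisionTreeRegressor and continuous targets. The data, the variable names (X_demo, y_demo, pipe, search) and the small alpha grid are assumptions made for illustration; they are not taken from the question. It first checks that the short name is rejected by the pipeline, then runs the search with the fully qualified name.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumed): 100 samples, 10 features, 2 continuous targets.
rng = np.random.default_rng(0)
X_demo = rng.normal(0, 1, (100, 10))
y_demo = rng.normal(0, 1, (100, 2))

pipe = Pipeline(steps=[
    ('scaler', StandardScaler()),
    ('pca', PCA()),
    ('tree', MultiOutputRegressor(DecisionTreeRegressor(random_state=0))),
])

# The short name stops at MultiOutputRegressor, which has no ccp_alpha
# parameter of its own, so set_params raises ValueError.
try:
    pipe.set_params(tree__ccp_alpha=0.1)
    short_name_ok = True
except ValueError:
    short_name_ok = False

# The fully qualified name reaches the wrapped DecisionTreeRegressor.
grid = {
    'pca__n_components': [3, 4, 5],
    'tree__estimator__ccp_alpha': [0.0, 0.01, 0.1],  # illustrative values
}
search = GridSearchCV(pipe, grid, cv=5, scoring='r2')
search.fit(X_demo, y_demo)

print(short_name_ok)
print(search.best_params_)
```

Because the targets here are random noise, the best parameters and scores carry no meaning; the point is only that the search runs once the parameter name includes the estimator__ level.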
