Rankings and scores of recursive feature elimination (RFE) in scikit-learn

>>> from sklearn.datasets import make_friedman1
>>> from sklearn.feature_selection import RFECV
>>> from sklearn.svm import SVR
>>> X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
>>> estimator = SVR(kernel="linear")
>>> selector = RFECV(estimator, step=1, cv=5)
>>> selector = selector.fit(X, y)
>>> selector.support_
array([ True,  True,  True,  True,  True, False, False, False, False, False], dtype=bool)
>>> selector.ranking_
array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])

1 Answer

User
#1 · Posted 2024-09-28 21:39:42

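You are right that a lower ranking value indicates a better feature, and that a higher cross-validation score in the grid_scores_ attribute is better. However, you have misunderstood what the values in grid_scores_ mean. From the RFECV documentation:

    grid_scores_ : array of shape [n_subsets_of_features]
        The cross-validation scores such that grid_scores_[i] corresponds to the CV score of the i-th subset of features.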
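So the values in grid_scores_ do not correspond to individual features. They are cross-validation scores (not error measures) for subsets of features, one value per subset size. In this example the subset with 5 features is the most informative one, because the 5th value in grid_scores_ (the CV score of the SVR model fitted on the 5 top-ranked features) is the largest.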
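You should also note that, because no scoring metric was specified explicitly, the scorer used is SVR's default, which is R^2, not accuracy (accuracy is only meaningful for classifiers).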

[Figure from the documentation]
