Why can't I access XGBClassifier's feature importances from a subclass?


I've been banging my head against this quirky behavior of XGBClassifier, which is supposed to behave just like RandomForestClassifier:

import xgboost as xgb 
from sklearn.ensemble import RandomForestClassifier

class my_rf(RandomForestClassifier):
    def important_features(self, X):
        return super(RandomForestClassifier, self).feature_importances_         

class my_xgb(xgb.XGBClassifier):
    def important_features(self, X):
        return super(xgb.XGBClassifier, self).feature_importances_          

c1 = my_rf()
c1.fit(X,y)
c1.important_features(X) #works

while this code fails :(

c2 = my_xgb()
c2.fit(X,y)
c2.important_features(X) #fails

I've stared at both code snippets and they look identical! What am I missing?? Apologies if this is a noob question; the mysteries of Python OOP are beyond me.


Edit:

If I use vanilla xgb without subclassing, everything works fine:

import xgboost as xgb
print "version:", xgb.__version__
c = xgb.XGBClassifier()
c.fit(X_train.as_matrix(), y_train.label)
print c.feature_importances_[:5]            

version: 0.4
[ 0.4039548   0.05932203  0.06779661  0.00847458  0.        ]

2 Answers

Your output shows you are running version 0.4. The repository tree of the last stable 0.4x release (published Jan 15, 2016) shows that the sklearn.py wrapper file did not yet have feature_importances_. That property was actually introduced in this commit from Feb 8, 2016.
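As a sanity check before relying on the property, you could inspect the installed class itself (a minimal sketch of my own, not from the original answer; it only assumes feature_importances_ is defined as a property on the class):

import xgboost as xgb

# feature_importances_ is defined as a property on the class, so we can look
# for it along the MRO without having to fit a model first.
has_importances = any("feature_importances_" in vars(klass)
                      for klass in xgb.XGBClassifier.__mro__)
print "xgboost", xgb.__version__, "exposes feature_importances_:", has_importances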

I cloned the current GitHub repository, built and installed xgboost from source, and the code works fine:

from sklearn import datasets
from sklearn.ensemble.forest import RandomForestClassifier
import xgboost as xgb
print "version:", xgb.__version__

class my_rf(RandomForestClassifier):
    def important_features(self, X):
        return super(RandomForestClassifier, self).feature_importances_ 

class my_xgb(xgb.XGBClassifier):
    def important_features(self, X):
        return super(xgb.XGBClassifier, self).feature_importances_

iris = datasets.load_iris()
X = iris.data
y = iris.target

c1 = my_rf()
c1.fit(X,y)
print c1.important_features(X)

c2 = my_xgb()
c2.fit(X,y)
print c2.important_features(X)

c3 = xgb.XGBClassifier()
c3.fit(X, y)
print c3.feature_importances_

Output:

version: 0.6
[ 0.0307026   0.01456868  0.45198349  0.50274523]
[ 0.17701453  0.11228534  0.41479525  0.29590487]
[ 0.17701453  0.11228534  0.41479525  0.29590487]

Edit:

If you are using XGBRegressor, make sure you cloned the repository after Dec 1, 2016, because according to this commit, that is when feature_importances_ was moved up into the base XGBModel, so that XGBRegressor can access it as well.

Adding this to the code above:

class my_xgb_regressor(xgb.XGBRegressor):
    def important_features(self, X):
        return super(xgb.XGBRegressor, self).feature_importances_

c4 = my_xgb_regressor()
c4.fit(X, y)
print c4.important_features(X)

Output:

version: 0.6
[ 0.0307026   0.01456868  0.45198349  0.50274523]
[ 0.17701453  0.11228534  0.41479525  0.29590487]
[ 0.17701453  0.11228534  0.41479525  0.29590487]
[ 0.25        0.17518248  0.34489051  0.229927  ]

As far as I know, feature_importances_ is not implemented in XGBoost. You can roll your own with something like permutation feature importance:

import random
import numpy as np
from sklearn.cross_validation import cross_val_score

def feature_importances(clf, X, y):
    # Baseline cross-validated score with the original features
    score = np.mean(cross_val_score(clf, X, y, scoring='roc_auc'))
    importances = {}
    for i in range(X.shape[1]):
        # Shuffle column i and measure how much the score drops
        X_perm = X.copy()
        X_perm[:,i] = random.sample(X[:,i].tolist(), X.shape[0])
        perm_score = np.mean(cross_val_score(clf, X_perm, y, scoring='roc_auc'))
        importances[i] = score - perm_score

    return importances
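For example, calling it with an XGBoost classifier (a hypothetical usage of my own; make_classification and the binary target are assumptions, chosen because the scorer hard-coded above is roc_auc):

import xgboost as xgb
from sklearn.datasets import make_classification

# Binary problem, since feature_importances above scores with roc_auc.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = xgb.XGBClassifier()
importances = feature_importances(clf, X, y)
for i, drop in sorted(importances.items()):
    print "feature %d: score drop %.4f" % (i, drop)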
