FeatureUnion in scikit-learn and incompatible row dimensions

2 answers

User

Reply 1 · edited 2024-09-30 01:19:59

OK, let me explain a bit more. When I said do_something, I meant that nothing is done with X. If I rewrite the code like this:

import numpy as np

def transform(self, X, **transform_params):
    print(X.shape)   # print the shape of X first, before doing anything
    print(type(X))   # for information
    # do_nothing_withX: placeholder for the code that builds list_feat,
    # a (number of samples, 30 new features) matrix, without using X
    x_new_feat = np.array(list_feat)  # new feature matrix as a NumPy array
    print(x_new_feat.shape)
    return x_new_feat

In the transform example above, I do not concatenate the X matrix with the new matrix. I assume FeatureUnion does that. My result is:

[output not preserved in the source]

Going further, if I run cross-validation with GridSearchCV, which only changes the sample size:

grid_search = GridSearchCV(pipeline, parameters, cv=2, n_jobs = 1, verbose = 20)

I get this result:

486 486
Fitting 2 folds for each of 3456 candidates, totalling 6912 fits
[CV] ......
(242, 3000) # This is a new sample size due to cross-validation
<class 'scipy.sparse.csr.csr_matrix'>
(486, 30)
..........
ValueError: blocks[0,:] has incompatible row dimensions

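Of course, I can post all of my code if necessary. What I don't understand is why the sample size coming out of the CountVectorizer + tf-idf pipeline is not equal to the number of files loaded by the sklearn.datasets.load_files() function.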
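The ValueError in the output above can be reproduced outside the pipeline. The following is a minimal sketch, assuming that FeatureUnion stacks its blocks with scipy.sparse.hstack when one of them is sparse. The shapes are taken from the output above (a cross-validation fold with 242 rows and a new feature matrix with 486 rows); the helper try_hstack and the all-zero matrices are hypothetical stand-ins for the real data.

```python
import numpy as np
from scipy import sparse

# Stand-ins with the shapes printed above: the sparse fold passed to
# transform() and the dense matrix built for all 486 documents.
X_fold = sparse.csr_matrix((242, 3000))
x_new_feat = np.zeros((486, 30))


def try_hstack(blocks):
    """Hypothetical helper: return (shape, None) on success, (None, message) on failure."""
    try:
        return sparse.hstack(blocks).tocsr().shape, None
    except ValueError as exc:
        return None, str(exc)


# Row counts differ (242 vs 486), so horizontal stacking fails.
shape, error = try_hstack([X_fold, x_new_feat])
print(shape, error)

# With one row per sample of the fold, the same call succeeds.
shape_ok, error_ok = try_hstack([X_fold, x_new_feat[: X_fold.shape[0]]])
print(shape_ok, error_ok)
```

In this sketch the mismatch comes from the second block having more rows than the X of the current fold, which suggests building the new features from the X passed to transform rather than from the full dataset.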

User

Reply 2 · edited 2024-09-30 01:19:59

You have probably solved this by now, but others may run into the same problem:

(323, 3000) # X shape Matrix
<class 'scipy.sparse.csr.csr_matrix'>

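AddNed tries to concatenate a matrix with a sparse matrix; the sparse matrix should be converted to a dense matrix first. I ran into the same error when trying to use the output of CountVectorizer.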
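Whichever representation is used, every block handed to FeatureUnion needs one row per sample of the X it receives. Below is a minimal sketch of that idea, assuming a hypothetical transformer DocLengthFeatures (it is not the AddNed class from the question) and a few made-up documents. It builds its features from the X passed to transform, so a smaller fold yields a correspondingly smaller block.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import FeatureUnion


class DocLengthFeatures(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: returns one dense row per document in X."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Features are computed from the X that was passed in, so the
        # number of rows always equals the number of documents in X.
        return np.array([[len(doc), len(doc.split())] for doc in X], dtype=float)


# Made-up documents, for illustration only.
docs = [
    "a short text",
    "another slightly longer text",
    "third document",
    "the last one here",
]

union = FeatureUnion([
    ("counts", CountVectorizer()),      # sparse output
    ("lengths", DocLengthFeatures()),   # dense output
])

full = union.fit_transform(docs)   # all four documents
fold = union.transform(docs[:2])   # a smaller "fold" of two documents
print(full.shape, fold.shape)
```

In this sketch the sparse CountVectorizer output and the dense array are combined without a manual conversion, because both blocks have the same number of rows. If a dense result is wanted, calling toarray() on the stacked result is one option.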
