sklearn LinearSVC: X has 1 features per sample; expecting 5

Posted 2024-05-17 06:59:18


I am trying to predict the class of a test array, but I get the following error and stack trace:

Traceback (most recent call last):
  File "/home/radu/PycharmProjects/Recommender/Temporary/classify_dict_test.py", line 24, in <module>
    print classifier.predict(test)
  File "/home/radu/.local/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 215, in predict
    scores = self.decision_function(X)
  File "/home/radu/.local/lib/python2.7/site-packages/sklearn/linear_model/base.py", line 196, in decision_function
    % (X.shape[1], n_features))
ValueError: X has 1 features per sample; expecting 5

The code that produces this is:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

corpus = [
    "I am super good with Java and JEE",
    "I am super good with .NET and C#",
    "I am really good with Python and R",
    "I am really good with C++ and pointers"
    ]

classes = ["java developer", ".net developer", "data scientist", "C++ developer"]

test = ["I think I'm a good developer with really good understanding of .NET"]

tvect = TfidfVectorizer(min_df=1, max_df=1)

X = tvect.fit_transform(corpus)

classifier = LinearSVC()
classifier.fit(X, classes)

print classifier.predict(test)

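I looked through the LinearSVC documentation for guidance or hints about what could raise this error, but I could not work out the cause.

Any help is much appreciated!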


1 answer
Anonymous user
#1 · Posted 2024-05-17 06:59:18

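The variable test holds a raw string, but the SVC needs a feature vector with the same number of dimensions as X (the 5 features named in the error message). Before feeding the test string to the SVC, you must transform it into a feature vector using the same vectorizer instance: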

# Reuse the fitted vectorizer: transform() keeps the vocabulary learned from the corpus
X_test = tvect.transform(test)
print(classifier.predict(X_test))
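
To see the fix end to end, here is a minimal sketch of the question's script with the transform step added. It assumes Python 3 and a recent scikit-learn (the question itself ran on Python 2.7). The names X_test, predicted, and pipeline are illustrative, and the pipeline variant at the end is an optional alternative that is not part of the original answer.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

corpus = [
    "I am super good with Java and JEE",
    "I am super good with .NET and C#",
    "I am really good with Python and R",
    "I am really good with C++ and pointers",
]
classes = ["java developer", ".net developer", "data scientist", "C++ developer"]
test = ["I think I'm a good developer with really good understanding of .NET"]

# Same vectorizer settings as in the question.
tvect = TfidfVectorizer(min_df=1, max_df=1)

# fit_transform() learns the vocabulary from the corpus and builds the training matrix.
X = tvect.fit_transform(corpus)

classifier = LinearSVC()
classifier.fit(X, classes)

# transform() reuses the learned vocabulary, so X_test has the same number of columns as X.
X_test = tvect.transform(test)
print(X.shape, X_test.shape)

predicted = classifier.predict(X_test)
print(predicted)

# Optional alternative (an assumption, not from the answer): a pipeline applies the
# vectorizer automatically, so raw strings can be passed to fit() and predict().
pipeline = make_pipeline(TfidfVectorizer(min_df=1, max_df=1), LinearSVC())
pipeline.fit(corpus, classes)
print(pipeline.predict(test))
```

Note that calling fit_transform() on the test data instead of transform() would learn a new vocabulary from the test sentence alone, so the resulting matrix would generally not line up with the columns the classifier was trained on.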
