Is there a way to apply an sklearn model trained on sentence-level data to make predictions on longer documents?

Posted 2024-10-03 11:19:35

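I'm trying to figure out whether I can train sentiment-analysis models on annotated data in which each data point is a single sentence, and then use those models on much longer documents. When I try this, I get the following errors. With the decision tree model: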

ValueError: Number of features of the model must match the input. Model n_features is 30 and input n_features is 75000

And with the support vector machine:

ValueError: X.shape[1] = 75000 should be equal to 30, the number of features at training time

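Note: 30 is the length of every labeled sentence used for training, and 75000 is the length of the longer document I am trying to predict on.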

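Here is the prediction code, just in case, although it is quite standard, so I doubt it reveals much. I can also paste the training code, but I don't think it is relevant to my question: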

y_test_pred_10k = model_dt.predict(X_test10k)
svm_predictions_test_10k = SVM.predict(X_test10k)

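I don't know of any way to make this work, if there is one, other than padding the sentences to length 75000, which doesn't seem like a good idea. Are there any other options?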
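
For illustration, here is a minimal sketch of one option that avoids padding. It is not the setup described above: it assumes the features can be swapped for a bag-of-words representation (TfidfVectorizer) instead of per-token sequences, so that the feature count depends on the training vocabulary and not on the text length. The toy sentences, labels, and variable names other than model_dt are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: each training example is one labeled sentence.
train_sentences = [
    "I love this movie",
    "What a wonderful day",
    "This is terrible",
    "I hate the ending",
]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

# Assumption: bag-of-words features. The vocabulary is learned from the
# training sentences only, so it fixes the number of feature columns.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_sentences)

model_dt = DecisionTreeClassifier(random_state=0)
model_dt.fit(X_train, train_labels)

# Option 1: transform the whole long document with the SAME fitted
# vectorizer. Its length does not change the number of columns.
long_document = " ".join(["I love this wonderful movie"] * 500)
X_doc = vectorizer.transform([long_document])
doc_prediction = model_dt.predict(X_doc)

# Option 2: split the document into sentences (naive split on "." here)
# and predict each sentence separately, matching the training granularity.
text = "I love this movie. The ending is terrible. What a wonderful day."
doc_sentences = [s.strip() for s in text.split(".") if s.strip()]
sentence_predictions = model_dt.predict(vectorizer.transform(doc_sentences))

print(X_train.shape[1], X_doc.shape[1])  # both equal the vocabulary size
```

The key point is calling transform (not fit_transform) on the new text, so the long document is projected onto the training vocabulary. Whether a sentence-level model generalizes well to whole documents is a separate question; per-sentence predictions can be aggregated (for example by majority vote) if document-level labels are needed.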

0 answers

No answers yet