Java: removing the last class attribute's values from the test ARFF file of a Weka ML model does not work when predicting
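Basically, I am building a machine learning model in Java (Weka) to detect some patterns in strings. I have two class attributes that I am trying to get my model to predict based on these patterns. My code works when I leave the attribute value in the ARFF file, but it does not work when I take the value out and replace it with a question mark in the test file. When I do that, every line of the output shows the same value (cfb). I know the model is not hard-coded, but for testing purposes I want to remove these attribute values. I have already built the classifier and evaluated the model.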
/**
 * Make predictions based on that model. Improve the model
 *
 * @throws Exception
 */
public void modelPredictions(Instances trainedDataSet, Instances testedDataSet, Classifier classifierType) throws Exception {
    // Get the number of classes
    int numClasses = trainedDataSet.numClasses();
    // print out class values in the training dataset
    for (int i = 0; i < numClasses; i++) {
        // get class string value using the class index
        String classValue = trainedDataSet.classAttribute().value(i);
        System.out.println("Class Value " + i + " is " + classValue);
    }
    // set class index to the last attribute
    // loop through the new dataset and make predictions
    System.out.println("===================");
    System.out.println("Actual Class, NB Predicted");
    for (int i = 0; i < testedDataSet.numInstances(); i++) {
        // get class double value for current instance
        double actualClass = testedDataSet.instance(i).classValue();
        // get class string value using the class index using the class's int value
        String actual = testedDataSet.classAttribute().value((int) actualClass);
        // get Instance object of current instance
        Instance newInst = testedDataSet.instance(i);
        // call classifyInstance, which returns a double value for the class
        double predNB = classifierType.classifyInstance(newInst);
        // use this value to get string value of the predicted class
        String predString = testedDataSet.classAttribute().value((int) predNB);
        System.out.println(actual + ", " + predString);
    }
}
[Image of the test ARFF file. The asker could not paste the file contents as text because of errors.]
# Answer 1
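If you replace the actual classes in the test set with question marks, they are interpreted as missing values. Weka represents a missing value as Double.NaN. Casting a missing value (i.e. Double.NaN) to int yields 0, which is the index of the first nominal value of the class. As a result, the "actual" class your loop prints will always be the first class label, regardless of what the classifier predicts.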
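The cast behaviour described above can be checked without Weka. The following is a minimal sketch in plain Java, not the asker's code: the class name MissingClassLabelDemo, the helper labelFor, and the second label "other" are hypothetical, and a plain list of labels stands in for Weka's class attribute. Only the label "cfb" comes from the question.

```java
import java.util.List;

public class MissingClassLabelDemo {

    // Hypothetical helper: returns "?" for a missing class value (NaN)
    // instead of casting it to an index, which would silently give 0.
    static String labelFor(double classValue, List<String> labels) {
        if (Double.isNaN(classValue)) {
            return "?";
        }
        return labels.get((int) classValue);
    }

    public static void main(String[] args) {
        // "cfb" is taken from the question; "other" is an assumed second label.
        List<String> labels = List.of("cfb", "other");

        // A missing class value, represented the way the answer describes.
        double missing = Double.NaN;

        // Java's narrowing conversion maps NaN to 0 ...
        System.out.println((int) missing);             // prints 0
        // ... so an unchecked lookup always lands on the first label.
        System.out.println(labels.get((int) missing)); // prints cfb

        // Checking for NaN first keeps missing values visible.
        System.out.println(labelFor(missing, labels)); // prints ?
        System.out.println(labelFor(1.0, labels));     // prints other
    }
}
```

In the Weka loop from the question, the equivalent check would be to call classIsMissing() on the test instance before converting classValue() to a label, and to print a placeholder such as "?" for the actual class in that case. The prediction from classifyInstance is unaffected by this change.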