Probabilistic predictions with liblinear (Java) when using the classifier directly in code
Consider the following usage of liblinear (http://liblinear.bwaldvogel.de/):
double C = 1.0; // cost of constraints violation
double eps = 0.01; // stopping criteria
Parameter param = new Parameter(SolverType.L2R_L2LOSS_SVC, C, eps);
Problem problem = new Problem();
double[] GROUPS_ARRAY = {1, 0, 0, 0};
problem.y = GROUPS_ARRAY;
int NUM_OF_TS_EXAMPLES = 4;
problem.l = NUM_OF_TS_EXAMPLES;
problem.n = 2;
FeatureNode[] instance1 = { new FeatureNode(1, 1), new FeatureNode(2, 1) };
FeatureNode[] instance2 = { new FeatureNode(1, -1), new FeatureNode(2, 1) };
FeatureNode[] instance3 = { new FeatureNode(1, -1), new FeatureNode(2, -1) };
FeatureNode[] instance4 = { new FeatureNode(1, 1), new FeatureNode(2, -1) };
FeatureNode[] instance5 = { new FeatureNode(1, 1), new FeatureNode(2, -0.1) };
FeatureNode[] instance6 = { new FeatureNode(1, -0.1), new FeatureNode(2, 1) };
FeatureNode[] instance7 = { new FeatureNode(1, -0.1), new FeatureNode(2, -0.1) };
FeatureNode[][] testSetWithUnknown = {
instance5,
instance6,
instance7
};
FeatureNode[][] trainingSetWithUnknown = {
instance1,
instance2,
instance3,
instance4
};
problem.x = trainingSetWithUnknown;
Model m = Linear.train(problem, param);
for( int i = 0; i < trainingSetWithUnknown.length; i++)
System.out.println(" Train.instance = " + i + " => " + Linear.predict(m, trainingSetWithUnknown[i]) );
System.out.println("---------------------");
for( int i = 0; i < testSetWithUnknown.length; i++)
System.out.println(" Test.instance = " + i + " => " + Linear.predict(m, testSetWithUnknown[i]) );
Here is the output:
iter 1 act 1.778e+00 pre 1.778e+00 delta 6.285e-01 f 4.000e+00 |g| 5.657e+00 CG 1
Train.instance = 0 => 1.0
Train.instance = 1 => 0.0
Train.instance = 2 => 0.0
Train.instance = 3 => 0.0
---------------------
Test.instance = 0 => 1.0
Test.instance = 1 => 1.0
Test.instance = 2 => 0.0
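What I need is probabilistic predictions rather than integer (hard) predictions. The command line has a -b option, but I cannot find any way to use that functionality directly from code. Also, looking at the source (https://github.com/bwaldvogel/liblinear-java/blob/master/src/main/java/de/bwaldvogel/liblinear/Predict.java), there apparently is no option for probabilistic prediction when the library is called directly from code. Is that correct?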
UPDATE: I ended up using the liblinear code from https://github.com/bwaldvogel/liblinear-java. In the file, I changed
private static boolean flag_predict_probability = true;
to
private static boolean flag_predict_probability = false;
and used
SolverType.L2R_LR
but I still get integer classes. Any ideas?
# Answer 1
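To get probabilities, you need to change the code. The prediction is made in the function
public static double predictValues(Model model, Feature[] x, double[] dec_values) {
inside the Linear.java file, and it is this function that needs to be changed.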
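Note that the output is still not a probability; it is just a linear combination of the weights and the feature values. If you pass it through a softmax function, it becomes a probability in [0, 1].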
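As a rough illustration of that last step, here is a minimal sketch of turning raw decision values into probabilities. It uses only the Java standard library; the class name DecisionToProbability and its two methods are hypothetical helpers for this example and are not part of liblinear. It assumes you already have the decision values (for instance the contents of dec_values) in a double array.

```java
import java.util.Arrays;

public class DecisionToProbability {

    /** Logistic sigmoid: maps a single decision value to a number in (0, 1). */
    public static double sigmoid(double decisionValue) {
        return 1.0 / (1.0 + Math.exp(-decisionValue));
    }

    /** Softmax over per-class decision values; the results sum to 1. */
    public static double[] softmax(double[] decisionValues) {
        // Subtract the maximum before exponentiating to avoid overflow.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : decisionValues) {
            max = Math.max(max, v);
        }
        double[] out = new double[decisionValues.length];
        double sum = 0.0;
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.exp(decisionValues[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        // Two-class case: one decision value, squashed by the sigmoid.
        System.out.println(sigmoid(0.0)); // prints 0.5
        // Multi-class case: one decision value per class (made-up numbers).
        System.out.println(Arrays.toString(softmax(new double[] {2.0, 0.5, -1.0})));
    }
}
```

With only two classes there is a single decision value, and the sigmoid of that value is the same as a two-class softmax against a fixed score of 0, so either form can be used there.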
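Also, make sure you select logistic regression as the solver, for example the SolverType.L2R_LR mentioned in the question's update.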