sklearn log_loss: different number of classes

print pred.shape
print np.unique(pred)
print np.unique(pred).size
(19191L,)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37]
38

print true.shape
print np.unique(true)
print np.unique(true).size
(19191L,)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37]
38

3 answers

User

#1 · edited 2024-05-17 05:41:31

From the log_loss documentation:

y_pred : array-like of float, shape = (n_samples, n_classes) or (n_samples,)
Predicted probabilities, as returned by a classifier’s predict_proba method. If y_pred.shape = (n_samples,) the probabilities provided are assumed to be that of the positive class. The labels in y_pred are assumed to be ordered alphabetically, as done by preprocessing.LabelBinarizer.

You need to pass probabilities, not predicted labels.
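A minimal sketch of the difference, using hypothetical toy data (6 samples, 3 classes) rather than the asker's 38-class arrays. It assumes a reasonably recent scikit-learn; the exact error text varies between versions, so the sketch only relies on a ValueError being raised.

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical toy data, not the asker's arrays.
y_true = np.array([0, 1, 2, 0, 1, 2])

# Hard label predictions, like the asker's `pred`: shape (n_samples,).
y_labels = np.array([0, 1, 2, 0, 2, 2])

try:
    log_loss(y_true, y_labels)
    labels_accepted = True
except ValueError as err:
    labels_accepted = False
    print("log_loss rejected the labels:", err)

# Probabilities: shape (n_samples, n_classes), one column per class.
y_proba = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.1, 0.8],
])
loss = log_loss(y_true, y_proba)
print("log loss on probabilities:", loss)
```

The 1-D label array is read as positive-class probabilities for a two-class problem, which cannot be reconciled with three classes in y_true.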

User

#2 · edited 2024-05-17 05:41:31

It's simple: you are passing predicted labels rather than predicted probabilities. Your pred variable contains

[ 1 2 1 3 .... ] #Classes : 1, 2 or 3

But to use log_loss it should contain something like this:

 [[ 0.1, 0.8, 0.1] [ 0.0, 0.79 , 0.21] .... ] #each element is an array with probability of each class

To get these probabilities, use the predict_proba function:

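from sklearn.metrics import log_loss
pred = model.predict_proba(x_test)  # shape (n_samples, n_classes)
loss = log_loss(y_true, pred)  # y_true holds the true labels for x_test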
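For a runnable end-to-end version, here is a sketch under stated assumptions: the synthetic 3-class dataset and the LogisticRegression model are stand-ins chosen for illustration, not what the asker used. Passing labels=model.classes_ is optional, but it keeps the column order explicit and helps when the test labels happen to miss a class.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 300 samples, 3 classes (an assumption for this sketch).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Any classifier that exposes predict_proba would do here.
model = LogisticRegression(max_iter=1000).fit(x_train, y_train)

# One row per sample, one column per class, ordered like model.classes_.
pred = model.predict_proba(x_test)
print(pred.shape)

loss = log_loss(y_test, pred, labels=model.classes_)
print("log loss:", loss)
```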

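User

#3 · edited 2024-05-17 05:41:31

Inside log_loss, the true array is fitted and transformed by a LabelBinarizer, which changes its dimensions. So checking that true and pred have matching shapes does not mean log_loss will work, because the shape of true changes internally. If you only have binary classes, I suggest using this log_loss cost function; otherwise, with multiple classes, this approach does not work.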
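To see the shape change on its own, here is a small sketch that runs LabelBinarizer directly on hypothetical toy labels. That log_loss uses a LabelBinarizer internally is taken from the answer above; the sketch only shows what the binarizer does to a 1-D label array.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Hypothetical toy labels, not the asker's data.
true_multi = np.array([0, 1, 2, 1, 0])    # three classes
true_binary = np.array([0, 1, 1, 0, 1])   # two classes

multi = LabelBinarizer().fit_transform(true_multi)
binary = LabelBinarizer().fit_transform(true_binary)

print(true_multi.shape, "->", multi.shape)    # (5,) -> (5, 3)
print(true_binary.shape, "->", binary.shape)  # (5,) -> (5, 1)
```

With more than two classes the 1-D labels become one indicator column per class, so a 1-D pred of the same length no longer lines up with them.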
