Scikit F-score scorer

[Input]
# Imports the snippet relies on; generate_datasets is the asker's own helper.
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import f1_score

X_training, y_training, X_test, y_test = generate_datasets(df_X, df_y, 0.6)
logistic = LogisticRegressionCV(
    Cs=50,
    cv=4,
    penalty='l2',
    fit_intercept=True,
    scoring='f1'
)
logistic.fit(X_training, y_training)
print('Predicted: %s' % str(logistic.predict(X_test)))
print('F1-score: %f' % f1_score(y_test, logistic.predict(X_test)))
print('Accuracy score: %f' % logistic.score(X_test, y_test))

[Output]
>> Predicted: [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0]
>> Actual: [0 0 0 1 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 0 1 1]
>> F1-score: 0.285714
>> Accuracy score: 0.782609
>> C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py:958: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 due to no predicted samples.

2 answers

User

#1 · edited 2024-10-01 13:29:48

This appears to be a known bug that has already been fixed (here); I think you should try updating sklearn.

User

#2 · edited 2024-10-01 13:29:48

However, can anybody explain the meaning of the "UndefinedMetricWarning" warning that I am seeing? What is actually happening behind the curtains?

This is described well in https://stackoverflow.com/a/34758800/1587329:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/classification.py
F1 = 2 * (precision * recall) / (precision + recall)
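Precision = TP / (TP + FP). As you just said, if the predictor does not predict the positive class at all, precision is 0 (strictly, TP + FP = 0, so it is undefined and is treated as 0).
Recall = TP / (TP + FN). If the predictor does not predict the positive class, TP is 0, so recall is 0.
So now you are dividing 0 by 0.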
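The arithmetic above can be checked by hand with a small plain-Python sketch. The helper name precision_recall_f1 is hypothetical and is not scikit-learn's implementation; the 0.0 fallback for a zero denominator is an assumption modeled on the warning text ("being set to 0.0"). The label lists are copied from the question's output.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1) for the positive class (label 1).

    Hypothetical helper for illustration; not scikit-learn's code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # A zero denominator means the metric is undefined (0/0).
    # Assumption: fall back to 0.0, as the warning message describes.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Labels copied from the question's output.
actual = [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(actual, predicted)
print('F1-score: %f' % f1)  # matches the 0.285714 reported in the question

# A model that never predicts the positive class hits the 0/0 case.
all_negative = [0] * len(actual)
print(precision_recall_f1(actual, all_negative))
```

With the final test-set predictions there is one true positive, so the F1-score is well defined. The warning most likely comes from a case like the second call, where some set of predictions (for example inside a cross-validation fold) contains no positive predictions at all.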

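To address the class imbalance (the classifier can easily get away with (almost) always predicting the more prevalent class), you can use class_weight="balanced":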

logistic = LogisticRegressionCV(
    Cs=50,
    cv=4,
    penalty='l2', 
    fit_intercept=True,
    scoring='f1',
    class_weight="balanced"
)

The documentation says:

The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y)).
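To see what that formula yields, here is a minimal plain-Python sketch. The helper name balanced_class_weights is hypothetical, and collections.Counter stands in for np.bincount so the snippet needs no third-party packages. The labels are the test labels printed in the question, used only for illustration; in practice the weights are derived from the y passed to fit.

```python
from collections import Counter


def balanced_class_weights(y):
    """Compute n_samples / (n_classes * count_per_class) for each label.

    Hypothetical helper mirroring the quoted formula; Counter plays the
    role of np.bincount here.
    """
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in sorted(counts.items())
    }


# Labels from the question's output: 17 negatives, 6 positives.
y = [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]

weights = balanced_class_weights(y)
for label, weight in weights.items():
    print('class %d: weight %.4f' % (label, weight))
```

The rarer positive class ends up with the larger weight (23 / 12 versus 23 / 34 for the negatives), so misclassifying a positive sample costs more during fitting.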
