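I have spent two days reading about calibration methods, but I still do not really understand them. There are two types of calibration:
Plotting, for each bin of predictions, the fraction of positives against the mean predicted value (p).
Isotonic regression: mathematically, it fits a weighted least-squares problem via quadratic programming, subject to the constraint that each fitted value is non-decreasing relative to the previous one.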
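To make the isotonic description concrete, here is a minimal sketch using scikit-learn's `IsotonicRegression`. The scores and labels are made-up toy values, and the `y_min`, `y_max` and `out_of_bounds` settings are my own choices for keeping the output in [0, 1], not something taken from a reference implementation.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Toy data (invented for illustration): raw classifier scores, sorted
# in increasing order, and the observed 0/1 labels for each score.
scores = np.array([0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.85, 0.95])
labels = np.array([0, 0, 1, 0, 0, 1, 0, 1, 1, 1])

# Fit a non-decreasing step function that minimizes squared error to the labels.
# y_min / y_max keep the fitted values inside the probability range.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
calibrated = iso.fit_transform(scores, labels)

# Each fitted value is >= the previous one; labels that violate the ordering
# are pooled and replaced by their average.
print(np.round(calibrated, 3))
```

The fitted values never decrease as the score increases, which is the constraint described above.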
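I wrote a Python module based on logistic regression (I know that LogisticRegression
returns well-calibrated predictions by default because it directly optimizes log loss; I built this to check my understanding):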
import numpy as np
from sklearn import linear_model
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss


class logistic_Calibration:
    def __init__(self, data, response):
        self.data = data
        self.response = response

    def Calibration(self):
        # Hold out 20% of the data to evaluate both models on unseen samples.
        Xtrain, Xtest, ytrain, ytest = train_test_split(
            self.data, self.response, test_size=0.20, random_state=36)
        ytrain = np.ravel(ytrain)  # the estimators expect a 1-D target

        # Plain logistic regression, no extra calibration step.
        logreg = linear_model.LogisticRegression()
        logreg.fit(Xtrain, ytrain)
        PredWO_calibration = logreg.predict_proba(Xtest)
        lossWO_calibration = log_loss(ytest, PredWO_calibration)

        # Same estimator wrapped in sigmoid calibration with 5-fold CV.
        clf_sigmoid = CalibratedClassifierCV(logreg, cv=5, method='sigmoid')
        clf_sigmoid.fit(Xtrain, ytrain)
        PredWITH_calibration = clf_sigmoid.predict_proba(Xtest)
        lossWITH_calibration = log_loss(ytest, PredWITH_calibration)

        # A positive difference means calibration lowered the log loss.
        Loss_difference_WO_minus_W = lossWO_calibration - lossWITH_calibration
        return [lossWO_calibration, lossWITH_calibration, Loss_difference_WO_minus_W]
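
For anyone who wants to reproduce this kind of comparison without my data, here is a self-contained sketch under some assumptions: the dataset is synthetic (`make_classification`), the `max_iter=1000` and `n_bins=10` settings are arbitrary choices, and the isotonic variant is added only for comparison with the sigmoid one. It also computes the points of a reliability curve (fraction of positives against mean predicted value) with `calibration_curve`.

```python
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (an assumption, not the original data).
X, y = make_classification(n_samples=2000, n_features=10, random_state=36)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=36)

models = {
    "uncalibrated": LogisticRegression(max_iter=1000),
    "sigmoid": CalibratedClassifierCV(
        LogisticRegression(max_iter=1000), cv=5, method="sigmoid"),
    "isotonic": CalibratedClassifierCV(
        LogisticRegression(max_iter=1000), cv=5, method="isotonic"),
}

losses = {}
curves = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive class on the held-out set.
    prob_pos = model.predict_proba(X_test)[:, 1]
    losses[name] = log_loss(y_test, prob_pos)
    # Reliability curve: per bin, observed fraction of positives
    # versus the mean predicted probability.
    frac_pos, mean_pred = calibration_curve(y_test, prob_pos, n_bins=10)
    curves[name] = (mean_pred, frac_pos)
    print(f"{name:>12}: log loss = {losses[name]:.4f}")
```

Plotting each `mean_pred` against `frac_pos` and comparing it with the diagonal gives a visual check: the closer the points are to the diagonal, the better calibrated the model is on the held-out data.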
However, I am still unclear on the following points.
Please advise.