How do I implement a custom multiclass objective function in CatBoost?

Posted 2024-09-27 09:23:39

I am trying to implement a custom objective function for multiclass classification, following the guide at: https://catboost.ai/docs/concepts/python-usages-examples.html#custom-objective-function

I would like to know what format is expected for the gradient and the Hessian matrix returned by calc_ders_multi().

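To understand how it works, I tried to reproduce the MultiClass loss function. When I return both the gradient and the Hessian matrix, however, the program starts and then hangs without raising any error. If I leave the Hessian undefined (return der1, [] instead of return der1, der2), the program runs without problems.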
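For reference, these are the derivatives being reproduced, written out under one assumption: that the per-object objective is the softmax log-likelihood of the true class t, treated as a quantity to maximize (this is the sign convention the code in this question follows). The symbols a_i (raw approx for class i), p_i (softmax probability) and \ell are notation introduced here, not CatBoost names.

```latex
\begin{align*}
p_i &= \frac{e^{a_i}}{\sum_{k} e^{a_k}}, \qquad \ell(a) = \log p_t \\
\frac{\partial \ell}{\partial a_i} &= \mathbb{1}[i = t] - p_i \\
\frac{\partial^2 \ell}{\partial a_i \, \partial a_j} &= p_i \, p_j - \mathbb{1}[i = j] \, p_i
\end{align*}
```

On the diagonal the second derivative reduces to -p_i (1 - p_i), and off the diagonal to p_i p_j, so the Hessian is symmetric and each of its rows sums to zero.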

import numpy as np
from catboost import Pool, CatBoostClassifier


class MultiClassObjective(object):
    def calc_ders_multi(self, approxes, targets, weights):
        # approxes, targets, weights are indexed containers of floats
        # (containers with only __len__ and __getitem__ defined).
        # weights parameter can be None.
        n_classes = len(approxes)
        exponents = np.exp(approxes)
        softmax = exponents / np.sum(exponents)
        targets = np.array([1 if x == targets else 0 for x in range(n_classes)])

        der1 = targets - softmax
        der2 = np.zeros((n_classes, n_classes))
        for x in range(n_classes):
            for y in range(n_classes):
                if x == y:
                    der2[x, y] = -softmax[x] * (1 - softmax[y])
                else:
                    der2[x, y] = softmax[x] * softmax[y]

        if weights is not None:
            der1 *= weights
            der2 *= weights

        return der1, der2
        # return der1, []

train_data = [[1, 4, 5, 6],
              [4, 5, 6, 7],
              [11, 20, 30, 30],
              [30, 40, 50, 60],
              [100, 300, 200, 400]]

train_labels = [0, 0, 1, 1, 2]

train_data = Pool(data=train_data, label=train_labels)

# Initialize CatBoostClassifier with custom `loss_function`
model = CatBoostClassifier(loss_function=MultiClassObjective(),
                           eval_metric="MultiClass",
                           iterations=100,
                           random_seed=0,
                           verbose=True
                          )
# Fit model
model.fit(train_data)
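To rule out a mistake in the math itself, the derivatives can be checked outside CatBoost. The following is a minimal sketch, not a confirmed fix for the hang: it computes the same gradient and Hessian for a single object using only the standard library, returns them as a plain list and a list of lists, and compares the gradient against finite differences. The names softmax_ders, log_likelihood and numeric_grad are hypothetical helpers, and the choice of plain Python lists as the return type is an assumption, not something this question establishes about what CatBoost requires.

```python
import math


def softmax_ders(approxes, target, weight=None):
    """Gradient and Hessian of log(softmax(approxes)[target]) for one object.

    Assumption: approxes holds one raw score per class, target is the class
    index of this object, and weight is a single float or None.
    """
    n_classes = len(approxes)
    # Read values through __getitem__ only, then shift by the maximum so that
    # exp() cannot overflow for large raw scores.
    raw = [approxes[i] for i in range(n_classes)]
    shift = max(raw)
    exps = [math.exp(a - shift) for a in raw]
    total = sum(exps)
    p = [e / total for e in exps]
    w = 1.0 if weight is None else weight

    # First derivatives: indicator(i == target) - p_i
    grad = [w * ((1.0 if i == target else 0.0) - p[i]) for i in range(n_classes)]
    # Second derivatives: p_i * p_j - indicator(i == j) * p_i
    hess = [
        [w * (p[i] * p[j] - (p[i] if i == j else 0.0)) for j in range(n_classes)]
        for i in range(n_classes)
    ]
    return grad, hess


def log_likelihood(approxes, target):
    """log(softmax(approxes)[target]), computed with a stable log-sum-exp."""
    shift = max(approxes)
    log_sum = shift + math.log(sum(math.exp(a - shift) for a in approxes))
    return approxes[target] - log_sum


def numeric_grad(approxes, target, eps=1e-6):
    """Central finite-difference estimate of the gradient of log_likelihood."""
    out = []
    for i in range(len(approxes)):
        up = list(approxes)
        down = list(approxes)
        up[i] += eps
        down[i] -= eps
        diff = log_likelihood(up, target) - log_likelihood(down, target)
        out.append(diff / (2 * eps))
    return out


scores = [0.5, -1.0, 2.0]
grad, hess = softmax_ders(scores, target=1)
num = numeric_grad(scores, target=1)
max_grad_error = max(abs(g - n) for g, n in zip(grad, num))
print("largest gradient mismatch:", max_grad_error)
```

If the mismatch is tiny, the formulas are consistent, and the remaining question is only about the container types and shapes that calc_ders_multi is expected to return.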
