Sklearn SVC with the MNIST dataset: is the digit 5 consistently predicted wrong?

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

data = datasets.load_digits()
images = data.images
targets = data.target

# Split into train and test sets
images_train, images_test, imlabels_train, imlabels_test = train_test_split(images, targets, test_size=.2, shuffle=False)

# Re-shape data so that it's 2D
images_train = np.reshape(images_train, (np.shape(images_train)[0], 64))
images_test = np.reshape(images_test, (np.shape(images_test)[0], 64))

svm_classifier = SVC(gamma='auto').fit(images_train, imlabels_train)

number_correct_svc = 0
preds = []
for label_index in range(len(imlabels_test)):
    pred = svm_classifier.predict(images_test[label_index].reshape(1,-1))
    if pred[0] == imlabels_test[label_index]:
        number_correct_svc += 1
    preds.append(pred[0])

print("Support Vector Classifier...")
print(f"\tPercent correct for all test data: {100*number_correct_svc/len(imlabels_test)}%")
confusion_matrix(preds,imlabels_test)

array([[22,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 15,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0, 15,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0, 21,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0, 21,  0,  0,  0,  0,  0],
       [13, 21, 20, 16, 16, 37, 23, 20, 31, 16],
       [ 0,  0,  0,  0,  0,  0, 14,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0, 16,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  2,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0, 21]], dtype=int64)

Update:

I tried SVC(gamma='scale') and the results look much more reasonable. It would still be good to know why 'auto' doesn't work. With 'scale':

array([[34,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 36,  0,  0,  0,  0,  0,  0,  1,  0],
       [ 0,  0, 35,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0, 27,  0,  0,  0,  0,  0,  1],
       [ 1,  0,  0,  0, 34,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  2,  0, 37,  0,  0,  0,  1],
       [ 0,  0,  0,  0,  0,  0, 37,  0,  0,  0],
       [ 0,  0,  0,  2,  0,  0,  0, 35,  0,  1],
       [ 0,  0,  0,  6,  1,  0,  0,  1, 31,  1],
       [ 0,  0,  0,  0,  2,  0,  0,  0,  1, 33]], dtype=int64)

1 Answer

User

#1 · Posted 2024-09-30 01:21:28

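The second question is easier to address. In the RBF kernel, gamma controls how "wiggly" the decision boundary is. What do I mean by "wiggly"? The higher gamma is, the shorter the reach of each training point's influence, so the decision boundary follows individual training points more closely. See "Decision boundary of SVMs".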
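To make the effect of gamma concrete, here is a minimal sketch of the RBF kernel, K(x, x') = exp(-gamma * ||x - x'||^2), evaluated at one fixed squared distance. The helper name `rbf_kernel_value` and the distance of 500 are assumptions picked purely for illustration; they do not come from the question.

```python
import math


def rbf_kernel_value(squared_distance: float, gamma: float) -> float:
    """RBF kernel K(x, x') = exp(-gamma * ||x - x'||^2) for a given squared distance."""
    return math.exp(-gamma * squared_distance)


# Hypothetical squared distance between two samples, chosen only for illustration.
squared_distance = 500.0

# The same pair of points looks less and less similar as gamma grows.
for gamma in (0.0005, 0.005, 0.05):
    similarity = rbf_kernel_value(squared_distance, gamma)
    print(f"gamma={gamma}: K = {similarity:.3g}")
```

When the kernel value between a test point and every support vector is close to zero, the support vectors contribute almost nothing to the decision function for that point.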

if gamma='scale' (default) is passed then it uses 1 / (n_features * X.var()) as value of gamma,
if 'auto', uses 1 / n_features.
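The two formulas quoted above can be evaluated directly on the data from the question. This is only a sketch: it uses the full digits dataset rather than the training split, so the 'scale' value will differ slightly from what SVC computes internally during fit.

```python
import numpy as np
from sklearn import datasets

# Same data as in the question: 8x8 digit images flattened to 64 features.
digits = datasets.load_digits()
X = digits.images.reshape(len(digits.images), -1)

n_features = X.shape[1]
gamma_auto = 1.0 / n_features                # formula for gamma='auto'
gamma_scale = 1.0 / (n_features * X.var())   # formula for gamma='scale'

print(f"pixel range: {X.min()} to {X.max()}, X.var() = {X.var():.2f}")
print(f"gamma='auto'  -> {gamma_auto:.6f}")
print(f"gamma='scale' -> {gamma_scale:.6f}")
```

Because the pixel values are not normalized to the 0 to 1 range, the variance term has a large effect on which of the two settings yields the bigger gamma.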

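In the second case ('auto'), gamma is higher. Note that the code loads scikit-learn's 8x8 digits dataset (load_digits), not MNIST proper, and its pixel values range from 0 to 16 rather than 0 to 1. X.var() is therefore well above 1, so 'scale' gives a much smaller gamma than 'auto'. With 'auto', the kernel is so narrow on this unscaled data that most test points are far from every support vector; their kernel values are close to zero, and the prediction collapses onto a single class (here, 5). With 'scale', gamma is adjusted for the variance of the features, the decision boundary is smoother, and the results are better.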
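To compare the two settings side by side on the question's data, a sketch along these lines could be used. It keeps the question's unshuffled 80/20 split, but predicts the whole test set in one call instead of looping over samples; the `accuracies` dictionary is just an illustrative name.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = datasets.load_digits()
X = digits.images.reshape(len(digits.images), -1)
y = digits.target

# Same unshuffled 80/20 split as in the question.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

accuracies = {}
for gamma in ("auto", "scale"):
    clf = SVC(gamma=gamma).fit(X_train, y_train)
    y_pred = clf.predict(X_test)  # predict the whole test set in one call
    accuracies[gamma] = accuracy_score(y_test, y_pred)
    print(f"gamma={gamma!r}: accuracy = {accuracies[gamma]:.3f}")
```

If 'auto' is required for some reason, rescaling the pixel values first (for example dividing by 16) is one option worth trying, since that changes the distances the kernel sees.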
