Unstable accuracy of a Gaussian Mixture Model classifier in sklearn

1 answer

User

#1 · Posted 2024-06-28 06:45:06

In short: you should simply not use GMM for classification.

In more detail:

From the answer in a related thread, Multiclass classification using Gaussian Mixture Models with scikit learn (emphasis in the original):

Gaussian Mixture is not a classifier. It is a density estimation method, and expecting that its components will magically align with your classes is not a good idea. [...] GMM simply tries to fit mixture of Gaussians into your data, but there is nothing forcing it to place them according to the labeling (which is not even provided in the fit call). From time to time this will work - but only for trivial problems, where classes are so well separated that even Naive Bayes would work, in general however it is simply invalid tool for the problem.

And from a comment by the respondent himself (again, emphasis in the original):

As stated in the answer - GMM is not a classifier, so asking if you are using "GMM classifier" correctly is impossible to answer. Using GMM as a classifier is incorrect by definition, there is no "valid" way of using it in such a problem as it is not what this model is designed to do. What you could do is to build a proper generative model per class. In other words construct your own classifier where you fit one GMM per label and then use assigned probability to do actual classification. Then it is a proper classifier. See github.com/scikit-learn/scikit-learn/pull/2468
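The "one GMM per label" idea from the comment above could look roughly like the following. This is only a sketch under assumptions, not the code from the linked pull request: the class name PerClassGMMClassifier, the iris dataset, the train/test split and the choice of one component per class are all hypothetical and used purely for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split


class PerClassGMMClassifier:
    """Hypothetical helper: one GaussianMixture per label, combined via Bayes' rule."""

    def __init__(self, n_components=1, random_state=0):
        self.n_components = n_components
        self.random_state = random_state

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.models_ = []
        self.log_priors_ = []
        for label in self.classes_:
            X_label = X[y == label]
            # Density model p(x | class), fitted only on this class's samples.
            gmm = GaussianMixture(
                n_components=self.n_components,
                random_state=self.random_state,
            )
            gmm.fit(X_label)
            self.models_.append(gmm)
            # Class prior p(class), estimated from label frequencies.
            self.log_priors_.append(np.log(len(X_label) / len(X)))
        return self

    def predict(self, X):
        # score_samples returns log p(x | class); adding the log prior gives
        # the unnormalized log posterior for each class.
        log_joint = np.column_stack([
            gmm.score_samples(X) + log_prior
            for gmm, log_prior in zip(self.models_, self.log_priors_)
        ])
        return self.classes_[np.argmax(log_joint, axis=1)]


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf = PerClassGMMClassifier(n_components=1).fit(X_train, y_train)
accuracy = float(np.mean(clf.predict(X_test) == y_test))
print(f"test accuracy: {accuracy:.3f}")
```

Here the labels are used explicitly to split the training data, so each mixture is tied to a class by construction rather than by luck.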

(For what it's worth, you may want to notice that the respondent is a research scientist at DeepMind, and the first person to be awarded the machine-learning gold badge.)

To elaborate further (which is also why I did not simply flag the question as a duplicate):

It is true that the scikit-learn documentation contains a post titled GMM classification:

Demonstration of Gaussian mixture models for classification.

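I guess this post did not exist in 2017, when the responses above were written. But if you dig into the code it provides, you will realize that the GMM models are actually used there in the way proposed by lejlot above: there is no statement of the form classifier.fit(X_train, y_train); every usage is of the form classifier.fit(X_train), i.e. without the actual labels.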

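This is exactly what we would expect from a clustering-like algorithm (which is indeed what GMM is), and not from a classifier. It is also true that scikit-learn offers an option to pass the labels to the GMM fit method: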

fit(self, X, y=None)

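You have actually used this option here (again, it probably did not exist back in 2017, as the response above implies). However, given what we know about GMMs and their usage, it is not exactly clear what this parameter is there for (and, permit me to say, scikit-learn has its share of practices that may look sensible from a purely programming perspective but make very little sense from a modeling perspective).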
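For what it is worth, the scikit-learn documentation for GaussianMixture.fit describes y as ignored and present only for API consistency. A minimal sketch to check this, assuming the iris dataset as a stand-in for the data in the question (the variable names are illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X, y = load_iris(return_X_y=True)

# Same hyperparameters and seed; the only difference is whether y is passed.
gmm_without_labels = GaussianMixture(n_components=3, random_state=0).fit(X)
gmm_with_labels = GaussianMixture(n_components=3, random_state=0).fit(X, y)

# If y were used in fitting, the learned parameters would be expected to differ.
same_means = np.allclose(gmm_without_labels.means_, gmm_with_labels.means_)
same_weights = np.allclose(gmm_without_labels.weights_, gmm_with_labels.weights_)
print("identical fitted means:", same_means)
print("identical fitted weights:", same_weights)
```

If both comparisons agree, passing y_train to fit changes nothing about the fitted model, which is consistent with GMM being a density estimator rather than a classifier.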

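A final word: although fixing the random seed (as suggested in the comments) may appear to "work", trusting a "classifier" whose accuracy ranges between 0.4 and 0.7 depending on the random seed is arguably not a good idea.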
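A rough way to see this seed dependence for yourself is to refit an unsupervised GMM under several seeds and naively compare component indices with class labels. The sketch below assumes the iris dataset and random initialization (init_params="random"); neither comes from the original question, and the exact numbers will depend on the data and the scikit-learn version.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X, y = load_iris(return_X_y=True)

accuracies = []
for seed in range(10):
    gmm = GaussianMixture(n_components=3, init_params="random", random_state=seed)
    component_ids = gmm.fit_predict(X)
    # Naively treat component indices as class labels. Nothing ties
    # component k to class k, so this score can change from seed to seed.
    accuracies.append(float(np.mean(component_ids == y)))

print("min accuracy over seeds:", min(accuracies))
print("max accuracy over seeds:", max(accuracies))
```

A wide gap between the minimum and maximum is a sign that the score reflects the arbitrary component ordering and initialization, not a learned mapping from features to labels.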
