When faced with a large set of correlated variables, principal
components allow us to summarize this set with a smaller number of
representative variables that collectively explain most of the
variability in the original set. The principal component directions
are presented in Section 6.3.1 as directions in feature space along
which the original data are highly variable. These directions also
define lines and subspaces that are as close as possible to the data
cloud. To perform principal components regression, we simply use
principal components as predictors in a regression model in place of
the original larger set of variables.
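The idea of principal components regression described above can be sketched with plain numpy: compute the components of a centered design matrix via the SVD, then regress the response on the first few component scores instead of the original predictors. The data here are synthetic and the choice of k = 2 components is an assumption for illustration, not part of the quoted text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 100, 5, 2

# Correlated predictors: a few latent factors mixed into p columns, plus noise
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, p)) + 0.1 * rng.normal(size=(n, p))
y = Z[:, 0] + 0.05 * rng.normal(size=n)

# Center the predictors, then get principal component directions from the SVD
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:k].T          # first k principal component scores

# Regress y on the k scores in place of the p original predictors
beta, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
y_hat = scores @ beta + y.mean()
```

Because the response is driven by the same latent factors that dominate the variance of X, a regression on just two scores recovers the fit that would otherwise need all five predictors.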
Principal component analysis (PCA) refers to the process by which
principal components are computed, and the subsequent use of these
components in understanding the data. PCA is an unsupervised approach,
since it involves only a set of features X1, X2,...,Xp, and no
associated response Y. Apart from producing derived variables for use
in supervised learning problems, PCA also serves as a tool for data
visualization (visualization of the observations or visualization of
the variables). We now discuss PCA in greater detail, focusing on the
use of PCA as a tool for unsupervised data exploration, in keeping
with the topic of this chapter.
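As the passage notes, PCA is unsupervised: it uses only the feature matrix, with no response. A minimal numpy sketch (synthetic data, assumed for illustration) shows the two quantities PCA produces for exploration: the proportion of variance explained by each component, and the scores used to plot the observations.

```python
import numpy as np

rng = np.random.default_rng(1)
# 200 observations of 4 correlated features (random mixing induces correlation)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))

# Center, then take the SVD of the centered data matrix
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

var_explained = s**2 / np.sum(s**2)   # proportion of variance per component
scores = Xc @ Vt.T                    # coordinates of each observation
```

Plotting the first two columns of `scores` against each other gives the usual two-dimensional PCA view of the observations; `var_explained` tells you how faithful that view is.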
Feature extraction reduces the dimensionality of the data. This is usually done to create a smaller system (to reduce computational overhead) and/or to reduce noise (to obtain a cleaner signal).
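The noise-reduction point can be made concrete: if the data are (approximately) low-rank signal plus noise, reconstructing them from only the top components discards much of the noise. This is a sketch on synthetic data; the rank k = 2 and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 100, 10, 2

# Low-rank "signal" plus additive noise
signal = rng.normal(size=(n, k)) @ rng.normal(size=(k, p))
X = signal + 0.5 * rng.normal(size=(n, p))

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
# Keep only the top-k components: a rank-k reconstruction of the data
X_denoised = (U[:, :k] * s[:k]) @ Vt[:k] + X.mean(axis=0)

err_raw = np.mean((X - signal) ** 2)           # noise in the raw data
err_den = np.mean((X_denoised - signal) ** 2)  # error after truncation
```

The truncated reconstruction is closer to the underlying signal than the raw data, because the discarded components carry mostly noise.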
An Introduction to Statistical Learning has a concise introduction to unsupervised learning (p. 373), which I think is what you are looking for. It uses PCA as its example; the passage quoted above is from An Introduction to Statistical Learning.
My go-to resource is The Elements of Statistical Learning (which is freely available here). From page 534 onward there is a detailed discussion of PCA, applying it to handwriting data to make the problem more tractable.