What is PCA?

Figures are cited from the following article, which is recommended reading: A step by step explanation of Principal Component Analysis

PCA (Principal Component Analysis) is a dimensionality-reduction method.
It reduces the number of variables in a data set by using one or more components to represent the original data.

Principal components are constructed as linear combinations of the initial variables.

Geometrically speaking, principal components are new axes chosen so that the projections of the data points onto them are as spread out as possible.

The more spread out the projections are, the more variance the axis carries and the more information it keeps. This is how PCA reduces the dimensionality while preserving as much information as possible.

Step 1: Standardization

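This step transforms all the variables to the same scale, typically by subtracting each variable's mean and dividing by its standard deviation. It is needed because PCA is sensitive to the variances of the initial variables: a variable with a much larger range would otherwise dominate the principal components.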
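As a rough illustration of this step, here is a minimal NumPy sketch. The data values and the names `X` and `Z` are made up for the example, and the sample standard deviation (`ddof=1`) is one common choice, not the only one.

```python
import numpy as np

# Hypothetical toy data: 5 samples (rows), 2 variables (columns)
# measured on very different scales.
X = np.array([
    [170.0, 65000.0],
    [160.0, 48000.0],
    [180.0, 82000.0],
    [175.0, 71000.0],
    [165.0, 54000.0],
])

# Standardize each column: subtract its mean, divide by its standard deviation.
mean = X.mean(axis=0)
std = X.std(axis=0, ddof=1)  # sample standard deviation (an assumption of this sketch)
Z = (X - mean) / std

print(Z.mean(axis=0))          # each column now has mean (approximately) 0
print(Z.std(axis=0, ddof=1))   # and standard deviation 1
```

After this transformation both variables contribute on the same scale, regardless of their original units.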

Step 2: Compute the Covariance Matrix

This matrix reflects the relationships among all pairs of variables. A high correlation between two variables means that they carry redundant information.
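A small sketch of this step, assuming standardized data stored with one sample per row. The random data and variable names are hypothetical; the third variable is deliberately built as a near copy of the first, so that the redundancy shows up in the matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 samples, 3 variables.
# The third variable is almost a copy of the first, so it is redundant.
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, b, a + 0.1 * rng.normal(size=100)])

# Standardize (Step 1).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Covariance matrix of the columns; rowvar=False means "columns are variables".
C = np.cov(Z, rowvar=False)

# The same matrix written out by hand: Z^T Z / (n - 1) for centered data.
C_manual = Z.T @ Z / (Z.shape[0] - 1)

print(np.round(C, 3))
```

Because the data is standardized, the diagonal entries are 1 and the off-diagonal entries are the correlations between pairs of variables. The entry for the first and third variables is close to 1, which signals the redundancy.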

Step 3: Compute the eigenvectors and eigenvalues of the covariance matrix

The eigenvectors of the covariance matrix give the directions of the principal components: the eigenvector with the largest eigenvalue is the direction of greatest variance, and each eigenvalue is the amount of variance carried by its principal component.
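The following sketch shows one way to do this with NumPy, using a small hand-written covariance matrix as a stand-in for real data. `np.linalg.eigh` is used because a covariance matrix is symmetric; it returns eigenvalues in ascending order, so they are reversed here.

```python
import numpy as np

# Hypothetical covariance matrix of two standardized, positively correlated variables.
C = np.array([
    [1.0, 0.8],
    [0.8, 1.0],
])

# eigh is intended for symmetric matrices and returns ascending eigenvalues.
eigvals, eigvecs = np.linalg.eigh(C)

# Reorder so the component with the most variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]   # column i is the direction of the i-th principal component

# Share of the total variance carried by each component.
explained_ratio = eigvals / eigvals.sum()

print(eigvals)           # eigenvalues 1.8 and 0.2
print(explained_ratio)   # 90% and 10% of the variance
```

The sign of each eigenvector is arbitrary, so a direction and its negation describe the same principal component.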

Step 4: Keep p components

Rank the eigenvalues from highest to lowest. For example, PC1 may carry 95% of the variance and PC2 the remaining 5%. We can keep all the components or discard the less significant ones.
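Putting the four steps together, here is a minimal end-to-end sketch. The random data, the 95% threshold, and the names `p`, `W` and `scores` are assumptions made for the example; choosing p by cumulative explained variance is only one common rule of thumb.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 samples, 3 variables, the third nearly a copy of the first.
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, b, a + 0.1 * rng.normal(size=100)])

# Step 1: standardize.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Step 2: covariance matrix.
C = np.cov(Z, rowvar=False)

# Step 3: eigenvalues and eigenvectors, sorted from highest to lowest variance.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 4: keep the smallest p whose cumulative variance share reaches the threshold.
threshold = 0.95  # assumed cutoff for this sketch
cumulative = np.cumsum(eigvals) / eigvals.sum()
p = int(np.searchsorted(cumulative, threshold) + 1)

W = eigvecs[:, :p]   # the p kept directions, one per column
scores = Z @ W       # the data expressed in the p kept components

print(p, scores.shape)
```

Each column of `scores` has a sample variance equal to the corresponding eigenvalue, which is why ranking the eigenvalues tells us how much information each kept component preserves.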
