Derivation (method of Lagrange multipliers)
First step:
Find the linear combination $\bm \alpha'_k \bm x$ that maximises $\text{var}(\bm \alpha'_k \bm x) = \bm \alpha'_k \bm \Sigma \bm \alpha_k$, where $\bm \Sigma$ is the covariance matrix of $\bm x$.
Choose the normalisation constraint $\bm \alpha'_k \bm \alpha_k = 1$ and maximise the Lagrangian
$$\max_{\bm \alpha_k} \ \bm \alpha'_k \bm \Sigma \bm \alpha_k - \lambda (\bm \alpha'_k \bm \alpha_k - 1)$$
Differentiating with respect to $\bm \alpha_k$ and setting the result to zero gives
$$\bm \Sigma \bm \alpha_k = \lambda \bm \alpha_k \quad (\text{eigenvector equation})$$
Since $\text{var}(\bm \alpha'_k \bm x) = \bm \alpha'_k \bm \Sigma \bm \alpha_k = \lambda \bm \alpha_k' \bm \alpha_k = \lambda$, the maximum is attained when $\lambda$ is the largest eigenvalue of $\bm \Sigma$. Hence the first principal component is $\bm e_1' \bm x$, where $\bm e_1$ is the unit eigenvector belonging to that largest eigenvalue.
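The first step can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the 3×3 matrix `sigma` and the helper name `first_principal_direction` are hypothetical and chosen only for illustration. It confirms that the variance at the top eigenvector equals the largest eigenvalue and that randomly drawn unit vectors do not exceed it.

```python
import numpy as np

def first_principal_direction(sigma):
    """Return (lambda_1, e_1): the largest eigenvalue of sigma and its unit eigenvector."""
    # eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return eigvals[-1], eigvecs[:, -1]

# Hypothetical covariance matrix (symmetric, positive definite), for illustration only.
sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

lam1, e1 = first_principal_direction(sigma)

# var(alpha' x) = alpha' Sigma alpha, which equals lambda at a unit eigenvector.
var_e1 = e1 @ sigma @ e1

# Draw random unit vectors and evaluate alpha' Sigma alpha for each of them.
rng = np.random.default_rng(0)
trials = rng.normal(size=(1000, 3))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
max_random_var = np.max(np.einsum("ij,jk,ik->i", trials, sigma, trials))

print("largest eigenvalue:", lam1)
print("variance along e1:", var_e1)
print("best variance among random unit vectors:", max_random_var)
```

The random search is only a sanity check, not a proof: it illustrates that no sampled direction beats $\bm e_1$, which is what the Lagrangian argument establishes in general.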
Second step:
Additional constraint: $\text{cov}(\bm \alpha'_1 \bm x, \bm \alpha'_2 \bm x) = \bm \alpha'_2 \bm \Sigma \bm \alpha_1 = \lambda_1 \bm \alpha'_2 \bm \alpha_1 = 0$
$$\begin{aligned}
\max_{\bm \alpha_2} \ & \bm \alpha'_2 \bm \Sigma \bm \alpha_2 - \lambda_2 (\bm \alpha'_2 \bm \alpha_2 - 1) - \phi \bm \alpha'_2 \bm \alpha_1\\
\frac{\partial}{\partial \bm \alpha_2}: \ & \bm \Sigma \bm \alpha_2 - \lambda_2 \bm \alpha_2 - \phi \bm \alpha_1 = 0\\
\text{left-multiply by } \bm \alpha_1': \ & \bm \alpha_1' \bm \Sigma \bm \alpha_2 - \lambda_2 \bm \alpha_1' \bm \alpha_2 - \phi \bm \alpha_1' \bm \alpha_1 = 0\\
\Rightarrow \ & 0 - 0 - \phi \bm \alpha_1' \bm \alpha_1 = 0 \\
\Rightarrow \ & \phi = 0 \\
\Rightarrow \ & \bm \Sigma \bm \alpha_2 - \lambda_2 \bm \alpha_2 = 0
\end{aligned}$$
Here the factor of 2 from differentiating the quadratic terms has been absorbed into $\phi$. The first two terms of the third line vanish because of the additional constraint, and $\bm \alpha_1' \bm \alpha_1 = 1$ forces $\phi = 0$. So $\bm \alpha_2$ is again an eigenvector of $\bm \Sigma$ with $\text{var}(\bm \alpha'_2 \bm x) = \lambda_2$. Since it must be orthogonal to $\bm \alpha_1$, the maximum is attained at the eigenvector belonging to the second-largest eigenvalue.
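The second step can be illustrated in the same spirit. This is a minimal sketch, assuming NumPy and a hypothetical covariance matrix `sigma` with distinct eigenvalues; the variable names are illustrative. It checks that the two leading eigenvectors give uncorrelated components, that the multiplier $\phi$ recovered from the stationarity condition is zero, and that unit vectors orthogonal to $\bm \alpha_1$ do not exceed the second-largest eigenvalue.

```python
import numpy as np

# Hypothetical covariance matrix (symmetric, positive definite), for illustration only.
sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

# Sort eigenpairs so that column 0 belongs to the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(sigma)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
a1, a2 = eigvecs[:, order[0]], eigvecs[:, order[1]]

# cov(a1' x, a2' x) = a2' Sigma a1 should vanish.
cov_12 = a2 @ sigma @ a1

# From Sigma a2 - lam2 a2 - phi a1 = 0, left-multiplying by a1' gives phi.
phi = a1 @ (sigma @ a2 - lam[1] * a2)

# Random unit vectors orthogonal to a1: project out a1, then normalise.
rng = np.random.default_rng(1)
v = rng.normal(size=(1000, 3))
v -= np.outer(v @ a1, a1)
v /= np.linalg.norm(v, axis=1, keepdims=True)
max_orth_var = np.max(np.einsum("ij,jk,ik->i", v, sigma, v))

print("covariance of the two components:", cov_12)
print("recovered phi:", phi)
print("second-largest eigenvalue:", lam[1])
print("best variance among vectors orthogonal to a1:", max_orth_var)
```

Up to floating-point error, `cov_12` and `phi` come out as zero, matching the derivation's conclusion that the orthogonality constraint is not binding at the solution.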
References
Frank Wood, Principal Component Analysis, Columbia University http://www.stat.columbia.edu/~fwood/Teaching/w4315/Fall2009/pca.pdf