機器學習(3.2)--PCA降維鳶尾花數據降維演示

PCA(Principal components analysis)也稱主成分分析,是機器學習中降維的一種方法
本例使用數據集簡介:以鳶尾花的特徵作爲數據,共有數據集包含150個數據集,
分爲3類setosa(山鳶尾), versicolor(變色鳶尾), virginica(維吉尼亞鳶尾)
每類50個數據,每條數據包含4個屬性數據 和 一個類別數據.

本例通過這150個數據來演示降維後的最維效果,
因爲每個鳶尾花的特徵數據有4個屬性,我們想看這150點的分佈情況沒辦法繪製圖像
因此我們可以通過PCA降維,4維屬性降爲2維,就可以在二維平面上表示出來。
在最後plt出圖時點的分佈的界線還是比較清晰的,其實最後的這個二維平面散點圖,也可以幫助理解KNN算法

數據比較多,不太容易看出數據差別
如果想從數據變化上理解PCA降維,及更詳細的PCA計算流程
點擊查看 機器學習(3.1)--PCA降維基本原理

同時有另一篇文章同樣使用鳶尾花的特徵作爲數據,實現鄰近算法(KNN)

點擊查看  機器學習(2)--鄰近算法(KNN)

# -*- coding:utf-8 -*-  
data='''5.1,3.5,1.4,0.2,Iris-setosa
        4.9,3.0,1.4,0.2,Iris-setosa
        4.7,3.2,1.3,0.2,Iris-setosa
        4.6,3.1,1.5,0.2,Iris-setosa
        5.0,3.6,1.4,0.2,Iris-setosa
        5.4,3.9,1.7,0.4,Iris-setosa
        4.6,3.4,1.4,0.3,Iris-setosa
        5.0,3.4,1.5,0.2,Iris-setosa
        4.4,2.9,1.4,0.2,Iris-setosa
        4.9,3.1,1.5,0.1,Iris-setosa
        5.4,3.7,1.5,0.2,Iris-setosa
        4.8,3.4,1.6,0.2,Iris-setosa
        4.8,3.0,1.4,0.1,Iris-setosa
        4.3,3.0,1.1,0.1,Iris-setosa
        5.8,4.0,1.2,0.2,Iris-setosa
        5.7,4.4,1.5,0.4,Iris-setosa
        5.4,3.9,1.3,0.4,Iris-setosa
        5.1,3.5,1.4,0.3,Iris-setosa
        5.7,3.8,1.7,0.3,Iris-setosa
        5.1,3.8,1.5,0.3,Iris-setosa
        5.4,3.4,1.7,0.2,Iris-setosa
        5.1,3.7,1.5,0.4,Iris-setosa
        4.6,3.6,1.0,0.2,Iris-setosa
        5.1,3.3,1.7,0.5,Iris-setosa
        4.8,3.4,1.9,0.2,Iris-setosa
        5.0,3.0,1.6,0.2,Iris-setosa
        5.0,3.4,1.6,0.4,Iris-setosa
        5.2,3.5,1.5,0.2,Iris-setosa
        5.2,3.4,1.4,0.2,Iris-setosa
        4.7,3.2,1.6,0.2,Iris-setosa
        4.8,3.1,1.6,0.2,Iris-setosa
        5.4,3.4,1.5,0.4,Iris-setosa
        5.2,4.1,1.5,0.1,Iris-setosa
        5.5,4.2,1.4,0.2,Iris-setosa
        4.9,3.1,1.5,0.1,Iris-setosa
        5.0,3.2,1.2,0.2,Iris-setosa
        5.5,3.5,1.3,0.2,Iris-setosa
        4.9,3.1,1.5,0.1,Iris-setosa
        4.4,3.0,1.3,0.2,Iris-setosa
        5.1,3.4,1.5,0.2,Iris-setosa
        5.0,3.5,1.3,0.3,Iris-setosa
        4.5,2.3,1.3,0.3,Iris-setosa
        4.4,3.2,1.3,0.2,Iris-setosa
        5.0,3.5,1.6,0.6,Iris-setosa
        5.1,3.8,1.9,0.4,Iris-setosa
        4.8,3.0,1.4,0.3,Iris-setosa
        5.1,3.8,1.6,0.2,Iris-setosa
        4.6,3.2,1.4,0.2,Iris-setosa
        5.3,3.7,1.5,0.2,Iris-setosa
        5.0,3.3,1.4,0.2,Iris-setosa
        7.0,3.2,4.7,1.4,Iris-versicolor
        6.4,3.2,4.5,1.5,Iris-versicolor
        6.9,3.1,4.9,1.5,Iris-versicolor
        5.5,2.3,4.0,1.3,Iris-versicolor
        6.5,2.8,4.6,1.5,Iris-versicolor
        5.7,2.8,4.5,1.3,Iris-versicolor
        6.3,3.3,4.7,1.6,Iris-versicolor
        4.9,2.4,3.3,1.0,Iris-versicolor
        6.6,2.9,4.6,1.3,Iris-versicolor
        5.2,2.7,3.9,1.4,Iris-versicolor
        5.0,2.0,3.5,1.0,Iris-versicolor
        5.9,3.0,4.2,1.5,Iris-versicolor
        6.0,2.2,4.0,1.0,Iris-versicolor
        6.1,2.9,4.7,1.4,Iris-versicolor
        5.6,2.9,3.6,1.3,Iris-versicolor
        6.7,3.1,4.4,1.4,Iris-versicolor
        5.6,3.0,4.5,1.5,Iris-versicolor
        5.8,2.7,4.1,1.0,Iris-versicolor
        6.2,2.2,4.5,1.5,Iris-versicolor
        5.6,2.5,3.9,1.1,Iris-versicolor
        5.9,3.2,4.8,1.8,Iris-versicolor
        6.1,2.8,4.0,1.3,Iris-versicolor
        6.3,2.5,4.9,1.5,Iris-versicolor
        6.1,2.8,4.7,1.2,Iris-versicolor
        6.4,2.9,4.3,1.3,Iris-versicolor
        6.6,3.0,4.4,1.4,Iris-versicolor
        6.8,2.8,4.8,1.4,Iris-versicolor
        6.7,3.0,5.0,1.7,Iris-versicolor
        6.0,2.9,4.5,1.5,Iris-versicolor
        5.7,2.6,3.5,1.0,Iris-versicolor
        5.5,2.4,3.8,1.1,Iris-versicolor
        5.5,2.4,3.7,1.0,Iris-versicolor
        5.8,2.7,3.9,1.2,Iris-versicolor
        6.0,2.7,5.1,1.6,Iris-versicolor
        5.4,3.0,4.5,1.5,Iris-versicolor
        6.0,3.4,4.5,1.6,Iris-versicolor
        6.7,3.1,4.7,1.5,Iris-versicolor
        6.3,2.3,4.4,1.3,Iris-versicolor
        5.6,3.0,4.1,1.3,Iris-versicolor
        5.5,2.5,4.0,1.3,Iris-versicolor
        5.5,2.6,4.4,1.2,Iris-versicolor
        6.1,3.0,4.6,1.4,Iris-versicolor
        5.8,2.6,4.0,1.2,Iris-versicolor
        5.0,2.3,3.3,1.0,Iris-versicolor
        5.6,2.7,4.2,1.3,Iris-versicolor
        5.7,3.0,4.2,1.2,Iris-versicolor
        5.7,2.9,4.2,1.3,Iris-versicolor
        6.2,2.9,4.3,1.3,Iris-versicolor
        5.1,2.5,3.0,1.1,Iris-versicolor
        5.7,2.8,4.1,1.3,Iris-versicolor
        6.3,3.3,6.0,2.5,Iris-virginica
        5.8,2.7,5.1,1.9,Iris-virginica
        7.1,3.0,5.9,2.1,Iris-virginica
        6.3,2.9,5.6,1.8,Iris-virginica
        6.5,3.0,5.8,2.2,Iris-virginica
        7.6,3.0,6.6,2.1,Iris-virginica
        4.9,2.5,4.5,1.7,Iris-virginica
        7.3,2.9,6.3,1.8,Iris-virginica
        6.7,2.5,5.8,1.8,Iris-virginica
        7.2,3.6,6.1,2.5,Iris-virginica
        6.5,3.2,5.1,2.0,Iris-virginica
        6.4,2.7,5.3,1.9,Iris-virginica
        6.8,3.0,5.5,2.1,Iris-virginica
        5.7,2.5,5.0,2.0,Iris-virginica
        5.8,2.8,5.1,2.4,Iris-virginica
        6.4,3.2,5.3,2.3,Iris-virginica
        6.5,3.0,5.5,1.8,Iris-virginica
        7.7,3.8,6.7,2.2,Iris-virginica
        7.7,2.6,6.9,2.3,Iris-virginica
        6.0,2.2,5.0,1.5,Iris-virginica
        6.9,3.2,5.7,2.3,Iris-virginica
        5.6,2.8,4.9,2.0,Iris-virginica
        7.7,2.8,6.7,2.0,Iris-virginica
        6.3,2.7,4.9,1.8,Iris-virginica
        6.7,3.3,5.7,2.1,Iris-virginica
        7.2,3.2,6.0,1.8,Iris-virginica
        6.2,2.8,4.8,1.8,Iris-virginica
        6.1,3.0,4.9,1.8,Iris-virginica
        6.4,2.8,5.6,2.1,Iris-virginica
        7.2,3.0,5.8,1.6,Iris-virginica
        7.4,2.8,6.1,1.9,Iris-virginica
        7.9,3.8,6.4,2.0,Iris-virginica
        6.4,2.8,5.6,2.2,Iris-virginica
        6.3,2.8,5.1,1.5,Iris-virginica
        6.1,2.6,5.6,1.4,Iris-virginica
        7.7,3.0,6.1,2.3,Iris-virginica
        6.3,3.4,5.6,2.4,Iris-virginica
        6.4,3.1,5.5,1.8,Iris-virginica
        6.0,3.0,4.8,1.8,Iris-virginica
        6.9,3.1,5.4,2.1,Iris-virginica
        6.7,3.1,5.6,2.4,Iris-virginica
        6.9,3.1,5.1,2.3,Iris-virginica
        5.8,2.7,5.1,1.9,Iris-virginica
        6.8,3.2,5.9,2.3,Iris-virginica
        6.7,3.3,5.7,2.5,Iris-virginica
        6.7,3.0,5.2,2.3,Iris-virginica
        6.3,2.5,5.0,1.9,Iris-virginica
        6.5,3.0,5.2,2.0,Iris-virginica
        6.2,3.4,5.4,2.3,Iris-virginica
        5.9,3.0,5.1,1.8,Iris-virginica'''

import numpy as np
import matplotlib.pyplot as plt
data=data.replace(' ','').split('\n')
data=list(filter(lambda x: len(x)>0,data))
data=[x.split(',')[0:4] for x in data]#第五個數據是類別數據.在這裏沒有作用,直接捨棄
data=np.array(data).astype(np.float16)

#1、求平均值(dataAvg)
dataAvg=np.average(data,axis=0)
#2、求每個值與平均值的差(dataAdjust)
dataAdjust=dataAvg-data


#3、求協方差矩陣
#計算任意兩組數據協方差函數,
def covariance(getDataAdjust,index1,index2):
    x=getDataAdjust[:,index1:index1+1]
    y=getDataAdjust[:,index2:index2+1]
    n=x.shape[0]
    return (x*y).sum()/(n-1)

CovMatrix=[
            [covariance(dataAdjust,0,0),covariance(dataAdjust,0,1),covariance(dataAdjust,0,2),covariance(dataAdjust,0,3)]
           ,[covariance(dataAdjust,1,0),covariance(dataAdjust,1,1),covariance(dataAdjust,1,2),covariance(dataAdjust,1,3)]
           ,[covariance(dataAdjust,2,0),covariance(dataAdjust,2,1),covariance(dataAdjust,2,2),covariance(dataAdjust,2,3)]
           ,[covariance(dataAdjust,3,0),covariance(dataAdjust,3,1),covariance(dataAdjust,3,2),covariance(dataAdjust,3,3)]
           ]

#4、求協方差矩陣的特徵值與特徵向量
e1,e2=np.linalg.eig(CovMatrix)
print('------------------------')
print('協方差矩陣的特徵值:')
print(e1)#[4.22396988 0.24215651 0.07857844 0.02377251]
print('------------------------')
print('協方差矩陣的特徵向量:')
print(e2)

#5、降維
#我們要降到2維,在特值中,4.22396988 0.24215651 爲最大的兩個,因此我們特片向量的前兩列
finalX=e2[:,0:1]
finalY=e2[:,1:2]

#做矩陣乘法,
finalDataX=np.matmul(dataAdjust,e2[:,0:1])
finalDataY=np.matmul(dataAdjust,e2[:,1:2])


#6、繪製散點圖
# 用於定義X,Y軸的範圍
plt.axis([finalDataX.min().round(),finalDataX.max().round() 
          ,finalDataY.min().round(),finalDataY.max().round()])
plt.scatter(finalDataX[0:50],finalDataY[0:50], c='r') #前50個數據setosa(山鳶尾)的散點圖,用紅色表示
plt.scatter(finalDataX[50:100],finalDataY[50:100], c='g') #中50個數據versicolor(變色鳶尾)的散點圖,用綠色表示
plt.scatter(finalDataX[100:],finalDataY[100:], c='b') #後50個數據virginica(維吉尼亞鳶尾)的散點圖,用藍色表示
plt.show()


發佈了46 篇原創文章 · 獲贊 14 · 訪問量 2萬+
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章