PCA in Python

PCA (principal component analysis) is a dimensionality-reduction technique: it explains most of the variation in the data with a small number of variables, turning the originally correlated variables into uncorrelated, independent components.
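To make that claim concrete, here is a minimal sketch with synthetic data (not part of the original example): two correlated columns become essentially uncorrelated after the PCA transform.

# synthetic illustration of decorrelation (toy data, for illustration only)
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
x = rng.normal(size=200)
X = np.column_stack([x, 0.8 * x + rng.normal(scale=0.3, size=200)])  # two correlated columns

print(np.corrcoef(X, rowvar=False))   # large off-diagonal correlation
Z = PCA().fit_transform(X)
print(np.corrcoef(Z, rowvar=False))   # off-diagonal entries close to 0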

sklearn.decomposition.PCA(n_components=None, copy=True, whiten=False)

n_components: the number of components (features) to keep; by default all components are kept. An int keeps exactly that many components; the string 'mle' lets PCA choose the number automatically (by maximum-likelihood estimation) so that enough of the variance is retained.
copy: if True (the default), the original data is left unchanged; if False, the original data may be overwritten during fitting.
whiten: whitening; rescales the output so that every component has the same (unit) variance.
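The following toy sketch (illustrative data, not from the original post) shows these three arguments in use:

# illustrative use of the constructor arguments (toy data)
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(42)
X = rng.normal(size=(100, 6))

pca_int = PCA(n_components=2)                  # keep exactly 2 components
print(pca_int.fit_transform(X).shape)          # (100, 2)

pca_mle = PCA(n_components='mle')              # pick the number of components automatically
print(pca_mle.fit_transform(X).shape)

pca_white = PCA(n_components=2, whiten=True, copy=True)
Zw = pca_white.fit_transform(X)
print(Zw.var(axis=0))                          # whitened components have roughly unit variance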

#-*- coding: utf-8 -*-
# Principal component analysis: dimensionality reduction
import pandas as pd

# Parameter initialization
inputfile = 'E:/PythonMaterial/chapter4/demo/data/principal_component.xls'
outputfile = 'E:/PythonMaterial/chapter4/demo/data/dimention_reducted.xls' # path for the dimensionality-reduced data

data = pd.read_excel(inputfile, header=None) # read in the data

from sklearn.decomposition import PCA

pca = PCA()
pca.fit(data)
a = pca.components_ # the feature (eigen)vectors of each principal component
print(a)
b = pca.explained_variance_ratio_ # the proportion of variance explained by each component (its contribution rate)
print(" ")
print(b)
[[-0.56788461 -0.2280431  -0.23281436 -0.22427336 -0.3358618  -0.43679539
  -0.03861081 -0.46466998]
 [-0.64801531 -0.24732373  0.17085432  0.2089819   0.36050922  0.55908747
  -0.00186891 -0.05910423]
 [-0.45139763  0.23802089 -0.17685792 -0.11843804 -0.05173347 -0.20091919
  -0.00124421  0.80699041]
 [-0.19404741  0.9021939  -0.00730164 -0.01424541  0.03106289  0.12563004
   0.11152105 -0.3448924 ]
 [ 0.06133747  0.03383817 -0.12652433 -0.64325682  0.3896425   0.10681901
  -0.63233277 -0.04720838]
 [-0.02579655  0.06678747 -0.12816343  0.57023937  0.52642373 -0.52280144
  -0.31167833 -0.0754221 ]
 [ 0.03800378 -0.09520111 -0.15593386 -0.34300352  0.56640021 -0.18985251
   0.69902952 -0.04505823]
 [ 0.10147399 -0.03937889 -0.91023327  0.18760016 -0.06193777  0.34598258
   0.02090066 -0.02137393]]

[  7.74011263e-01   1.56949443e-01   4.27594216e-02   2.40659228e-02
   1.50278048e-03   4.10990447e-04   2.07718405e-04   9.24594471e-05]
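One common way to use these contribution rates (not part of the original script) is to take their cumulative sum and keep the smallest number of components that reaches a target, say 95% of the variance. The sketch below assumes the fitted pca object from the code above.

import numpy as np

cum_ratio = np.cumsum(pca.explained_variance_ratio_)
k = int(np.argmax(cum_ratio >= 0.95) + 1)   # smallest k that explains at least 95% of the variance
print(k, cum_ratio)                         # with the rates printed above, k is 3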
pca = PCA(n_components='mle')
newData = pca.fit_transform(data) # refit and reduce the dimensionality
pd.DataFrame(newData).to_excel(outputfile) # save the result
pca.inverse_transform(newData) # if needed, inverse_transform() can map the data back to the original space
print(newData)
[[ -8.19133694e+00  -1.69040279e+01   3.90991029e+00   7.48106686e+00
    5.16142203e-01]
 [ -2.85274026e-01   6.48074989e+00  -4.62870368e+00   5.01369607e+00
   -1.65278935e+00]
 [  2.37073907e+01   2.85245701e+00  -4.96523096e-01  -1.57285727e+00
   -2.09522277e-01]
 [  1.44320264e+01  -2.29917325e+00  -1.50272151e+00  -1.30763061e+00
    1.54047215e+00]
 [ -5.43045680e+00  -1.00070408e+01   9.52086923e+00  -5.63779544e+00
   -9.21974743e-01]
 [ -2.41595590e+01   9.36428589e+00   7.26578565e-01  -1.98622218e+00
   -9.98528392e-01]
 [  3.66134607e+00   7.60198615e+00  -2.36439873e+00   4.21318409e-02
   -8.48196502e-02]
 [ -1.39676121e+01  -1.38912398e+01  -6.44917778e+00  -2.92916826e+00
   -1.91994563e-01]
 [ -4.08809359e+01   1.32568529e+01   4.16539368e+00   1.21239981e+00
    1.33543444e+00]
 [  1.74887665e+00   4.23112299e+00  -5.89809954e-01  -1.57477365e+00
    4.10612180e-01]
 [  2.19432196e+01   2.36645883e+00   1.33203832e+00   4.39763606e+00
   -2.61113312e-02]
 [  3.67086807e+01   6.00536554e+00   3.97183515e+00  -1.54808393e+00
    3.00572729e-01]
 [ -3.28750663e+00  -4.86380886e+00   1.00424688e+00   8.51193030e-01
   -6.27109498e-01]
 [ -5.99885871e+00  -4.19398863e+00  -8.59953736e+00  -2.44159234e+00
    6.09616105e-01]]
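As a follow-up to the inverse_transform() call mentioned above, this sketch (assuming the data, pca and newData objects from the code above) maps the reduced data back to the original 8-column space and checks the reconstruction error:

import numpy as np

restored = pca.inverse_transform(newData)         # back to the original feature space
print(restored.shape)                             # same shape as the original data
print(np.abs(np.asarray(data) - restored).max())  # error introduced by the discarded components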