An Improvement on the Normal Equation Method: A Python Implementation of Ridge Regression
Preface: Compared with the normal equation method, ridge regression simply adds a ridge coefficient, whose purpose is to prevent the problem of parameters being inestimable when the data matrix is not invertible. For the detailed theory behind ridge regression, see my earlier blog post.
1. Deriving the Ridge Regression Formula
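The body of this section did not survive extraction (it was likely an image in the original post). As a brief sketch, ridge regression augments the ordinary least-squares cost with an L2 penalty, which leads to the closed-form solution used in the code below:

```latex
% Ridge regression objective: least squares plus an L2 penalty
J(w) = (y - Xw)^T (y - Xw) + \lambda w^T w
% Setting the gradient with respect to w to zero:
\nabla_w J = -2 X^T (y - Xw) + 2\lambda w = 0
% yields the closed-form ridge solution:
\hat{w} = (X^T X + \lambda I)^{-1} X^T y
```

Adding \(\lambda I\) to \(X^T X\) guarantees the matrix is invertible for any \(\lambda > 0\), which is exactly what the normal equation method lacks.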
2. Python Implementation of the Ridge Regression Algorithm
Here the ridge coefficient defaults to 0.2, which works well for the data used in this program. In real applications you need to combine it with cross-validation, trying multiple values to find a suitable ridge coefficient.
import numpy as np
from numpy import genfromtxt

# Load the data
data = genfromtxt(r'longley.csv', delimiter=',')
# Split into features and target (skip the header row)
x_data = data[1:, 2:]
y_data = data[1:, 1, np.newaxis]
# Add a bias (intercept) column of ones to the samples
X_data = np.concatenate((np.ones((x_data.shape[0], 1)), x_data), axis=1)

# Solve for the linear model parameters with ridge regression
def weights(xArr, yArr, lam=0.2):
    xMat = np.mat(xArr)
    yMat = np.mat(yArr)
    xTx = xMat.T * xMat
    # Add the ridge term lam * I so the matrix becomes invertible
    rxTx = xTx + np.eye(xMat.shape[1]) * lam
    if np.linalg.det(rxTx) == 0.0:
        print('Matrix is singular; cannot solve for the weights (parameters)')
        return
    ws = rxTx.I * xMat.T * yMat
    return ws

ws = weights(X_data, y_data)
print(ws)
# Compute the predicted values
print(np.mat(X_data) * np.mat(ws))
print(y_data)  # true values
print(y_data - (np.mat(X_data) * np.mat(ws)))  # residuals: true minus predicted
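As a minimal sketch of the cross-validation step mentioned above (this is not from the original post; it uses synthetic data and plain NumPy arrays rather than longley.csv, and the function names are my own):

```python
import numpy as np

def ridge_weights(X, y, lam):
    # Closed-form ridge solution: (X^T X + lam*I)^{-1} X^T y
    XtX = X.T @ X
    return np.linalg.solve(XtX + lam * np.eye(X.shape[1]), X.T @ y)

def select_lambda(X, y, lambdas, k=4):
    # Pick the ridge coefficient with the lowest k-fold cross-validation error
    n = X.shape[0]
    folds = np.array_split(np.arange(n), k)
    best_lam, best_err = None, np.inf
    for lam in lambdas:
        err = 0.0
        for fold in folds:
            train = np.ones(n, dtype=bool)
            train[fold] = False  # hold this fold out for validation
            w = ridge_weights(X[train], y[train], lam)
            resid = y[fold] - X[fold] @ w
            err += float((resid ** 2).sum())
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam

# Synthetic data: 40 samples, intercept plus 3 features, known true weights
rng = np.random.default_rng(0)
X = np.concatenate([np.ones((40, 1)), rng.normal(size=(40, 3))], axis=1)
true_w = np.array([[1.0], [2.0], [-1.0], [0.5]])
y = X @ true_w + 0.1 * rng.normal(size=(40, 1))

best = select_lambda(X, y, [0.01, 0.1, 0.2, 1.0, 10.0])
print('best lambda:', best)
```

With low-noise data like this, small ridge coefficients tend to win; on small, highly collinear datasets such as longley.csv, a larger coefficient can do better, which is why trying several values matters.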
3. Running Results
Weights ws:
[[ 7.38107538e-04]
[ 2.07703836e-01]
[ 2.10076376e-02]
[ 5.05385441e-03]
[-1.59173066e+00]
[ 1.10442920e-01]
[-2.42280461e-01]]
Predicted values:
[[ 83.55075226]
[ 86.92588689]
[ 88.09720228]
[ 90.95677622]
[ 96.06951002]
[ 97.81955375]
[ 98.36444357]
[ 99.99814266]
[103.26832266]
[105.03165135]
[107.45224671]
[109.52190685]
[112.91863666]
[113.98357055]
[115.29845063]
[117.64279933]]
True values:
[[ 83. ]
[ 88.5]
[ 88.2]
[ 89.5]
[ 96.2]
[ 98.1]
[ 99. ]
[100. ]
[101.2]
[104.6]
[108.4]
[110.8]
[112.6]
[114.2]
[115.7]
[116.9]]
Residuals (true minus predicted):
[[-5.50752262e-01]
[ 1.57411311e+00]
[ 1.02797725e-01]
[-1.45677622e+00]
[ 1.30489977e-01]
[ 2.80446252e-01]
[ 6.35556429e-01]
[ 1.85733868e-03]
[-2.06832266e+00]
[-4.31651346e-01]
[ 9.47753293e-01]
[ 1.27809315e+00]
[-3.18636658e-01]
[ 2.16429451e-01]
[ 4.01549370e-01]
[-7.42799331e-01]]
The small residuals show that the model fits the data well.
4. Data Download
Link: https://pan.baidu.com/s/14xi9nAW4DyY3mWFp_GTb0w
Extraction code: 3kf3