Machine Learning: Implementing Linear Regression in Python

Original

2018-08-25 22:04

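1. Principles

Steps of linear regression:
1. Load the data.
2. Split the data into a training set and a test set
(for linear regression: x_train, x_test, y_train, y_test).
3. Apply the linear regression algorithm:
use the training set to compute the model parameters.
4. Validate the model:
use the test set to measure the difference between the true and predicted values
(compute y_predict from x_test, compare it with y_test, and compute the error).
5. Print the results.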
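The five steps above can be sketched end to end as follows. This is only a minimal illustration under stated assumptions: the data is synthetic (y = 4 + 3x plus noise, not from the original post), and NumPy's least-squares solver np.linalg.lstsq stands in for the fitting step.

```python
import numpy as np

# Step 1: load the data (synthetic here, an assumption: y = 4 + 3x + noise)
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=(100, 1))
y = 4.0 + 3.0 * x[:, 0] + rng.normal(scale=0.5, size=100)

# Step 2: split into training and test sets (last 20 samples held out)
x_train, x_test = x[:-20], x[-20:]
y_train, y_test = y[:-20], y[-20:]

# Step 3: compute the model parameters from the training set
A_train = np.column_stack([np.ones(len(x_train)), x_train])  # add intercept column
w, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Step 4: predict on the test set and measure the error
A_test = np.column_stack([np.ones(len(x_test)), x_test])
y_predict = A_test.dot(w)
mse = np.mean((y_test - y_predict) ** 2)

# Step 5: print the results
print("parameters (intercept, slope):", w)
print("test MSE:", mse)
```

The recovered parameters should land close to the intercept 4 and slope 3 used to generate the data.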

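The model is hθ(x) = θ^T x, where:
hθ(x) is the variable to predict (in the example, the loan amount)
θ is the parameter vector (it reflects how strongly each independent variable influences the result)
x is the independent variable (in the example, salary and age)
Note: x and θ are both vectors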
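Because x and θ are both vectors, a prediction is just a dot product. The sketch below uses made-up numbers: the parameter values, the applicant's salary and age, and the leading 1 for an intercept term are all assumptions for illustration.

```python
import numpy as np

# Hypothetical parameters: intercept, weight for salary, weight for age
theta = np.array([1000.0, 2.0, 50.0])

# One applicant: the leading 1 multiplies the intercept theta[0]
x = np.array([1.0, 4000.0, 25.0])  # [constant term, salary, age]

# h_theta(x) = theta^T x
h = theta.dot(x)
print(h)  # 1000 + 2*4000 + 50*25 = 10250.0
```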

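First we need to use some training data to tune the parameters, that is, to determine the value of θ.
How do we determine it?
Introduce ε(i) to denote the error term of the i-th sample.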

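Analysis (the errors ε(i) are assumed to be independent and identically distributed, following a Gaussian distribution):
1. Independent and identically distributed: each applicant's salary and age are independent of everyone else's, while the bank applies the same lending criteria to all of them.
2. Gaussian distribution: ε(i) is usually not large, it is distributed symmetrically about 0, and the closer ε(i) is to 0, the higher its probability.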

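ε(i) = y(i) - θ^T x(i)
Every sample i affects the result, and the samples are independent, so the likelihood L(θ) is the product of the probabilities of all samples.
The smaller ε(i) is, the closer the fit and the higher the probability, so in the end we need to maximize L(θ).

A product is awkward to work with, so take the logarithm.
Maximizing log L(θ) then turns into minimizing J(θ), the sum of squared errors (up to a constant factor).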
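The chain from the Gaussian error assumption to J(θ) can be written out as below. This is a reconstruction of the standard derivation rather than the original post's formulas: the symbols σ (standard deviation of the error) and m (number of samples), and the conventional factor 1/2 in J(θ), are assumptions.

```latex
\begin{aligned}
p\left(\epsilon^{(i)}\right) &= \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\left(\epsilon^{(i)}\right)^2}{2\sigma^2}\right) \\
L(\theta) &= \prod_{i=1}^{m}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\left(y^{(i)}-\theta^{T}x^{(i)}\right)^2}{2\sigma^2}\right) \\
\log L(\theta) &= m\log\frac{1}{\sqrt{2\pi}\,\sigma}-\frac{1}{\sigma^2}\cdot\frac{1}{2}\sum_{i=1}^{m}\left(y^{(i)}-\theta^{T}x^{(i)}\right)^2 \\
J(\theta) &= \frac{1}{2}\sum_{i=1}^{m}\left(y^{(i)}-\theta^{T}x^{(i)}\right)^2
\end{aligned}
```

The first term of log L(θ) does not depend on θ, so maximizing log L(θ) amounts to minimizing J(θ).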

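Setting the gradient of J(θ) to zero gives its minimum.
Explanation of the derivation:
1. Xθ - Y is a column vector. A sum of squares can be written as the transpose of a vector times the vector itself, so J(θ) is proportional to (Xθ - Y)^T (Xθ - Y).
2. When A is a symmetric matrix, ∇θ (θ^T Aθ) = 2Aθ.
3. Setting the gradient to zero gives θ = (X^T X)^(-1) X^T Y, which in Python is: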

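import numpy as np
# X is the design matrix (one row per sample) and Y the vector of targets
# np.linalg.inv computes the matrix inverse
X_ = np.linalg.inv(X.T.dot(X))
# X.T is the transpose; A.dot(B) is the matrix product
theta = X_.dot(X.T).dot(Y)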
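As a sanity check on the closed-form solution, the sketch below applies it to synthetic data and confirms that the gradient X^T (Xθ - Y) vanishes at the result. The data, the coefficients in true_theta, and the comparison against np.linalg.lstsq are assumptions added for illustration.

```python
import numpy as np

# Synthetic data (an assumption): y = 3 + 2*x1 - x2 + small noise
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
X = np.insert(features, 0, 1, axis=1)  # prepend a column of ones for the intercept
true_theta = np.array([3.0, 2.0, -1.0])
Y = X.dot(true_theta) + rng.normal(scale=0.01, size=200)

# Closed-form solution: theta = (X^T X)^(-1) X^T Y
theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(Y)

# The gradient of J, X^T (X theta - Y), should be (numerically) zero at the minimum
gradient = X.T.dot(X.dot(theta) - Y)

# Cross-check against NumPy's least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print("theta:", theta)
print("max |gradient|:", np.abs(gradient).max())
print("matches lstsq:", np.allclose(theta, theta_lstsq))
```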

2. Code Implementation

The full implementation is:

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets

class LinearRegression():
    def __init__(self):  # initialize the weights
        self.w = None

    def fit(self, X, y):  # fit on the training set
        X = np.insert(X, 0, 1, axis=1)  # prepend a column of ones for the bias term
        print(X.shape)
        X_ = np.linalg.inv(X.T.dot(X))  # closed-form solution
        self.w = X_.dot(X.T).dot(y)

    def predict(self, X):  # predict on the test set
        # h_theta(x) = theta^T x, computed for all rows at once as X.dot(w)
        # Insert constant ones for bias weights
        X = np.insert(X, 0, 1, axis=1)
        y_pred = X.dot(self.w)
        return y_pred

def mean_squared_error(y_true, y_pred):
    # mean of the squared differences between the true and predicted values
    mse = np.mean(np.power(y_true - y_pred, 2))
    return mse

def main():
    # Step 1: load the data
    # Load the diabetes dataset
    diabetes = datasets.load_diabetes()

    # Use only one feature
    X = diabetes.data[:, np.newaxis, 2]
    print(X.shape)

    # Step 2: split the data into a training set and a test set
    # Split the data into training/testing sets
    x_train, x_test = X[:-20], X[-20:]

    # Split the targets into training/testing sets
    y_train, y_test = diabetes.target[:-20], diabetes.target[-20:]

    # Step 3: use the LinearRegression class defined above
    clf = LinearRegression()
    clf.fit(x_train, y_train)  # train
    y_pred = clf.predict(x_test)  # predict

    # Step 4: compute the test error with mean_squared_error
    # Print the mean squared error
    print("Mean Squared Error:", mean_squared_error(y_test, y_pred))

    # Visualize with matplotlib
    # Plot the results
    plt.scatter(x_test[:, 0], y_test, color='black')  # scatter plot of the true values
    plt.plot(x_test[:, 0], y_pred, color='blue', linewidth=3)  # fitted line
    plt.show()


if __name__ == "__main__":
    main()
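One caveat about the fit method above: it inverts X^T X explicitly, which breaks down when features are collinear and X^T X is singular. A possible variant, sketched here as an assumption rather than as part of the original implementation, swaps the inverse for the pseudo-inverse np.linalg.pinv. The helper name fit_pinv and the toy data are hypothetical.

```python
import numpy as np

def fit_pinv(X, y):
    """Least-squares weights via the pseudo-inverse (hypothetical helper)."""
    X = np.insert(X, 0, 1, axis=1)  # prepend the bias column
    return np.linalg.pinv(X, rcond=1e-10).dot(y)

# Two perfectly collinear features: X^T X is singular, so a plain inverse is unreliable
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])  # y = 1 + 2 * (first feature)

w = fit_pinv(X, y)
y_fit = np.insert(X, 0, 1, axis=1).dot(w)
print("weights:", w)
print("fitted values:", y_fit)
```

The individual weights are not unique in this situation (the pseudo-inverse picks the minimum-norm solution), but the fitted values still reproduce y.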
