Getting Started with Machine Learning — Single-Variable Linear Regression via Gradient Descent (Jupyter notebook code)

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
%matplotlib inline

Load the dataset: each row is one training example, with input x and target y.

In [2]:
data = pd.read_table('ex1data1.txt', header=None, names=['x', 'y'], delimiter=',')
print(data.head())
data.describe()  # automatically computes mean, std, min/max, quartiles for x and y
        x        y
0  6.1101  17.5920
1  5.5277   9.1302
2  8.5186  13.6620
3  7.0032  11.8540
4  5.8598   6.8233
Out[2]:
               x          y
count  97.000000  97.000000
mean    8.159800   5.839135
std     3.869884   5.510262
min     5.026900  -2.680700
25%     5.707700   1.986900
50%     6.589400   4.562300
75%     8.578100   7.046700
max    22.203000  24.147000

Extract the training data: inputs and labels.

In [3]:
X = data.loc[:, 'x']
Y = data.loc[:, 'y']
one = np.ones(X.shape)
X1 = np.array([one, X])  # (2, m) design matrix: a row of ones stacked on the x values
#X1 = np.insert(X, 0, values=one, axis=0)
print(X1.shape)
print(Y.shape)
(2, 97)
(97,)

Iteratively compute theta (the parameters of the linear model) by gradient descent on the squared-error loss. The hypothesis is y = theta1 + theta2*x.

In [4]:
def theta_cal(X, Y, a, num):
    # batch gradient descent: a is the learning rate, num the iteration count
    theta1 = 0
    theta2 = 0
    res = np.array([0, 0])
    for count in range(num):
        sum1 = 0
        sum2 = 0
        # accumulate the gradient over all training examples
        for i in range(X.shape[0]):
            sum1 += theta1 + theta2*X[i] - Y[i]
            sum2 += (theta1 + theta2*X[i] - Y[i])*X[i]
        # simultaneous update of both parameters
        theta1 = theta1 - a*sum1/X.shape[0]
        theta2 = theta2 - a*sum2/X.shape[0]
        temp = np.array([theta1, theta2])
        res = np.vstack((res, temp))  # keep the full theta history
    return res
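The double loop above can also be written with NumPy vector operations over the (2, m) design matrix `X1` built earlier. A minimal sketch (the name `theta_cal_vec` and the choice to pass `X1` instead of the raw column are mine, not from the original notebook):

```python
import numpy as np

def theta_cal_vec(X1, Y, a, num):
    # X1: (2, m) design matrix (row of ones stacked on the x values)
    # Y: (m,) targets; a: learning rate; num: iteration count
    m = Y.shape[0]
    theta = np.zeros(2)
    res = [theta.copy()]
    for _ in range(num):
        errors = theta.dot(X1) - Y               # (m,) residuals under the current theta
        theta = theta - a * X1.dot(errors) / m   # simultaneous gradient step on both parameters
        res.append(theta.copy())
    return np.array(res)
```

`X1.dot(errors)` collapses the inner `for i` loop: its first component is `sum1` (the ones row) and its second is `sum2` (the x row).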

Compute the loss value for each theta obtained.

In [7]:
def lost_fun(theta, X, Y):
    # X here is the (2, m) design matrix X1, so X.shape[0] == 2:
    # the returned value is half the sum of squared errors, not the mean
    res = []
    for i in range(theta.shape[0]):
        lost = (theta[i].T).dot(X) - Y.T  # residuals for the i-th theta
        lost = lost**2
        lost = lost.sum()/X.shape[0]
        res.append(lost)
    return res
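The per-theta loop can likewise be vectorized by broadcasting the whole theta history against the design matrix at once. A sketch under the same shapes as above (`lost_fun_vec` is a hypothetical name):

```python
import numpy as np

def lost_fun_vec(theta_hist, X1, Y):
    # theta_hist: (k, 2) theta history; X1: (2, m) design matrix; Y: (m,) targets
    errors = theta_hist.dot(X1) - Y                  # (k, m): residuals for every stored theta
    return (errors ** 2).sum(axis=1) / X1.shape[0]   # divide by 2, matching lost_fun's divisor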
In [13]:
a = theta_cal(X, Y, 0.01, 3000)
print(a)
[[ 0.          0.        ]
 [ 0.05839135  0.6532885 ]
 [ 0.06289175  0.77000978]
 ..., 
 [-3.87798708  1.19124606]
 [-3.87801916  1.19124929]
 [-3.87805118  1.1912525 ]]

Compute and plot the loss at each gradient-descent iteration.

In [14]:
lost = lost_fun(a, X1, Y)
lost = np.array(lost)
x = np.arange(0, lost.shape[0], 1)
plt.axis([0, x.shape[0], 430, 600])
plt.plot(x, lost)
plt.show()

Plot the data scatter together with the successively fitted lines.

In [39]:
plt.scatter(X, Y)
x = np.linspace(0, 25, 25)
# the first ~10 gradient-descent iterations already produce a reasonably good fit
for i in range(10):
    plt.plot(x, a[i, 0] + a[i, 1]*x)
plt.show()
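As a sanity check (not part of the original notebook), the closed-form least-squares solution can be computed with `np.linalg.lstsq`; for this dataset it should land near the final gradient-descent theta printed above (about [-3.878, 1.191]):

```python
import numpy as np

def normal_equation(X1, Y):
    # X1: (2, m) design matrix as built earlier; lstsq expects samples in rows
    theta, *_ = np.linalg.lstsq(X1.T, Y, rcond=None)
    return theta
```

Comparing this against the last row of the theta history is a quick way to confirm the learning rate and iteration count were sufficient.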
