【6. Gradient Descent】

6-1 what is gradient descent

The direction in which the loss function J decreases is the direction of the negative derivative.

η is the learning rate; it controls how quickly the optimal solution is approached.
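As a minimal sketch of the update itself (the toy loss below matches the one used in 6-2; the η values are arbitrary choices for illustration): a single step moves θ against the derivative, and a larger η takes a larger step toward the minimum at θ = 2.5.

[code]

def dJ(theta):  # derivative of the toy loss (theta - 2.5)**2 - 1
    return 2 * (theta - 2.5)

theta = 0.
for eta in (0.01, 0.1, 0.5):
    new_theta = theta - eta * dJ(theta)  # one gradient-descent step
    print("eta =", eta, ": theta", theta, "->", new_theta)
# a larger eta moves theta further per step, but an eta that is too large
# can overshoot the minimum or even diverge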

6-2 gradient descent simulation

[Idea] In gradient descent, the derivative at the current point tells us the direction in which the function increases or decreases; stepping against it guarantees J(θ - η·dJ/dθ) <= J(θ) for a suitably small η.

[code]

import numpy as np
import matplotlib.pyplot as plt
plot_x = np.linspace(-1., 6., 500)
plot_y = (plot_x - 2.5)**2 - 1  # a simple loss function to minimize
def dJ(theta):  # derivative of the loss
    return 2*(theta - 2.5)
def J(theta):  # value of the loss
    try:
        return (theta - 2.5)**2 - 1
    except:
        return float('inf')  # return infinity if the value overflows
def plot_theta_history():  # plot the path that theta took
    plt.plot(plot_x, J(plot_x))  # the loss curve J over theta
    plt.plot(np.array(theta_history), J(np.array(theta_history)), color='r', marker='+')  # the visited theta values and their losses
    plt.show()
def gradient_descent(initial_theta, eta, n_iters=1e4, epsilon=1e-8):  # initial point initial_theta, learning rate eta, max iterations n_iters, epsilon = threshold on the change of J between iterations
    theta = initial_theta
    i_iters = 0
    theta_history.append(initial_theta)
    j_history.append(J(initial_theta))
    while i_iters < n_iters:
        gradient = dJ(theta)
        last_theta = theta
        theta = theta - eta * gradient  # gradient descent update: theta -= eta * dJ/dtheta, so that J(theta - eta*dJ/dtheta) <= J(theta)
        theta_history.append(theta)
        j_history.append(J(theta))
        if(abs(J(theta) - J(last_theta)) < epsilon):  # stop once J barely changes between iterations
            break
        i_iters += 1
eta = 0.01  # learning rate
theta = 0.  # initial point
theta_history = []
j_history = []
gradient_descent(theta, eta)
plot_theta_history()
print(theta_history)
print(j_history)
print(j_history[-3],j_history[-4],"the last but 2 difference value:",j_history[-4]-j_history[-3])
print(j_history[-1],j_history[-2],"the last difference value:",j_history[-2]-j_history[-1])
print(theta_history[-1])
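The n_iters cap and the try/except in J that returns float('inf') guard against a learning rate that is too large. A quick way to see the failure mode (the value eta = 1.1 and the small n_iters here are just illustrative choices):

[code]

theta_history = []
j_history = []
gradient_descent(0., 1.1, n_iters=10)  # overly large learning rate
print(theta_history)  # theta oscillates around 2.5 with growing amplitude instead of converging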

6-3 multivariate linear regression
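The notes for this section are empty in the original. As a hedged placeholder, here is a sketch of how the same loop generalizes to multivariate linear regression, assuming an MSE loss; the names X_b (feature matrix with a leading column of ones), y, and the toy data are my own assumptions, not from the source.

[code]

import numpy as np

def J(theta, X_b, y):  # assumed MSE loss
    try:
        return np.sum((y - X_b.dot(theta)) ** 2) / len(y)
    except:
        return float('inf')

def dJ(theta, X_b, y):  # gradient of the MSE loss with respect to theta
    return X_b.T.dot(X_b.dot(theta) - y) * 2. / len(y)

def gradient_descent(X_b, y, initial_theta, eta, n_iters=1e4, epsilon=1e-8):
    theta = initial_theta
    i_iters = 0
    while i_iters < n_iters:
        gradient = dJ(theta, X_b, y)
        last_theta = theta
        theta = theta - eta * gradient  # same update rule as the 1-D version
        if abs(J(theta, X_b, y) - J(last_theta, X_b, y)) < epsilon:
            break
        i_iters += 1
    return theta

# toy data: y = 3x + 4 plus noise (made-up example)
np.random.seed(666)
x = 2 * np.random.random(size=100)
y = x * 3. + 4. + np.random.normal(size=100)
X_b = np.hstack([np.ones((len(x), 1)), x.reshape(-1, 1)])
theta = gradient_descent(X_b, y, np.zeros(X_b.shape[1]), eta=0.01)
print(theta)  # roughly [4., 3.] up to noise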

