6-1 what is gradient descent
The direction in which the loss function J decreases is the negative direction of its derivative.
η is the learning rate; it controls how quickly (or whether) the optimal solution is reached.
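Written out as the update rule that the code below implements (θ is the current parameter, η the learning rate):
θ_next = θ − η · dJ/dθ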
6-2 gradient descent simulation
[Idea] In gradient descent, the derivative at the current point tells us whether the function is increasing or decreasing there; stepping against it guarantees J(θ − η·dJ/dθ) <= J(θ) for a small enough η.
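A quick one-step check with the concrete function used in the code below: J(θ) = (θ − 2.5)² − 1, so dJ/dθ = 2(θ − 2.5). Starting at θ = 0 with η = 0.01, the derivative is −5, so one step gives θ = 0 − 0.01·(−5) = 0.05, and J drops from 5.25 to about 5.0025.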
[code]
import numpy as np
import matplotlib.pyplot as plt

plot_x = np.linspace(-1., 6., 500)
plot_y = (plot_x - 2.5)**2 - 1  # the function we want to minimize

def dJ(theta):  # derivative of J
    return 2 * (theta - 2.5)

def J(theta):  # value of the loss function
    try:
        return (theta - 2.5)**2 - 1
    except:
        return float('inf')  # return infinity if theta has blown up (overflow)

def plot_theta_history():  # plot J and the path taken by theta
    plt.plot(plot_x, J(plot_x))  # the curve of J over theta
    plt.plot(np.array(theta_history), J(np.array(theta_history)), color='r', marker='+')  # visited theta values and their J values
    plt.show()

def gradient_descent(initial_theta, eta, n_iters=1e4, epsilon=1e-8):
    # initial_theta: starting point; eta: learning rate;
    # n_iters: maximum number of iterations; epsilon: stop once the change in J is smaller than this
    theta = initial_theta
    i_iters = 0
    theta_history.append(initial_theta)
    j_history.append(J(initial_theta))
    while i_iters < n_iters:
        gradient = dJ(theta)
        last_theta = theta
        theta = theta - eta * gradient  # gradient descent update: theta - eta*dJ/dtheta, so J(new theta) <= J(theta)
        theta_history.append(theta)
        j_history.append(J(theta))
        if abs(J(theta) - J(last_theta)) < epsilon:  # convergence check
            break
        i_iters += 1

eta = 0.01  # learning rate
theta = 0.  # initial point (passed to gradient_descent explicitly below)
theta_history = []
j_history = []
gradient_descent(0., eta)
plot_theta_history()
print(theta_history)
print(j_history)
print(j_history[-3], j_history[-4], "second-to-last change in J:", j_history[-4] - j_history[-3])
print(j_history[-1], j_history[-2], "last change in J:", j_history[-2] - j_history[-1])
print(theta_history[-1])
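A small follow-up experiment (not in the original notes; it reuses the functions and global lists defined above) to see the effect of the learning rate mentioned in 6-1: a moderately large η still converges, while an η that is too large makes θ oscillate away from the minimum.
[code]
# Sketch with assumed eta values: rerun gradient_descent with different learning rates.
theta_history = []
j_history = []
gradient_descent(0., 0.8)  # larger eta: still converges, in far fewer steps than eta=0.01
print(len(theta_history), theta_history[-1])

theta_history = []
j_history = []
gradient_descent(0., 1.1, n_iters=10)  # eta too large: theta oscillates and diverges, capped by n_iters
print(theta_history)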
6-3 multiple linear regression