- Review of the basic principles of logistic regression
- The sigmoid function
- Maximum likelihood and the loss function
- Newton's method
- Experimental steps and procedure
- First, load the data and draw a scatter plot of the raw data.
From the plot we can see that the lower-left region contains mostly negative samples while the upper-right contains mostly positive ones, so the decision boundary should be roughly a straight line with negative slope.
- Define the hypothesis function:
$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$
Here the sigmoid g is written as an anonymous function (since inline functions are deprecated in MATLAB).
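For comparison, the same element-wise sigmoid can be written as a one-line lambda in Python (a sketch, assuming NumPy):

```python
import numpy as np

# Sketch: the Python analogue of MATLAB's g = @(z) 1.0./(1.0+exp(-z)).
# np.exp works element-wise, so g accepts scalars or whole arrays alike.
g = lambda z: 1.0 / (1.0 + np.exp(-z))

print(g(0.0))                           # 0.5 exactly at the origin
print(g(np.array([-2.0, 0.0, 2.0])))    # element-wise; note g(-z) == 1 - g(z)
```

The identity g(-z) = 1 - g(z) is what the Hessian computation below relies on.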
- Define the loss function and the number of iterations.
Loss function:
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\Bigl[-y^{(i)}\log h_\theta(x^{(i)}) - \bigl(1-y^{(i)}\bigr)\log\bigl(1-h_\theta(x^{(i)})\bigr)\Bigr]$$
Parameter update rule (Newton's method):
$$\theta := \theta - H^{-1}\nabla_\theta J$$
Gradient and Hessian:
$$\nabla_\theta J = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x^{(i)}, \qquad H = \frac{1}{m}\sum_{i=1}^{m} h_\theta(x^{(i)})\bigl(1-h_\theta(x^{(i)})\bigr)\,x^{(i)}\bigl(x^{(i)}\bigr)^T$$
Note: theta is initialized to all zeros. The iteration count was initially set to 20; after observing the loss curve we adjusted it to a suitable value, finding that the cost has essentially converged after about 6 iterations.
- Computation: iterate to obtain theta.
theta = zeros(n+1, 1);      % theta is 3x1
iteration = 8;
J = zeros(iteration, 1);
for i = 1:iteration
    z = x*theta;            % x is 80x3, theta 3x1, so z is an 80x1 column vector
    J(i) = (1/m)*sum(-y.*log(g(z)) - (1-y).*log(1-g(z)));
    H = (1/m).*x'*(diag(g(z))*diag(g(-z))*x);   % 3x3 Hessian; note g(-z) = 1 - g(z)
    delta = (1/m).*x'*(g(z) - y);               % gradient
    theta = theta - H\delta;                    % Newton step (H\delta avoids inv(H)*delta)
end
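The Newton loop above can be sketched in NumPy as well. The original ex4x.dat/ex4y.dat files are not reproduced here, so a small synthetic data set serves as a hypothetical stand-in; the fitted theta will therefore not match the values reported below, but the convergence behavior is the same.

```python
import numpy as np

# NumPy sketch of the Newton loop, on synthetic stand-in data
# (assumption: the real ex4x.dat/ex4y.dat are unavailable here).
rng = np.random.default_rng(0)
m = 80
scores = rng.uniform(20, 65, size=(m, 2))             # two exam scores
p = 1.0 / (1.0 + np.exp(-(scores.sum(axis=1) - 85) / 10))
y = (rng.random(m) < p).astype(float)                 # noisy admit labels
x = np.hstack([np.ones((m, 1)), scores])              # prepend intercept column

g = lambda z: 1.0 / (1.0 + np.exp(-z))
theta = np.zeros(3)
J = []
for _ in range(8):
    h = g(x @ theta)
    J.append(np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h)))
    grad = x.T @ (h - y) / m                          # gradient of J
    H = (x.T * (h * (1 - h))) @ x / m                 # 3x3 Hessian
    theta -= np.linalg.solve(H, grad)                 # Newton step, like H\delta

print(theta)
print(J)  # the cost flattens out after only a few iterations
```

Using `np.linalg.solve(H, grad)` mirrors MATLAB's `H\delta`: both solve the linear system instead of explicitly inverting the Hessian.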
The computed theta is:
theta = 3×1
-16.3787
0.1483
0.1589
The fitted decision boundary plotted over the training data is shown below.
The loss function value versus the number of iterations:
Prediction:
The probability that an applicant with an Exam 1 score of 20 and an Exam 2 score of 80 is not admitted:
prob = 1 - g([1, 20, 80]*theta)
This evaluates to:
prob = 0.6680
That is, the probability of not being admitted is 0.6680.
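This prediction step can be reproduced in Python with the theta reported above (rounded to four decimals, so the result matches 0.6680 only up to rounding):

```python
import numpy as np

# Reproducing the prediction with the reported (rounded) theta.
g = lambda z: 1.0 / (1.0 + np.exp(-z))
theta = np.array([-16.3787, 0.1483, 0.1589])

# P(not admitted | Exam1 = 20, Exam2 = 80) = 1 - h_theta(x)
prob = 1.0 - g(np.array([1.0, 20.0, 80.0]) @ theta)
print(prob)  # approximately 0.668
```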
MATLAB source code
clc,clear
x = load('ex4x.dat');
y = load('ex4y.dat');
[m, n] = size(x);
x = [ones(m, 1), x];%增加一列
% find returns the indices of the
% rows meeting the specified condition
pos = find(y == 1); neg = find(y == 0);
% Assume the features are in the 2nd and 3rd
% columns of x
plot(x(pos, 2), x(pos,3), '+'); hold on
plot(x(neg, 2), x(neg, 3), 'o')
xlim([14.8 64.8])
ylim([40.2 90.2])
legend({'Admitted','Not Admitted'})
xlabel('Exam1 score')
ylabel('Exam2 score')
title('Training data')
g = @(z) 1.0./(1.0+exp(-z));  % sigmoid as an anonymous function
% Usage: To find the value of the sigmoid
% evaluated at 2, call g(2)
theta = zeros(n+1, 1);      % theta is 3x1
iteration = 8;
J = zeros(iteration, 1);
for i = 1:iteration
    z = x*theta;            % x is 80x3, theta 3x1, so z is an 80x1 column vector
    J(i) = (1/m)*sum(-y.*log(g(z)) - (1-y).*log(1-g(z)));
    H = (1/m).*x'*(diag(g(z))*diag(g(-z))*x);   % 3x3 Hessian; note g(-z) = 1 - g(z)
    delta = (1/m).*x'*(g(z) - y);               % gradient
    theta = theta - H\delta;                    % Newton step (H\delta avoids inv(H)*delta)
end
% Plot Newton's method result
% Only need 2 points to define a line, so choose two endpoints
plot_x = [min(x(:,2))-2, max(x(:,2))+2];
% Calculate the decision boundary line
plot_y = (-1./theta(3)).*(theta(2).*plot_x +theta(1));
plot(plot_x, plot_y)
legend('Admitted', 'Not Admitted', 'Decision Boundary')
hold off
figure
plot(0:iteration-1, J, 'o--', 'MarkerFaceColor', 'r', 'MarkerSize', 5)
xlabel('Iteration'); ylabel('J')
xlim([0.00 7.00])
ylim([0.400 0.700])
title('Cost J versus iteration')
prob = 1 - g([1, 20, 80]*theta)