UFLDL Tutorial - Supervised Learning and Optimization

Original

如若明镜

2020-02-20 19:07

UFLDL Tutorial

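The starter code can be downloaded in one go from here (GitHub repository). Note that some of the data has to be downloaded separately. For example, the PCA exercise needs the MNIST dataset, which is available from THE MNIST DATABASE.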

Table of Contents
Supervised Learning and Optimization
[Linear Regression](http://ufldl.stanford.edu/tutorial/supervised/LinearRegression/)
[Logistic Regression](http://ufldl.stanford.edu/tutorial/supervised/LogisticRegression/)
[Vectorization](http://ufldl.stanford.edu/tutorial/supervised/Vectorization/)
[Debugging: Gradient Checking](http://ufldl.stanford.edu/tutorial/supervised/DebuggingGradientChecking/)
[Softmax Regression](http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/)

Supervised Learning and Optimization

Linear Regression

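Exercise 1A: predict house prices with linear regression. Only the objective function and its gradient need to be filled in; the formulas are given on the original page:

Completed code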

linear_regression.m

%%% YOUR CODE HERE %%%
% Accumulate the objective and its gradient over the m training examples.
for j = 1:m
    r = theta'*X(:,j) - y(j);   % residual for example j
    f = f + r^2;                % squared error
    g = g + X(:,j)*r;           % gradient contribution of example j
end
f = f/2;
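
For reference, the code above evaluates the least-squares objective and its gradient. Written out in the notation used later in this post (the superscript $(j)$ for the $j$-th training example is my own notation and corresponds to column `X(:,j)` and entry `y(j)`):

$$
J(\theta)=\frac{1}{2}\sum_{j=1}^{m}\left(\theta^Tx^{(j)}-y^{(j)}\right)^2,
\qquad
\nabla_{\theta}J(\theta)=\sum_{j=1}^{m}x^{(j)}\left(\theta^Tx^{(j)}-y^{(j)}\right)
$$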

Results

As the original page states, the training and testing errors are typically between $4.5$ and $5$. My results:

Optimization took 1.780584 seconds.
RMS training error: 4.731236
RMS testing error: 4.584099
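
The RMS figures above are root-mean-square prediction errors. Below is a minimal sketch of that computation on made-up numbers; the variable names `predicted` and `actual` are illustrative and are not taken from the exercise script.

```matlab
% Synthetic example: four predicted and actual prices (made-up values).
predicted = [21.0 30.5 18.0 25.0];
actual    = [20.0 32.5 18.0 24.0];

% Root-mean-square error: square the residuals, average them, take the root.
rms_error = sqrt(mean((predicted - actual).^2));
fprintf('RMS error: %f\n', rms_error);
```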

Logistic Regression

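The linear regression above has three characteristics:

it predicts a continuous value (the house price);
its output is a linear function of the input ( $y=h_{\theta}(x)=\theta^Tx$ );
its cost function is the squared error.

Logistic Regression:

it predicts a discrete value and is usually used for classification;
its output is a nonlinear function of the input (the sigmoid, or logistic, function: $y=h_{\theta}(x)=\sigma(\theta^Tx)$ , $\sigma(z)=\frac{1}{1+\exp(-z)}$ );
its cost function is the cross-entropy (derived from a probabilistic model by maximum likelihood; see the CS229 Notes).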
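
To make the sigmoid concrete, here is a small sketch showing how it maps real values into the interval (0, 1) and how thresholding at 0.5 turns the output into a class label. The anonymous function stands in for the `sigmoid` helper used by the exercise code, and the input values are made up.

```matlab
% Logistic (sigmoid) function, applied element-wise.
sigmoid = @(z) 1 ./ (1 + exp(-z));

z = [-5 0 5];            % example values of theta'*x
h = sigmoid(z);          % h(2) is exactly 0.5; h(1) is near 0, h(3) is near 1
labels = h > 0.5;        % threshold at 0.5 to get a 0/1 class prediction
disp(h);
disp(labels);
```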

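Exercise 1B: logistic classification of handwritten digits. Only the objective function and its gradient need to be filled in; the formulas are given on the original page and derived in the CS229 Notes:

Completed code
The code is essentially the same as for linear regression, except that the hypothesis is now $y=h_{\theta}(x)=\sigma(\theta^Tx)$ , where $\sigma(z)=\frac{1}{1+\exp(-z)}$ is the sigmoid function, and is no longer a linear function.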

%%% YOUR CODE HERE %%%

%Compute the logistic regression objective and its gradient
for j = 1:m
    coItem = sigmoid(theta'*X(:,j));
    f = f - y(j)*log(coItem) - (1-y(j))*log(1-coItem);
    g = g + X(:,j)*(coItem-y(j));
end
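
For reference, the loop above accumulates the cross-entropy objective and its gradient. With $h_{\theta}(x)=\sigma(\theta^Tx)$ and the superscript $(j)$ denoting the $j$-th training example (my notation, matching `X(:,j)` and `y(j)` in the code):

$$
J(\theta)=-\sum_{j=1}^{m}\left[y^{(j)}\log h_{\theta}(x^{(j)})+\left(1-y^{(j)}\right)\log\left(1-h_{\theta}(x^{(j)})\right)\right],
\qquad
\nabla_{\theta}J(\theta)=\sum_{j=1}^{m}x^{(j)}\left(h_{\theta}(x^{(j)})-y^{(j)}\right)
$$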

Results

As the original page states, the final training and test accuracy are both 100%. My results:

Optimization took 15.115248 seconds.
Training accuracy: 100.0%
Test accuracy: 100.0%
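
Accuracy here is the fraction of examples whose thresholded prediction matches the label. Below is a minimal sketch on synthetic data; the values of `X`, `y` and `theta` are made up for illustration and are not a trained model.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Synthetic data: a row of ones (intercept) plus one feature, four examples.
X = [1 1 1 1; -2 -1 1 2];
y = [0 0 1 1];
theta = [0; 3];          % made-up parameters, not a trained model

predictions = sigmoid(theta'*X) > 0.5;   % 1-by-4 logical row
accuracy = mean(predictions == y);       % fraction of correct predictions
fprintf('Accuracy: %2.1f%%\n', 100*accuracy);
```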

Vectorization

Completed code

Uncomment the following lines in ex1a_linreg.m and ex1b_logreg.m:
ex1a_linreg.m

% theta = rand(n,1);
% tic;
% theta = minFunc(@linear_regression_vec, theta, options, train.X, train.y);
% fprintf('Optimization took %f seconds.\n', toc);

ex1b_logreg.m

% theta = rand(n,1)*0.001;
% tic;
% theta=minFunc(@logistic_regression_vec, theta, options, train.X, train.y);
% fprintf('Optimization took %f seconds.\n', toc);

linear_regression_vec.m

%%% YOUR CODE HERE %%%
r = theta'*X - y;    % 1-by-m row of residuals
f = (r*r') / 2;      % half the sum of squared residuals
g = X*r';            % n-by-1 gradient

logistic_regression_vec.m

%%% YOUR CODE HERE %%%
coItem = sigmoid(theta'*X);                    % 1-by-m row of predicted probabilities
f = -log(coItem)*y' - log(1-coItem)*(1-y)';    % cross-entropy objective
g = X*(coItem-y)';                             % n-by-1 gradient
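
A quick way to gain confidence in a vectorized rewrite is to compare it with the loop version on a small problem. The sketch below does this for the logistic regression objective, assuming `y` is a row vector as in the exercise; the data are synthetic and the anonymous `sigmoid` stands in for the helper used above.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Small synthetic problem: 3 features x 5 examples (made-up values).
X = [1 1 1 1 1; 0.5 -1.2 0.3 2.0 -0.7; 1.5 0.4 -0.9 0.1 0.8];
y = [1 0 0 1 1];
theta = [0.1; -0.2; 0.3];
m = size(X, 2);

% Loop version, one example at a time.
f_loop = 0;
g_loop = zeros(size(theta));
for j = 1:m
    h = sigmoid(theta'*X(:,j));
    f_loop = f_loop - y(j)*log(h) - (1-y(j))*log(1-h);
    g_loop = g_loop + X(:,j)*(h - y(j));
end

% Vectorized version, all examples at once.
h_all = sigmoid(theta'*X);
f_vec = -log(h_all)*y' - log(1-h_all)*(1-y)';
g_vec = X*(h_all - y)';

fprintf('objective difference: %g\n', abs(f_loop - f_vec));
fprintf('gradient difference:  %g\n', max(abs(g_loop - g_vec)));
```

Both differences should be at the level of floating-point rounding.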

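Results

Training is considerably faster, as shown below. The examples are shuffled randomly before the train/test split, so the errors differ from run to run.

Linear regression:

Optimization took 0.032485 seconds. (about 0.3 s before vectorization)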
RMS training error: 4.023758
RMS testing error: 6.783703

Logistic classification:

Optimization took 3.419164 seconds. (about 12 s before vectorization)
Training accuracy: 100.0%
Test accuracy: 100.0%

Debugging: Gradient Checking

Completed code
The script below runs the gradient check for the linear regression and logistic classification exercises above in one go:
grad_check_demo.m

%% for linear regression

% Load housing data from file.
data = load('housing.data');
data=data'; % put examples in columns

% Include a row of 1s as an additional intercept feature.
data = [ ones(1,size(data,2)); data ];

% Shuffle examples.
data = data(:, randperm(size(data,2)));

% Split into train and test sets
% The last row of 'data' is the median home price.
train.X = data(1:end-1,1:400);
train.y = data(end,1:400);

test.X = data(1:end-1,401:end);
test.y = data(end,401:end);

m=size(train.X,2);
n=size(train.X,1);

% Initialize the coefficient vector theta to random values.
theta0 = rand(n,1);

num_checks = 20;
% without vectorize
average_error = grad_check(@linear_regression, theta0, num_checks, train.X, train.y)
% vectorize
average_error = grad_check(@linear_regression_vec, theta0, num_checks, train.X, train.y)

%% for Logistic Classification
binary_digits = true;
[train,test] = ex1_load_mnist(binary_digits);

% Add row of 1s to the dataset to act as an intercept term.
train.X = [ones(1,size(train.X,2)); train.X]; 
test.X = [ones(1,size(test.X,2)); test.X];

% Training set dimensions
m=size(train.X,2);
n=size(train.X,1);

% Train logistic regression classifier using minFunc
options = struct('MaxIter', 100);

% First, we initialize theta to some small random values.
theta0 = rand(n,1)*0.001;

num_checks = 20;
% without vectorize
average_error = grad_check(@logistic_regression, theta0, num_checks, train.X, train.y)
% vectorize
average_error = grad_check(@logistic_regression_vec, theta0, num_checks, train.X, train.y)

Results

The average errors over 20 checks are:

1.7030e-05 (linear)
1.2627e-05 (linear_vec)
6.0687e-06 (Logistic)
8.1527e-06 (Logistic_vec)
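
The `grad_check` function called above comes with the tutorial's starter code and is not reproduced in this post. The idea behind a gradient check can be sketched as follows: compare the analytic gradient with a central finite-difference estimate, one coordinate at a time. This is my own illustration on a toy least-squares problem; the names `objective`, `gradient` and `epsilon` are assumptions, and the starter code's implementation may differ in detail.

```matlab
% Toy least-squares problem (synthetic data, not the housing set).
X = [1 1 1 1; 0 1 2 3];      % 2 features (incl. intercept) x 4 examples
y = [1 3 5 7];
theta = [0.5; -0.2];

objective = @(t) 0.5 * sum((t'*X - y).^2);   % J(theta)
gradient  = @(t) X * (t'*X - y)';            % analytic gradient of J

epsilon = 1e-4;
g_analytic = gradient(theta);
g_numeric = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta));
    e(i) = epsilon;                          % perturb only coordinate i
    g_numeric(i) = (objective(theta + e) - objective(theta - e)) / (2*epsilon);
end

max_abs_error = max(abs(g_numeric - g_analytic));
fprintf('max abs error: %g\n', max_abs_error);
```

A small error suggests the analytic gradient is consistent with the objective; a large one usually points to a bug in one of the two.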

Softmax Regression

Multi-class classification; a generalization of logistic regression.
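
As a rough illustration of that generalization, the sketch below computes softmax probabilities column by column and checks that, with two classes and the second class score fixed at 0, the first class probability equals the sigmoid of the first score. The variable names and score values are made up; this is not the exercise solution.

```matlab
% Column-wise softmax; subtracting the column maximum avoids overflow in exp.
% bsxfun is used so the sketch also runs on older MATLAB/Octave versions.
scores = [2.0 -1.0 0.5; 0 0 0];                  % 2 classes x 3 examples
shifted = bsxfun(@minus, scores, max(scores, [], 1));
expd = exp(shifted);
P = bsxfun(@rdivide, expd, sum(expd, 1));        % each column sums to 1

% With two classes and the second score fixed at 0, the first row equals
% the sigmoid of the first score.
sigmoid = @(z) 1 ./ (1 + exp(-z));
difference = max(abs(P(1,:) - sigmoid(scores(1,:))));
fprintf('max difference from sigmoid: %g\n', difference);
```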
