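The original code can be downloaded in one go from here (GitHub repository). Note that some data must be downloaded separately. For example, the PCA exercise needs the MNIST dataset, which is available from THE MNIST DATABASE.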
Table of Contents
- Supervised Learning and Optimization
- [Linear Regression](http://ufldl.stanford.edu/tutorial/supervised/LinearRegression/)
- [Logistic Regression](http://ufldl.stanford.edu/tutorial/supervised/LogisticRegression/)
- [Vectorization](http://ufldl.stanford.edu/tutorial/supervised/Vectorization/)
- [Debugging: Gradient Checking](http://ufldl.stanford.edu/tutorial/supervised/DebuggingGradientChecking/)
- [Softmax Regression](http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/)
Supervised Learning and Optimization
Linear Regression
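Exercise 1A: linear regression for predicting house prices. Only the objective function and its gradient need to be filled in; the formulas are given on the original page:
Code to fill in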
linear_regression.m
%%% YOUR CODE HERE %%%
% Transpose theta to a row vector so theta*X(:,j) is a scalar prediction
theta = theta';
% Compute the linear regression objective
for j = 1:m
    f = f + (theta*X(:,j) - y(j))^2;
end
f = f/2;
% Compute the gradient of the objective
for j = 1:m
    g = g + X(:,j)*(theta*X(:,j) - y(j));
end
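For reference, the two loops above correspond to the standard least-squares objective and its gradient. The notation is assumed here to match the code: $x^{(j)}$ is the $j$-th column of `X`, $y^{(j)}$ is the $j$-th target, and $m$ is the number of training examples.

```latex
J(\theta) = \frac{1}{2} \sum_{j=1}^{m} \left( \theta^\top x^{(j)} - y^{(j)} \right)^2,
\qquad
\nabla_\theta J(\theta) = \sum_{j=1}^{m} x^{(j)} \left( \theta^\top x^{(j)} - y^{(j)} \right)
```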
Results
As the original page notes, the RMS training and testing errors usually fall within the range given there; my results:
Optimization took 1.780584 seconds.
RMS training error: 4.731236
RMS testing error: 4.584099
Logistic Regression
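The linear regression above has three characteristics:
- it predicts a continuous value (the house price);
- the output is a linear function of the input ($h_\theta(x) = \theta^\top x$);
- the cost function is the squared error.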
Logistic Regression:
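- it predicts a discrete value and is usually used for classification;
- the output is a nonlinear function of the input (the sigmoid, or logistic, function: $h_\theta(x) = \sigma(\theta^\top x)$, $\sigma(z) = \frac{1}{1 + e^{-z}}$);
- the cost function is the cross-entropy (derived from a probabilistic model by maximum likelihood; see the CS229 Notes).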
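As a quick illustration of the sigmoid described above, here is a minimal sketch. The name `sigmoid_fn` is hypothetical, chosen so it does not clash with the `sigmoid` helper that the exercise code calls, and the sample inputs are arbitrary.

```matlab
% Minimal sketch of the logistic (sigmoid) function.
% `sigmoid_fn` is a hypothetical name, not part of the exercise code.
sigmoid_fn = @(z) 1 ./ (1 + exp(-z));

z = [-10, 0, 10];      % arbitrary sample inputs
h = sigmoid_fn(z);     % every output lies strictly between 0 and 1

% sigmoid(0) is 0.5, and sigmoid(-z) equals 1 - sigmoid(z)
disp(h);
```

Because the output stays in (0, 1), it can be read as the probability of the positive class, which is what the cross-entropy cost relies on.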
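Exercise 1B: logistic classification of handwritten digits. Only the objective function and its gradient need to be filled in; the formulas are given on the original page and derived in the CS229 Notes:
Code to fill in
This is essentially the same as linear regression, except that the hypothesis $h_\theta(x)$ is now the sigmoid function rather than a linear function.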
%%% YOUR CODE HERE %%%
% Compute the logistic regression objective and its gradient
for j = 1:m
    coItem = sigmoid(theta'*X(:,j));   % predicted probability for example j
    f = f - y(j)*log(coItem) - (1-y(j))*log(1-coItem);
    g = g + X(:,j)*(coItem-y(j));
end
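For reference, the loop above accumulates the cross-entropy objective and its gradient. The notation is assumed here to match the code: $h_\theta(x) = \sigma(\theta^\top x)$ plays the role of `coItem`, and $x^{(j)}$ is the $j$-th column of `X`.

```latex
J(\theta) = -\sum_{j=1}^{m} \left[ y^{(j)} \log h_\theta(x^{(j)}) + \left(1 - y^{(j)}\right) \log\left(1 - h_\theta(x^{(j)})\right) \right],
\qquad
\nabla_\theta J(\theta) = \sum_{j=1}^{m} x^{(j)} \left( h_\theta(x^{(j)}) - y^{(j)} \right)
```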
Results
As the original page states, both training and test accuracy end up at 100%; my results:
Optimization took 15.115248 seconds.
Training accuracy: 100.0%
Test accuracy: 100.0%
Vectorization
Code to fill in
The following lines in ex1a_linreg.m and ex1b_logreg.m need to be uncommented:
ex1a_linreg.m
% theta = rand(n,1);
% tic;
% theta = minFunc(@linear_regression_vec, theta, options, train.X, train.y);
% fprintf('Optimization took %f seconds.\n', toc);
ex1b_logreg.m
% theta = rand(n,1)*0.001;
% tic;
% theta=minFunc(@logistic_regression_vec, theta, options, train.X, train.y);
% fprintf('Optimization took %f seconds.\n', toc);
linear_regression_vec.m
%%% YOUR CODE HERE %%%
f = (norm(theta'*X - y))^2 / 2;
g = X*(theta'*X-y)';
logistic_regression_vec.m
%%% YOUR CODE HERE %%%
coItem = sigmoid(theta'*X);
f = -log(coItem)*y' -log(1-coItem)*(1-y)';
g = X*(coItem-y)';
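One way to gain confidence in a vectorized rewrite is to compare it against the loop version on small random data. The sketch below does this for the linear regression case; the sizes `n` and `m` and the random data are arbitrary assumptions, and the arithmetic is copied from the two snippets rather than calling the exercise files.

```matlab
% Compare loop and vectorized linear regression on random data.
% Sizes and data are arbitrary assumptions for illustration.
n = 5; m = 50;
X = rand(n, m);        % examples in columns, as in the exercise
y = rand(1, m);        % row vector of targets
theta = rand(n, 1);

% Loop version (same arithmetic as linear_regression.m)
f_loop = 0;
g_loop = zeros(n, 1);
for j = 1:m
    r = theta' * X(:, j) - y(j);     % residual for example j
    f_loop = f_loop + r^2;
    g_loop = g_loop + X(:, j) * r;
end
f_loop = f_loop / 2;

% Vectorized version (same expressions as linear_regression_vec.m)
res = theta' * X - y;                % 1-by-m row of residuals
f_vec = norm(res)^2 / 2;
g_vec = X * res';

fprintf('objective diff: %g, gradient diff: %g\n', ...
        abs(f_loop - f_vec), norm(g_loop - g_vec));
```

Both differences should be at the level of floating-point rounding; anything larger suggests a transposition or indexing slip in the vectorized form.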
Results
Considerably faster, as shown below:
Linear regression:
Optimization took 0.032485 seconds. (about 0.3 s before vectorization)
RMS training error: 4.023758
RMS testing error: 6.783703
Logistic classification:
Optimization took 3.419164 seconds. (about 12 s before vectorization)
Training accuracy: 100.0%
Test accuracy: 100.0%
Debugging: Gradient Checking
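Code to fill in
Below is the gradient-checking code, which runs through the linear regression and logistic classification exercises above in turn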
grad_check_demo.m
%% for linear regression
% Load housing data from file.
data = load('housing.data');
data=data'; % put examples in columns
% Include a row of 1s as an additional intercept feature.
data = [ ones(1,size(data,2)); data ];
% Shuffle examples.
data = data(:, randperm(size(data,2)));
% Split into train and test sets
% The last row of 'data' is the median home price.
train.X = data(1:end-1,1:400);
train.y = data(end,1:400);
test.X = data(1:end-1,401:end);
test.y = data(end,401:end);
m=size(train.X,2);
n=size(train.X,1);
% Initialize the coefficient vector theta to random values.
theta0 = rand(n,1);
num_checks = 20;
% without vectorize
average_error = grad_check(@linear_regression, theta0, num_checks, train.X, train.y)
% vectorize
average_error = grad_check(@linear_regression_vec, theta0, num_checks, train.X, train.y)
%% for Logistic Classification
binary_digits = true;
[train,test] = ex1_load_mnist(binary_digits);
% Add row of 1s to the dataset to act as an intercept term.
train.X = [ones(1,size(train.X,2)); train.X];
test.X = [ones(1,size(test.X,2)); test.X];
% Training set dimensions
m=size(train.X,2);
n=size(train.X,1);
% Train logistic regression classifier using minFunc
options = struct('MaxIter', 100);
% First, we initialize theta to some small random values.
theta0 = rand(n,1)*0.001;
num_checks = 20;
% without vectorize
average_error = grad_check(@logistic_regression, theta0, num_checks, train.X, train.y)
% vectorize
average_error = grad_check(@logistic_regression_vec, theta0, num_checks, train.X, train.y)
Results
The average errors over 20 checks are:
1.7030e-05(linear)
1.2627e-05(linear_vec)
6.0687e-06(Logistic)
8.1527e-06(Logistic_vec)
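To show the kind of computation behind these average errors, here is a minimal sketch of a central-difference gradient check on a toy objective. It is not the `grad_check.m` shipped with the exercises: the toy objective, the step size `delta`, and the way coordinates are sampled are all assumptions made for illustration.

```matlab
% Central-difference gradient check on a toy objective (illustrative only).
% Toy objective with a known gradient: f(t) = 0.5*||t||^2, gradient = t.
f = @(t) 0.5 * sum(t .^ 2);
g = @(t) t;

theta0 = rand(10, 1);
num_checks = 20;
delta = 1e-4;                          % finite-difference step (assumed value)
total_err = 0;
for k = 1:num_checks
    j = randi(numel(theta0));          % pick a random coordinate to test
    e = zeros(size(theta0));
    e(j) = 1;                          % unit vector along coordinate j
    g_est = (f(theta0 + delta * e) - f(theta0 - delta * e)) / (2 * delta);
    g_all = g(theta0);                 % analytic gradient
    total_err = total_err + abs(g_all(j) - g_est);
end
average_error = total_err / num_checks
```

A small average error indicates that the analytic gradient agrees with the numerical estimate at the sampled coordinates; a large one usually points to a bug in the gradient code.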
Softmax Regression
Multi-class classification; a generalization of logistic regression.
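The sense in which softmax generalizes logistic regression can be checked numerically: with two classes and the second class's parameters fixed at zero, the softmax probability of the first class equals the sigmoid output. The sketch below assumes an n-by-K parameter matrix `theta` and uses the hypothetical helper names `softmax_fn` and `sigmoid_fn`; none of these come from the exercise code.

```matlab
% Softmax with two classes reduces to the sigmoid (illustrative sketch).
% `softmax_fn` and `sigmoid_fn` are hypothetical helper names.
softmax_fn = @(z) exp(z - max(z)) ./ sum(exp(z - max(z)));   % subtract max for stability
sigmoid_fn = @(z) 1 ./ (1 + exp(-z));

n = 4;
x = rand(n, 1);                  % one input example
w = rand(n, 1);                  % parameters of class 1

% Two classes, with the second class's parameters fixed at zero
theta = [w, zeros(n, 1)];        % n-by-2 parameter matrix (assumed layout)
p = softmax_fn(theta' * x);      % 2-by-1 vector of class probabilities

% p(1) matches the logistic-regression probability sigmoid(w' * x)
fprintf('softmax: %.6f, sigmoid: %.6f\n', p(1), sigmoid_fn(w' * x));
```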