This section builds a handwritten-digit classifier using softmax regression; the main difficulty lies in writing the cost function and gradient in vectorized form.
STEP 2: Implement softmaxCost
function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)
% numClasses - the number of classes
% inputSize - the size N of the input vector
% lambda - weight decay parameter
% data - the N x M input matrix, where each column data(:, i) corresponds to
% a single test set
% labels - an M x 1 matrix containing the labels corresponding for the input data
%
% Unroll the parameters from theta
theta = reshape(theta, numClasses, inputSize);
numCases = size(data, 2);
groundTruth = full(sparse(labels, 1:numCases, 1));
cost = 0;
thetagrad = zeros(numClasses, inputSize);
%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute the cost and gradient for softmax regression.
% You need to compute thetagrad and cost.
% The groundTruth matrix might come in handy.
m = theta * data;                      % class scores, numClasses x numCases
m = bsxfun(@minus, m, max(m,[],1));    % subtract each column's max for numerical stability
m = exp(m);
m = bsxfun(@rdivide, m, sum(m));       % normalize each column: softmax probabilities
cost = -sum(sum(groundTruth .* log(m)))/size(data,2) + lambda/2*sum(sum(theta.^2));  % cross-entropy + weight decay
thetagrad = -(groundTruth - m)*data'/size(data,2) + lambda*theta;
% ------------------------------------------------------------------
% Unroll the gradient matrices into a vector for minFunc
grad = [thetagrad(:)];
end
Here groundTruth is a numClasses x numCases matrix whose rows index classes and whose columns index samples: if sample j belongs to class i, then groundTruth(i,j) = 1 and groundTruth(x,j) = 0 for every x ≠ i.
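The Matlab one-liner `full(sparse(labels, 1:numCases, 1))` builds this one-hot matrix in a single step. As a sketch, here is a NumPy transcription of the same construction (the function name `ground_truth` is mine, not part of the exercise):

```python
import numpy as np

def ground_truth(labels, num_classes):
    """NumPy equivalent of full(sparse(labels, 1:numCases, 1)).
    labels: 1-based class labels, shape (numCases,)."""
    num_cases = labels.shape[0]
    G = np.zeros((num_classes, num_cases))
    # Row (labels-1) of column j is set to 1: shift 1-based labels to 0-based rows.
    G[labels - 1, np.arange(num_cases)] = 1
    return G

labels = np.array([2, 1, 3, 2])
G = ground_truth(labels, 3)  # one 1 per column, in the row of that sample's class
```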
From the cost function

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y^{(i)}=j\}\,\log\frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} + \frac{\lambda}{2}\sum_{i=1}^{k}\sum_{j=1}^{n}\theta_{ij}^2$$

it is easy to see that the entry (j, i) of groundTruth .* log(m) is exactly $1\{y^{(i)}=j\}\,\log p(y^{(i)}=j \mid x^{(i)};\theta)$, so summing all entries and negating gives the data term of the cost. As for thetagrad, the gradient formula is

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[x^{(i)}\big(1\{y^{(i)}=j\} - p(y^{(i)}=j \mid x^{(i)};\theta)\big)\Big] + \lambda\theta_j.$$

The product -(groundTruth - m)*data' yields a matrix of the same size as theta, and its j-th row is this gradient (as a row vector, before dividing by the number of samples and adding the weight-decay term). Since groundTruth - m has size numClasses x numCases and data has size inputSize x numCases, multiplying by data' automatically sums over all samples (picture how matrix multiplication works and this becomes clear).
STEP 3: Gradient checking
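Gradient checking compares the analytic gradient against a centered finite-difference approximation, $\frac{f(\theta + \epsilon e_i) - f(\theta - \epsilon e_i)}{2\epsilon}$. A generic sketch of the idea (the function name `numerical_gradient` is illustrative, not the exercise's `computeNumericalGradient`):

```python
import numpy as np

def numerical_gradient(f, theta, eps=1e-4):
    """Centered-difference approximation of the gradient of f at theta."""
    grad = np.zeros_like(theta)
    for idx in np.ndindex(*theta.shape):
        old = theta[idx]
        theta[idx] = old + eps
        fp = f(theta)                  # f evaluated with one entry nudged up
        theta[idx] = old - eps
        fm = f(theta)                  # ... and nudged down
        theta[idx] = old               # restore the original value
        grad[idx] = (fp - fm) / (2 * eps)
    return grad

# Sanity check on f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
theta = np.array([[1.0, -2.0], [0.5, 3.0]])
g = numerical_gradient(lambda t: 0.5 * np.sum(t ** 2), theta)
```

If the analytic and numerical gradients differ by more than roughly 1e-6 per entry, there is usually a bug in the cost or gradient code.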
STEP 4 & 5: Learning parameters && Testing
softmaxPredict.m
function [pred] = softmaxPredict(softmaxModel, data)
% softmaxModel - model trained using softmaxTrain
% data - the N x M input matrix, where each column data(:, i) corresponds to
% a single test set
%
% Your code should produce the prediction matrix
% pred, where pred(i) is argmax_c P(y(c) | x(i)).
% Unroll the parameters from theta
theta = softmaxModel.optTheta; % this provides a numClasses x inputSize matrix
pred = zeros(1, size(data, 2));
%% ---------- YOUR CODE HERE --------------------------------------
% Instructions: Compute pred using theta assuming that the labels start
% from 1.
m = theta * data;    % class scores; exponentiation and normalization preserve the argmax
[~,pred] = max(m);   % row index of each column's maximum is the 1-based predicted label
% ---------------------------------------------------------------------
end
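Prediction needs no softmax normalization at all: the exponential is monotonic and the column sums are shared across classes, so the argmax of the raw scores `theta * data` already gives the label. A NumPy sketch of the same idea (the function name `softmax_predict` is mine), with `+ 1` to match Matlab's 1-based labels:

```python
import numpy as np

def softmax_predict(theta, data):
    """theta: (numClasses, inputSize); data: (inputSize, numCases).
    Returns 1-based predicted labels, matching the Matlab convention."""
    scores = theta @ data              # softmax is monotonic in the scores,
    return scores.argmax(axis=0) + 1   # so the per-column argmax suffices
```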
Finally, run the provided training and testing code; remember to set DEBUG to false.