Training a deep autoencoder or a classifier on MNIST digits: RBM training (Matlab)

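These are the notes I took on my first read-through of the Matlab version of the RBM program. There is still a lot in it that I have not understood, and I hope to study, investigate, and discuss it together with fellow bloggers so that we can all improve.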

1. RBM reading material

http://en.wikipedia.org/wiki/Restricted_Boltzmann_machine

http://deeplearning.net/tutorial/rbm.html

2. Basic principles of RBM training
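
The update rules that the listing in section 3 implements can be summarized as follows. This is a sketch read off from the code, not a derivation. The symbols are assumptions introduced here: $W$ stands for `vishid`, $a$ for `visbiases`, $b$ for `hidbiases`, $\epsilon_w, \epsilon_{vb}, \epsilon_{hb}$ for the three learning rates, $m$ for `momentum`, and $\lambda$ for `weightcost`. $\langle\cdot\rangle_0$ is an average over the cases of one batch using the data, and $\langle\cdot\rangle_1$ is the same average using the one-step reconstruction.

```latex
\begin{aligned}
p(h_j = 1 \mid v) &= \sigma\Big(b_j + \sum_i v_i W_{ij}\Big), \qquad
p(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big), \qquad
\sigma(x) = \frac{1}{1 + e^{-x}} \\
\Delta W &\leftarrow m\,\Delta W + \epsilon_w \Big( \langle v h^\top \rangle_0 - \langle v h^\top \rangle_1 - \lambda W \Big) \\
\Delta a &\leftarrow m\,\Delta a + \epsilon_{vb} \big( \langle v \rangle_0 - \langle v \rangle_1 \big) \\
\Delta b &\leftarrow m\,\Delta b + \epsilon_{hb} \big( \langle h \rangle_0 - \langle h \rangle_1 \big) \\
W &\leftarrow W + \Delta W, \qquad a \leftarrow a + \Delta a, \qquad b \leftarrow b + \Delta b
\end{aligned}
```

In the code, the $h$ inside the averages is the hidden probability rather than a sampled state, and the reconstruction $v$ in $\langle\cdot\rangle_1$ is likewise a probability. Only the hidden states that start the reconstruction are sampled to binary values.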

3. RBM code analysis

% Version 1.000 
%
% Code provided by Geoff Hinton and Ruslan Salakhutdinov 
%
% Permission is granted for anyone to copy, use, modify, or distribute this
% program and accompanying programs and documents for any purpose, provided
% this copyright notice is retained and prominently displayed, along with
% a note saying that the original programs are available from our
% web page.
% The programs and documents are distributed without any warranty, express or
% implied.  As the programs were written for research purposes only, they have
% not been tested to the degree that would be advisable in any important
% application.  All use of these programs is entirely at the user's own risk.

% This program trains Restricted Boltzmann Machine in which
% visible, binary, stochastic pixels are connected to
% hidden, binary, stochastic feature detectors using symmetrically
% weighted connections. Learning is done with 1-step Contrastive Divergence.
% (Each update uses a single Gibbs step, i.e. CD-1.)
% The program assumes that the following variables are set externally:
% maxepoch  -- maximum number of epochs
%              (the outer training loop below runs up to this epoch)
% numhid    -- number of hidden units 
% batchdata -- the data that is divided into batches (numcases numdims numbatches)
%              (a 3-D array: cases per batch x feature dimensions x number of batches)
% restart   -- set to 1 if learning starts from beginning 
%              (when it is 1, the weights and biases are re-initialized and epoch is
%               reset to 1; otherwise training continues from the values already
%               in the workspace)

epsilonw      = 0.1;   % Learning rate for weights 
epsilonvb     = 0.1;   % Learning rate for biases of visible units
epsilonhb     = 0.1;   % Learning rate for biases of hidden units 
weightcost  = 0.0002;  % Weight-decay coefficient (penalty on vishid in the weight update)
initialmomentum  = 0.5;% Momentum used for epochs 1 to 5
finalmomentum    = 0.9;% Momentum used from epoch 6 onward

[numcases numdims numbatches]=size(batchdata);
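% There are three output variables on the left-hand side, so batchdata is
% three-dimensional; the third dimension is the number of batches.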

if restart ==1,
   restart=0;
   epoch=1;

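% Initializing symmetric weights and biases.
  vishid     = 0.1*randn(numdims, numhid);
  % When programming, allocate initial matrices for the variables before using them.
  % Weight matrix between the visible and hidden layers: one row per input
  % dimension (numdims), one column per hidden unit (numhid).
  hidbiases  = zeros(1,numhid);  % hidden biases, one per hidden unit
  
  visbiases  = zeros(1,numdims); % visible biases, one per visible unit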

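  poshidprobs = zeros(numcases,numhid);
  % "pos" = positive phase (driven by the data), "probs" = probabilities,
  % numcases = number of cases in one batch.
  % poshidprobs holds P(h=1 | data) for every case and every hidden unit.
  neghidprobs = zeros(numcases,numhid);
  % neghidprobs holds P(h=1 | reconstruction), computed in the negative phase.
  posprods    = zeros(numdims,numhid);
  % posprods holds data' * poshidprobs, the batch sum used for <vh>0.
  % It is a statistic with the shape of the weight matrix, not trained weights.
  negprods    = zeros(numdims,numhid);
  % negprods holds negdata' * neghidprobs, the batch sum used for <vh>1.
  vishidinc  = zeros(numdims,numhid);
  % "inc" = increment: the update added to vishid, carried over between steps by momentum.
  hidbiasinc = zeros(1,numhid);
  % hidbiasinc is the corresponding increment for the hidden biases.
  visbiasinc = zeros(1,numdims);
  % visbiasinc is the corresponding increment for the visible biases.
  batchposhidprobs=zeros(numcases,numhid,numbatches);
  % batchposhidprobs keeps poshidprobs for every batch; presumably these hidden
  % activations serve as the training data for the next layer of the stack.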
end

for epoch = epoch:maxepoch,
 fprintf(1,'epoch %d\r',epoch); 
 errsum=0;
 for batch = 1:numbatches,
 fprintf(1,'epoch %d batch %d\r',epoch,batch); 

%%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
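% Compute the hidden probabilities given the data and accumulate <vh>0, the
% data-driven (positive) statistic; hence START POSITIVE PHASE.
% For an autoencoder, this should correspond to the encoding step.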
  data = batchdata(:,:,batch);
  poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1)));    
  batchposhidprobs(:,:,batch)=poshidprobs;
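  % batchposhidprobs stores, for every case, the probability that each hidden
  % unit is in state 1; for the first layer that is 100*1000 values per batch.
  posprods    = data' * poshidprobs;
  poshidact   = sum(poshidprobs);% sum of the hidden-unit probabilities over the 100 cases of the batch
  posvisact = sum(data);% sum of each input feature over the 100 cases of the batch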

%%%%%%%%% END OF POSITIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  poshidstates = poshidprobs > rand(numcases,numhid);
% Sample binary hidden states: a unit is set to 1 when its probability exceeds the
% corresponding entry of a uniform random matrix, and to 0 otherwise.

%%%%%%%%% START NEGATIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%    %%%%%%%%%%%%%%%%%%%%%%%%%%%%
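% Reconstruct the visible layer from the sampled hidden states, recompute the hidden
% probabilities, and accumulate <vh>1, the negative statistic; hence START NEGATIVE PHASE.
% For an autoencoder, this should correspond to the decoding step.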
  negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));
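  % One Gibbs step h0 -> v1 -> h1: the sampled binary states poshidstates start the
  % Markov chain, negdata is P(v=1 | h0) (probabilities, not sampled states),
  % and neghidprobs is P(h=1 | v1).
  neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1)));    
  negprods  = negdata'*neghidprobs;% reconstructed visible values times hidden probabilities, summed over the batch: the <vh>1 statistic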
  neghidact = sum(neghidprobs);
  negvisact = sum(negdata); 

%%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  err= sum(sum( (data-negdata).^2 ));
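  % h0->v1: squared difference between the original input and the visible data
  % reconstructed from the sampled hidden states, i.e. the reconstruction error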
  errsum = err + errsum;

   if epoch>5,
     momentum=finalmomentum;
   else
     momentum=initialmomentum;
   end;
% The momentum depends on the epoch: initialmomentum up to epoch 5, finalmomentum afterwards
%%%%%%%%% UPDATE WEIGHTS AND BIASES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% With <vh>0 and <vh>1 computed, the weight increments can now be evaluated
    vishidinc = momentum*vishidinc + ...
                epsilonw*( (posprods-negprods)/numcases - weightcost*vishid);
    visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact);
    hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact);

    vishid = vishid + vishidinc;
    visbiases = visbiases + visbiasinc;
    hidbiases = hidbiases + hidbiasinc;

%%%%%%%%%%%%%%%% END OF UPDATES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

  end
  fprintf(1, 'epoch %4i error %6.1f  \n', epoch, errsum); 
  % Finally, report errsum, the summed reconstruction error, for each epoch
end;
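
The variables that the listing expects to be set externally can be prepared with a short driver script. What follows is only a sketch under stated assumptions: the data matrix `X` is synthetic, the sizes and the `maxepoch` value are illustrative, and `rbm.m` is a hypothetical file name for the listing above, which is why the call is guarded.

```matlab
% Hypothetical driver script. X, the sizes, and the file name rbm.m are
% assumptions made for illustration; they are not part of the listing above.
numcases   = 100;   % cases per batch
numdims    = 784;   % visible units (28x28 pixels for an MNIST-sized input)
numbatches = 5;     % number of batches

% Synthetic binary "images", one row per case.
X = double(rand(numcases*numbatches, numdims) > 0.5);

% Pack the rows of X into a 3-D array so that batchdata(:,:,b) is batch b.
batchdata = zeros(numcases, numdims, numbatches);
order = randperm(numcases*numbatches);          % shuffle cases across batches
for b = 1:numbatches
  idx = order((b-1)*numcases+1 : b*numcases);   % row indices for batch b
  batchdata(:,:,b) = X(idx,:);
end

maxepoch = 10;     % train for at most 10 epochs
numhid   = 1000;   % number of hidden units
restart  = 1;      % 1 = initialize weights and start from epoch 1

% Run the training script only if it is on the path under the assumed name.
if exist('rbm','file') == 2
  rbm;
end
```

Leaving `restart` at 0 on a later call would skip the initialization block, so training would continue from the `vishid`, `hidbiases`, `visbiases`, and `epoch` already in the workspace.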
