UFLDL Tutorial Exercise Answers 1 (Sparse Autoencoder and Vectorized Implementation)

Original post

2020-02-23 07:13

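I recently wanted to look into deep learning, so I started with the UFLDL (Unsupervised Feature Learning and Deep Learning) tutorial. I am posting my answers to the exercises here as notes.

Notes: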

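1: The autoencoder is an unsupervised learning algorithm that tries to learn h_{W,b}(x) ≈ x, so the output layer has the same number of units as the input layer. The hidden layer in between can be large. Adding a sparsity penalty then pushes the activations of most hidden units close to 0, which makes the learned representation compact, somewhat like PCA. Training uses backpropagation, which I won't cover here; see the tutorial.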
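For reference, the objective minimized in the exercise can be written as follows in the tutorial's notation. Here m is the number of training examples, s_2 the number of hidden units, ρ the target sparsity (sparsityParam in the code) and ρ̂_j the average activation of hidden unit j; the three terms correspond to the reconstruction error, the weight decay and the sparsity penalty.

```latex
J_{\text{sparse}}(W,b)
  = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\left\| h_{W,b}(x^{(i)}) - x^{(i)} \right\|^2
  + \frac{\lambda}{2}\sum_{l=1}^{2}\sum_{i}\sum_{j}\left(W^{(l)}_{ji}\right)^2
  + \beta\sum_{j=1}^{s_2}\mathrm{KL}\left(\rho \,\|\, \hat\rho_j\right)

\mathrm{KL}\left(\rho \,\|\, \hat\rho_j\right)
  = \rho\log\frac{\rho}{\hat\rho_j} + (1-\rho)\log\frac{1-\rho}{1-\hat\rho_j},
\qquad
\hat\rho_j = \frac{1}{m}\sum_{i=1}^{m} a^{(2)}_j\left(x^{(i)}\right)
```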

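2: Visualizing the autoencoder. The exercise visualizes W1, the first-layer weights being learned. I didn't really understand this at first. After some thought: the input is the individual pixels of an image patch, so each hidden unit computes something like a_1^(2) = f(w_11*x_1 + w_12*x_2 + w_13*x_3 + …), which means each row of W1 holds one weight per input pixel. I still don't fully understand it, so I'll keep studying the later sections and come back to this.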
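One way to read the W1 visualization, following the tutorial's section on visualizing a trained autoencoder: among inputs with bounded norm (‖x‖ ≤ 1), the one that maximally activates hidden unit i is the normalized i-th row of W1. Drawing each row of W1 as an image therefore shows the pattern that unit responds to most strongly. The bias is left out of the second formula because it does not change the maximizer.

```latex
a^{(2)}_i = f\Big(\sum_{j} W^{(1)}_{ij}\, x_j + b^{(1)}_i\Big),
\qquad
x_j^{\star} = \frac{W^{(1)}_{ij}}{\sqrt{\sum_{j'} \big(W^{(1)}_{ij'}\big)^2}}
```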

Exercise answers:

1: Sparse autoencoder

Step 1: In sampleIMAGES.m, fill in the code that generates the training set. tic and toc are only there for timing.

tic
image_size=size(IMAGES);
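i=randi(image_size(1)-patchsize+1,1,numpatches);   % 1 x numpatches random row offsets in [1, image_size(1)-patchsize+1]
j=randi(image_size(2)-patchsize+1,1,numpatches);   % random column offsets, same range logic
k=randi(image_size(3),1,numpatches);               % randomly pick the source image for each patch (10000 draws)
for num=1:numpatches
        % cut one patchsize x patchsize block and store it as a column
        patches(:,num)=reshape(IMAGES(i(num):i(num)+patchsize-1,j(num):j(num)+patchsize-1,k(num)),patchsize*patchsize,1);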
end
toc
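As a quick sanity check of the sampling logic, here is a minimal sketch that runs the same steps on a random array standing in for IMAGES. The array size, patch count and variable names (rows, cols, imgs) are assumptions chosen to keep the example small; the exercise itself loads IMAGES.mat.

```matlab
% Stand-in data: 3 random "images" of 32x32 (an assumption for illustration).
IMAGES = rand(32, 32, 3);
patchsize = 8;
numpatches = 50;
patches = zeros(patchsize*patchsize, numpatches);

rows = randi(size(IMAGES,1) - patchsize + 1, 1, numpatches);  % top-left row of each patch
cols = randi(size(IMAGES,2) - patchsize + 1, 1, numpatches);  % top-left column of each patch
imgs = randi(size(IMAGES,3), 1, numpatches);                  % source image of each patch

for n = 1:numpatches
    block = IMAGES(rows(n):rows(n)+patchsize-1, cols(n):cols(n)+patchsize-1, imgs(n));
    patches(:, n) = block(:);   % column-major flatten, same element order as reshape
end

% Reshaping a column back must reproduce the block it was cut from.
n = 1;
restored = reshape(patches(:, n), patchsize, patchsize);
source = IMAGES(rows(n):rows(n)+patchsize-1, cols(n):cols(n)+patchsize-1, imgs(n));
assert(isequal(restored, source));
```

Because the largest offset is size minus patchsize plus 1, the last index of every patch stays inside the image.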

Step 2: In sparseAutoencoderCost.m, implement forward propagation, backpropagation and the related code.

%1.forward propagation
data_size=size(data);           % [64, 10000]
active_value2=repmat(b1,1,data_size(2));    % replicate b1 across 10000 columns: 25*10000
active_value3=repmat(b2,1,data_size(2));    % replicate b2 across 10000 columns: 64*10000
active_value2=sigmoid(W1*data+active_value2);  % hidden activations for all examples, 25*10000, one column per example
active_value3=sigmoid(W2*active_value2+active_value3);   % output activations for all examples, 64*10000, one column per example
%2.computing error term and cost
ave_square=sum(sum((active_value3-data).^2)./2)/data_size(2);   % first cost term: average squared reconstruction error
weight_decay=lambda/2*(sum(sum(W1.^2))+sum(sum(W2.^2)));         % second cost term: weight decay, sum of squares of all weights (the Bayesian view: a prior on the weights)

p_real=sum(active_value2,2)./data_size(2);       % estimated average activation rho-hat of each hidden unit, 25*1
p_para=repmat(sparsityParam,hiddenSize,1);       % target sparsity rho, replicated per hidden unit
sparsity=beta.*sum(p_para.*log(p_para./p_real)+(1-p_para).*log((1-p_para)./(1-p_real)));   % KL divergence penalty
cost=ave_square+weight_decay+sparsity;      % final cost function

delta3=(active_value3-data).*(active_value3).*(1-active_value3);      % output-layer error, 64*10000, one column per example
average_sparsity=repmat(sum(active_value2,2)./data_size(2),1,data_size(2));  % rho-hat replicated per example, for the sparsity term of the error
default_sparsity=repmat(sparsityParam,hiddenSize,data_size(2));     % target sparsity rho, same shape
sparsity_penalty=beta.*(-(default_sparsity./average_sparsity)+((1-default_sparsity)./(1-average_sparsity)));
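The listing above ends at the sparsity term of the hidden-layer error. As a hedged sketch, not the original author's code, the remaining backpropagation steps could look like the following, based on the tutorial's formulas. It runs on a tiny synthetic problem so it is self-contained; the sizes, the hyperparameter values and the short names (a2, a3, rho_hat) are assumptions for illustration.

```matlab
% Tiny synthetic problem; sizes and hyperparameters are illustrative assumptions.
visibleSize = 4; hiddenSize = 3; m = 5;
lambda = 1e-4; beta = 3; sparsityParam = 0.1;
sigmoid = @(z) 1 ./ (1 + exp(-z));

data = rand(visibleSize, m);
W1 = 0.1*randn(hiddenSize, visibleSize);  b1 = zeros(hiddenSize, 1);
W2 = 0.1*randn(visibleSize, hiddenSize);  b2 = zeros(visibleSize, 1);

% Forward pass, one column per example.
a2 = sigmoid(W1*data + repmat(b1, 1, m));
a3 = sigmoid(W2*a2   + repmat(b2, 1, m));

% Output-layer error and the sparsity term (one value per hidden unit).
rho_hat = mean(a2, 2);
delta3 = (a3 - data) .* a3 .* (1 - a3);
sparsity_penalty = beta * (-sparsityParam ./ rho_hat + (1 - sparsityParam) ./ (1 - rho_hat));

% Hidden-layer error: back-propagated error plus sparsity term, times f'(z2).
delta2 = (W2' * delta3 + repmat(sparsity_penalty, 1, m)) .* a2 .* (1 - a2);

% Gradients averaged over the m examples; weight decay applies to W only.
W2grad = delta3 * a2'   / m + lambda * W2;
W1grad = delta2 * data' / m + lambda * W1;
b2grad = sum(delta3, 2) / m;
b1grad = sum(delta2, 2) / m;

% Unroll into one vector, in the same order as theta.
grad = [W1grad(:); W2grad(:); b1grad(:); b2grad(:)];
```

The sparsity term is added before multiplying by the sigmoid derivative, which is why it sits inside the parentheses in delta2.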

Step 3: Gradient checking

EPSILON=0.0001;
for i=1:numel(theta)                 % loop over every parameter
    theta_plus=theta;
    theta_minu=theta;
    theta_plus(i)=theta_plus(i)+EPSILON;   % perturb only the i-th parameter
    theta_minu(i)=theta_minu(i)-EPSILON;
    numgrad(i)=(J(theta_plus)-J(theta_minu))/(2*EPSILON);   % central difference
end
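To see the check in action without the full autoencoder, here is a minimal sketch on a toy quadratic whose gradient is known in closed form. The objective, the test point and the name rel_diff are assumptions for illustration. The numerical and analytic gradients are compared through their relative difference, which should be very small when the analytic gradient is correct.

```matlab
% Toy objective J(x) = x1^2 + 3*x1*x2 and its analytic gradient.
J = @(x) x(1)^2 + 3*x(1)*x(2);
x = [4; 10];
grad = [2*x(1) + 3*x(2); 3*x(1)];

EPSILON = 1e-4;
numgrad = zeros(size(x));
for i = 1:numel(x)
    e = zeros(size(x));
    e(i) = EPSILON;                                   % perturb one coordinate
    numgrad(i) = (J(x + e) - J(x - e)) / (2*EPSILON); % central difference
end

% Relative difference between numerical and analytic gradients.
rel_diff = norm(numgrad - grad) / norm(numgrad + grad);
disp(rel_diff);
```

The same comparison applies to the autoencoder: compute numgrad from the cost function and compare it with the unrolled grad vector.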

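Step 4: Visualization. When running train.m to train the model, remove (or comment out) the gradient-checking code, because it is very slow.

2: Vectorized implementation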

This only needs minor changes to the code above.
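To illustrate what "vectorized" means here, the following sketch compares a per-example loop with the single matrix expression for the hidden-layer forward pass. The sizes and the random data are assumptions; both versions should agree up to rounding.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));
visibleSize = 6; hiddenSize = 4; m = 8;   % illustrative sizes
data = rand(visibleSize, m);
W1 = randn(hiddenSize, visibleSize);
b1 = randn(hiddenSize, 1);

% Loop version: one example (column) at a time.
a2_loop = zeros(hiddenSize, m);
for k = 1:m
    a2_loop(:, k) = sigmoid(W1 * data(:, k) + b1);
end

% Vectorized version: all examples in one matrix product.
a2_vec = sigmoid(W1 * data + repmat(b1, 1, m));

% Largest element-wise gap between the two results.
max_gap = max(abs(a2_loop(:) - a2_vec(:)));
disp(max_gap);
```

On the larger MNIST setting (784 inputs, 196 hidden units, 10000 examples) the matrix form avoids the per-example loop entirely, which is where the speedup comes from.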

Step 1: First set the parameters to

visibleSize = 28*28;   % number of input units 
hiddenSize = 196;     % number of hidden units 
sparsityParam = 0.1;   % desired average activation of the hidden units.
                     % (This was denoted by the Greek alphabet rho, which looks like a lower-case "p",
		     %  in the lecture notes). 
lambda = 3e-3;     % weight decay parameter       
beta = 3;            % weight of sparsity penalty term

Step 2: Replace the way the training set is obtained in Step 1 of the sparse autoencoder with the following code:

images = loadMNISTImages('train-images.idx3-ubyte');

display_network(images(:,1:100)); % Show the first 100 images
patches = images(:, randi(size(images,2), 1, 10000));

This gives the visualization of the learned features. [Figure: visualization result]
