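For the exercise requirements, see the deep learning tutorial, Exercise: Convolution and Pooling.
This experiment classifies RGB color images with a convolutional neural network: the CNN first learns a 3600-dimensional feature vector (400*3*3) from each image, and a four-class softmax classifier is then trained on those features.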
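1. Network structure
The whole network has four parts: a linear decoder, convolution, pooling, and softmax regression. The linear decoder has an input layer of 8*8*3 neurons, a hidden layer of 400 neurons (neither count includes the bias node), and an output layer of 8*8*3 neurons; the features are learned with this linear decoder.
The convolution uses 8*8 patches (one layer), and the pooling is 19*19 mean pooling (one layer).
With this structure, given a 64*64*3 RGB image, convolution produces a 400*57*57*3 array (64-8+1 = 57; 400 is the number of hidden units, each of which is one feature). For convenience, the experiment sums over the three RGB channels, giving a 400*57*57 array. Pooling then yields a 400*3*3 (57/19 = 3) three-dimensional array, which is flattened into a vector of length 3600 that represents the image. After passing through the network, each 64*64*3 RGB image becomes a 3600-dimensional vector, and softmax regression then classifies the image.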
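The size bookkeeping above can be reproduced in a few lines. This is only a sketch of the arithmetic; the variable names are chosen here for illustration and are not taken from the exercise scripts.

```matlab
% Dimension arithmetic for the network described above.
imageDim    = 64;   % input images are 64*64*3
patchDim    = 8;    % size of each learned feature (8*8 per channel)
poolDim     = 19;   % size of each pooling region
numFeatures = 400;  % hidden units of the linear decoder

convDim    = imageDim - patchDim + 1;     % side of a 'valid' convolution output
pooledDim  = floor(convDim / poolDim);    % pooling regions per side
featureLen = numFeatures * pooledDim^2;   % length of the final feature vector

fprintf('convolved: %d x %d, pooled: %d x %d, vector length: %d\n', ...
        convDim, convDim, pooledDim, pooledDim, featureLen);
```

Because 57 is an exact multiple of 19, the pooling regions tile the convolved map with no leftover border.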
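2. Data
The dataset is the STL-10 image set. Each STL-10 sample is a labeled 96x96 color image that belongs to one of ten classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck. To reduce computation time, only four classes are used: airplane, car, cat, dog. The images actually fed to the network in this experiment are 64*64*3. The training set has 2000 images and the test set has 3200.
The data has already been arranged as a four-dimensional array, images(r, c, channel, image number): the first dimension is the row, the second the column, the third the channel (RGB), and the fourth the image index. In this layout the training set has size 64*64*3*2000.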
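To make the four-dimensional layout concrete, here is a small sketch of how such an array is indexed. The array below is a random stand-in with only 5 images (an assumption for illustration), not the real training data.

```matlab
% Hypothetical stand-in for the training array: random values, 5 images only.
images = rand(64, 64, 3, 5);      % images(r, c, channel, image number)

numImages  = size(images, 4);     % fourth dimension indexes the image
oneImage   = images(:, :, :, 2);  % the whole second image, 64 x 64 x 3
redChannel = images(:, :, 1, 2);  % channel 1 of the second image, 64 x 64
onePixel   = images(10, 20, 3, 2);% row 10, column 20, channel 3, image 2

disp(size(oneImage));
disp(size(redChannel));
```

Trailing singleton dimensions are dropped automatically, which is why `redChannel` comes back as a plain 64 x 64 matrix.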
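3. Experimental results
The convolution and pooling code was first checked for correctness. As the figure below shows, convolution and pooling are run in 8 passes (400/50 = 8), each processing 50 features, to avoid running out of memory.
Finally, a four-class softmax classifier is trained on the 3600-dimensional features. Test-set accuracy is 80.406% with mean pooling. When I replaced mean pooling with max pooling, accuracy was 78.563%, about 1.8 percentage points lower, so the choice of pooling method has a noticeable effect on the final result.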
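The reason for splitting the work into passes can be seen from a rough memory estimate. The sketch below only computes index ranges and array sizes; the names `hiddenSize` and `stepSize` are assumptions made here, and the calls to cnnConvolve and cnnPool are indicated by a comment rather than executed.

```matlab
% Rough sketch of processing 400 features in passes of 50.
hiddenSize = 400;    % total number of learned features
stepSize   = 50;     % features handled in one pass
numTrain   = 2000;   % training images
convDim    = 57;     % side of each convolved map

numParts = hiddenSize / stepSize;

% Size of the convolved array in double precision (8 bytes per value)
gbAllAtOnce = hiddenSize * numTrain * convDim^2 * 8 / 1e9;  % roughly 20.8 GB
gbPerPass   = stepSize   * numTrain * convDim^2 * 8 / 1e9;  % roughly 2.6 GB
fprintf('all at once: %.1f GB, per pass: %.1f GB\n', gbAllAtOnce, gbPerPass);

covered = false(1, hiddenSize);
for convPart = 1:numParts
    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd   = convPart * stepSize;
    % Here rows featureStart:featureEnd of W and b would be convolved and
    % pooled, and only the much smaller pooled result would be kept.
    covered(featureStart:featureEnd) = true;
end
```

Only the pooled output (400*3*3 values per image) needs to be stored across passes, so the large convolved array can be discarded after each pass.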
Experiment result 1
Accuracy (mean pooling)
Accuracy (max pooling)
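To illustrate what differs between the two pooling methods compared above, here is a toy example on a 4 x 4 matrix with 2 x 2 pooling regions. The matrix and sizes are made up for illustration and are unrelated to the experiment's data.

```matlab
% Toy comparison of mean pooling and max pooling on a 4 x 4 matrix.
A = [ 1  2  5  6;
      3  4  7  8;
      9 10 13 14;
     11 12 15 16];
poolDim = 2;
n = size(A, 1) / poolDim;          % pooling regions per side

meanPooled = zeros(n);
maxPooled  = zeros(n);
for r = 1:n
    for c = 1:n
        rows  = (r - 1) * poolDim + (1:poolDim);
        cols  = (c - 1) * poolDim + (1:poolDim);
        block = A(rows, cols);
        meanPooled(r, c) = mean(block(:));  % average of the region
        maxPooled(r, c)  = max(block(:));   % strongest response in the region
    end
end

disp(meanPooled);   % [2.5 6.5; 10.5 14.5]
disp(maxPooled);    % [4 8; 12 16]
```

Mean pooling summarizes every value in a region, while max pooling keeps only the largest one, so the two produce different feature vectors for the same convolved maps.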
4. Code
cnnConvolve.m
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
% patchDim - patch (feature) dimension
% numFeatures - number of features
% images - large images to convolve with, matrix in the form
% images(r, c, channel, image number)
% W, b - W, b for features from the sparse autoencoder
% ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
% preprocessing
%
% Returns:
% convolvedFeatures - matrix of convolved features in the form
% convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);
convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
% Instructions:
% Convolve every feature with every large image here to produce the
% numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
% matrix convolvedFeatures, such that
% convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
% value of the convolved featureNum feature for the imageNum image over
% the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
% Convolving with 100 images should take less than 3 minutes
% Convolving with 5000 images should take around an hour
% (So to save time when testing, you should convolve with less images, as
% described earlier)
% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps
WT = W*ZCAWhite; % equivalent network weights with the ZCA whitening folded in
b_mean = b - WT*meanPatch; % bias correction, needed because the input images are not mean-subtracted
% --------------------------------------------------------
for imageNum = 1:numImages
    for featureNum = 1:numFeatures
        % convolution of image with feature matrix for each channel
        convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
        for channel = 1:imageChannels
            % Obtain the feature (patchDim x patchDim) needed during the convolution
            % ---- YOUR CODE HERE ----
            patchSize = patchDim*patchDim; % number of weights per channel in one row of WT
            offset = (channel - 1)*patchSize;
            feature = reshape(WT(featureNum, offset+1 : offset+patchSize), patchDim, patchDim);
            % ------------------------
            % Flip the feature matrix, because conv2 flips its kernel by the definition of convolution
            feature = flipud(fliplr(squeeze(feature)));
            % Obtain the image
            im = squeeze(images(:, :, channel, imageNum));
            % Convolve "feature" with "im", adding the result to convolvedImage
            % be sure to do a 'valid' convolution
            % ---- YOUR CODE HERE ----
            convolvedoneChannel = conv2(im, feature, 'valid');
            convolvedImage = convolvedImage + convolvedoneChannel;
            % ------------------------
        end
        % Add the bias (already corrected for the mean subtraction),
        % then apply the sigmoid function to get the hidden activation
        % ---- YOUR CODE HERE ----
        convolvedImage = sigmoid(convolvedImage + b_mean(featureNum));
        % ------------------------
        % The convolved feature is the sum of the convolved values for all channels
        convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
    end
end
end

function sigm = sigmoid(x)
    sigm = 1./(1+exp(-x));
end
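The precomputed WT and b_mean in cnnConvolve.m rely on the identity W*ZCAWhite*(x - meanPatch) + b = WT*x + (b - WT*meanPatch). A quick numerical check of that identity, and of the kernel flip, might look like the sketch below. All parameters here are random stand-ins (an assumption for illustration), and only a single image and a single output position are tested.

```matlab
% Compare one convolved activation against the direct patch computation.
patchDim = 8; imageDim = 64; numChannels = 3;
visibleSize = patchDim^2 * numChannels;      % 192 inputs per patch
numFeatures = 4;                             % small hypothetical value

W         = 0.1 * randn(numFeatures, visibleSize);  % stand-in weights
b         = 0.1 * randn(numFeatures, 1);            % stand-in biases
ZCAWhite  = 0.1 * randn(visibleSize);               % stand-in whitening matrix
meanPatch = rand(visibleSize, 1);                   % stand-in mean patch
image     = rand(imageDim, imageDim, numChannels);

featureNum = 2; row = 5; col = 11;           % position to check

% Route 1: cut out the patch, subtract the mean, whiten, apply W, b and sigmoid
patch  = image(row:row+patchDim-1, col:col+patchDim-1, :);
direct = 1 ./ (1 + exp(-(W * ZCAWhite * (patch(:) - meanPatch) + b)));
directValue = direct(featureNum);

% Route 2: fold whitening into the weights and convolve channel by channel
WT    = W * ZCAWhite;
bMean = b - WT * meanPatch;
acc   = zeros(imageDim - patchDim + 1);
for channel = 1:numChannels
    offset  = (channel - 1) * patchDim^2;
    feature = reshape(WT(featureNum, offset+1:offset+patchDim^2), patchDim, patchDim);
    % rot90(feature, 2) is the same flip as flipud(fliplr(feature))
    acc = acc + conv2(image(:, :, channel), rot90(feature, 2), 'valid');
end
convValue = 1 ./ (1 + exp(-(acc(row, col) + bMean(featureNum))));

difference = abs(directValue - convValue);
fprintf('difference between the two routes: %g\n', difference);
```

If the flip is omitted, the two routes generally disagree, because conv2 computes a true convolution rather than the sliding dot product that the feature weights represent.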
cnnPool.m
function pooledFeatures = cnnPool(poolDim, convolvedFeatures)
%cnnPool Pools the given convolved features
%
% Parameters:
% poolDim - dimension of pooling region
% convolvedFeatures - convolved features to pool (as given by cnnConvolve)
% convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
%
% Returns:
% pooledFeatures - matrix of pooled features in the form
% pooledFeatures(featureNum, imageNum, poolRow, poolCol)
%
numImages = size(convolvedFeatures, 2);
numFeatures = size(convolvedFeatures, 1);
convolvedDim = size(convolvedFeatures, 3);
pooledFeatures = zeros(numFeatures, numImages, floor(convolvedDim / poolDim), floor(convolvedDim / poolDim));
% -------------------- YOUR CODE HERE --------------------
% Instructions:
% Now pool the convolved features in regions of poolDim x poolDim,
% to obtain the
% numFeatures x numImages x (convolvedDim/poolDim) x (convolvedDim/poolDim)
% matrix pooledFeatures, such that
% pooledFeatures(featureNum, imageNum, poolRow, poolCol) is the
% value of the featureNum feature for the imageNum image pooled over the
% corresponding (poolRow, poolCol) pooling region
% (see http://ufldl/wiki/index.php/Pooling )
%
% Use mean pooling here.
% -------------------- YOUR CODE HERE --------------------
resultDim = floor(convolvedDim / poolDim);
for imageNum = 1:numImages
    for featureNum = 1:numFeatures
        for poolRow = 1:resultDim
            offsetRow = 1+(poolRow-1)*poolDim;
            for poolCol = 1:resultDim
                offsetCol = 1+(poolCol-1)*poolDim;
                patch = convolvedFeatures(featureNum,imageNum,offsetRow:offsetRow+poolDim-1,...
                    offsetCol:offsetCol+poolDim-1);
                pooledFeatures(featureNum,imageNum,poolRow,poolCol) = mean(patch(:)); % mean pooling
                %pooledFeatures(featureNum,imageNum,poolRow,poolCol) = max(patch(:)); % max pooling
            end
        end
    end
end
end
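As a cross-check on the loops in cnnPool.m, mean pooling of a single map can also be written with conv2 and an averaging kernel, followed by subsampling. This is an alternative sketch rather than the code used in the experiment; the map below is random stand-in data.

```matlab
% Mean pooling of one 57 x 57 map: explicit loops versus conv2 + subsampling.
convolvedDim = 57;
poolDim      = 19;
map          = rand(convolvedDim);            % stand-in convolved map
resultDim    = floor(convolvedDim / poolDim); % 3 regions per side

% Reference: explicit loops over non-overlapping regions
loopPooled = zeros(resultDim);
for poolRow = 1:resultDim
    for poolCol = 1:resultDim
        rows  = (poolRow - 1) * poolDim + (1:poolDim);
        cols  = (poolCol - 1) * poolDim + (1:poolDim);
        block = map(rows, cols);
        loopPooled(poolRow, poolCol) = mean(block(:));
    end
end

% Alternative: average every poolDim x poolDim window, then keep only the
% windows that start at rows and columns 1, 1+poolDim, 1+2*poolDim, ...
averaged   = conv2(map, ones(poolDim) / poolDim^2, 'valid');
lastStart  = 1 + (resultDim - 1) * poolDim;
convPooled = averaged(1:poolDim:lastStart, 1:poolDim:lastStart);

maxDiff = max(abs(loopPooled(:) - convPooled(:)));
fprintf('largest difference: %g\n', maxDiff);
```

The conv2 version computes many overlapping windows that are then thrown away, so it trades extra arithmetic for shorter code. Whether it is faster than the loops depends on the sizes involved.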