Applying PCA to High-Dimensional and Low-Dimensional Image Samples

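    After reading many blog posts and a few papers on PCA, I finally have a reasonable understanding of principal component analysis. PCA is widely used in face recognition. Whether the face images are 32*32 or 64*64, I consider them low-dimensional: if each pixel is treated as one feature, a 32*32 image has 1024 features and a 64*64 image has 4096, which is not much for a computer. The images in this post are 1024*1024, so a single image has 1048576 features. Here dimensionality reduction is really necessary: without it, clustering a large number of such images would need far more memory than the machine has.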

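    I used PCA on 980 solar cell (電池片) images of size 1024*1024. These 980 images are the training samples, and two more images serve as test samples. (Testing matters: without it you cannot tell whether the reduction matrix Uk you obtained is correct.) Suppose there are m samples with n features each, and we reduce those n features to k (k<n). The reduction matrix is Uk (n rows, k columns). Multiplying a 1*n test sample by Uk gives a 1*k feature vector Z.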
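
    Written out, and assuming (as the code later in this post does) that each sample is first centered by the training mean mu and scaled by the per-pixel standard deviation sigma, the projection and its approximate inverse look roughly like this. The symbols \tilde{x} and \hat{x} are my own notation for the normalized and the reconstructed sample; they do not appear in the code.

```latex
\[
\tilde{x} = (x - \mu) \oslash \sigma \in \mathbb{R}^{1 \times n}, \qquad
Z = \tilde{x}\, U_k \in \mathbb{R}^{1 \times k}, \qquad
\hat{x} = (Z\, U_k^{\top}) \odot \sigma + \mu \in \mathbb{R}^{1 \times n}
\]
```

    Here \oslash and \odot are element-wise division and multiplication. For the 1024*1024 images, n = 1048576, so Uk has 1048576 rows and k columns.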


    First, a few problems I ran into:

    Problem 1: the number of samples versus the number of features.
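
    The code that follows deals with this by decomposing the small m*m matrix X*X' instead of the n*n matrix X'*X. The notes do not spell out why that is valid, so here is the standard argument as a sketch. X is the normalized m*n data matrix, and (\lambda, v) is an eigenpair of X X^T; this notation is my own assumption.

```latex
\[
(X X^{\top})\, v = \lambda v
\;\Longrightarrow\;
(X^{\top} X)(X^{\top} v) = X^{\top} (X X^{\top} v) = \lambda\, (X^{\top} v)
\]
\[
\lVert X^{\top} v \rVert^{2} = v^{\top} X X^{\top} v = \lambda\, \lVert v \rVert^{2}
\]
```

    So for every \lambda > 0, the vector X^T v is an eigenvector of X^T X with the same eigenvalue, but it is not unit length and has to be rescaled. That is what U = X'*U and the normalization loop in the code do. With m = 980 and n = 1048576, this means decomposing a 980*980 matrix rather than a 1048576*1048576 one, which would hold about 1.1e12 entries (roughly 8 TiB in double precision).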


%% Initialization
clear ; close all; clc

fprintf('this code loads the training images and runs PCA on them.\n');
fprintf('2 more images are held out to test the PCA projection.\n');

traindir = 'E:\學習資料\MATLAB程序\pic_div\';
picturename = dir([traindir '*.jpg']);
m = numel(picturename);           % number of training samples (980 in this post)
trainset = zeros(m, 1024*1024);   % one row per image; image size is 1024 * 1024

for i = 1 : m
    roadname = [traindir picturename(i).name];
    img = imread(roadname);
    img = rgb2gray(img);          % assumes the .jpg files are RGB
    img = double(img(:));         % flatten the image into a column vector
    trainset(i, :) = img';
end


%% before training PCA, do feature normalization
mu = mean(trainset);
trainset_norm = bsxfun(@minus, trainset, mu);
sigma = std(trainset_norm);
sigma(sigma == 0) = 1;   % pixels that never vary: avoid dividing by zero
trainset_norm = bsxfun(@rdivide, trainset_norm, sigma);

%% save the mean image mu so we can take a look at it
imwrite(uint8(reshape(mu, 1024, 1024)), 'meanface.bmp');
% fprintf('mean face saved. paused\n');
% pause;

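%% compute the reduction matrix
X = trainset_norm; % just for convenience
[m, n] = size(X);


%% eigendecomposition
% m << n here, so decompose the m-by-m matrix X*X' instead of the n-by-n X'*X
Cov = X * X';
[U1, S1] = eig(Cov);
[S1, D] = sort(diag(S1), 'descend');
U = U1(:, D);    % reorder the eigenvectors to match the sorted eigenvalues
S = diag(S1);    % eigenvalues, largest first, on the diagonal
U = X' * U;      % map back to n-dimensional eigenvectors of X'*X (not yet unit length)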


%% choose the target dimension k: the smallest k that keeps more than 95% of the variance
k = find(cumsum(diag(S)) ./ sum(diag(S)) > 0.95, 1)



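%% normalize the eigenvectors to unit length
for i = 1:m
    U(:,i) = U(:,i) / norm(U(:,i));
end
sum(U.^2);   % sanity check: each column's squared norm should be 1 (drop the semicolon to display)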


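%% test on held-out images and compute their k-dimensional feature vectors
testdir = 'E:\學習資料\MATLAB程序\測試\';
testnames = {'2_14.jpg'};   % add the second test image's file name to this list
test = zeros(numel(testnames), 1024 * 1024);
for i = 1:numel(testnames)
    roadname = [testdir testnames{i}];
    img = imread(roadname);
    imshow(img);
    img = rgb2gray(img);
    img = double(img);
    test(i, :) = img(:);
end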

% the test set needs the same normalization as the training set
test_norm = bsxfun(@minus, test, mu);
test_norm = bsxfun(@rdivide, test_norm, sigma);

% reduction
Uk = U(:, 1:k);
Z = test_norm * Uk   % one k-dimensional feature vector per test image
fprintf('reduce done.\n');


% save eigen face
for i = 1:m
	ef = U(:, i)';
	img = ef;
	minVal = min(img);
	img = img - minVal;
	max_val = max(abs(img));
	img = img / max_val;
	img = reshape(img, 1024, 1024);
	imwrite(img, strcat('eigenface', int2str(i), '.bmp'));
end

%% the test images were mean-centered and divided by the standard deviation,
% so to reconstruct them we multiply by sigma and then add the mean back
Xp = Z * Uk';
% save the reconstructed test images
for i = 1:size(Xp, 1)
    face = Xp(i, :) .* sigma + mu;
    face = reshape(face, 1024, 1024);
    imwrite(uint8(face), strcat('E:\學習資料\MATLAB程序\reconstruct\', int2str(i+1000), '.jpg'));
end
%
% for the training set we subtracted the mean and divided by the standard deviation,
% so to reconstruct we first multiply by the standard deviation
% and then add the mean back.
% This step checks the reduction matrix Uk: if the images can be restored this way, Uk is correct.
trainset_re = trainset_norm * Uk; % reduction
trainset_re = trainset_re * Uk'; % reconstruction
for i = 1:m
    train = trainset_re(i, :);
    train = train .* sigma;
    train = train + mu;
    train = reshape(train, 1024, 1024);
    imwrite(uint8(train), strcat('./reconstruct/', int2str(i), 'train.bmp'));
end


fprintf('job done.\n');


