Applying PCA to High-Dimensional and Low-Dimensional Image Samples

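After reading a lot of blog posts and a few papers on PCA, I finally have a reasonable understanding of principal component analysis. PCA is widely used in face recognition. Whether the face images are 32*32 or 64*64, I consider them low-dimensional: if each pixel is treated as one feature, a 32*32 image has 1024 features and a 64*64 image has 4096, and neither count is large for a computer. The images in this post are 1024*1024, so a single image has 1048576 features, and dimensionality reduction is really necessary. Without it, clustering a large number of such images would need far more memory than the computer has.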

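I used PCA on 980 solar cell images of 1024*1024 pixels. These 980 images are the training samples, and two more images serve as test samples (testing is necessary; otherwise you cannot tell whether the reduction matrix Uk you obtained is correct). Suppose there are m samples, each with n features, and we reduce these n features to k (k<n). The reduction matrix is Uk (n rows, k columns). Multiplying a 1*n test sample by Uk gives a 1*k feature vector Z.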
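
To make the shapes concrete, the projection and its inverse can be written out as follows. This is only a sketch: the symbols x (a 1*n sample), mu and sigma (per-feature mean and standard deviation of the training set) and the element-wise operators are notation assumed here for illustration, and it assumes each sample is normalized before projection.

```latex
\tilde{x} = (x - \mu) \oslash \sigma, \qquad
Z = \tilde{x}\, U_k \in \mathbb{R}^{1 \times k}, \qquad
\hat{x} = \left( Z\, U_k^{\top} \right) \odot \sigma + \mu \in \mathbb{R}^{1 \times n}
```

If Uk is correct and k is large enough, the reconstruction \hat{x} should look close to the original x, which is what the test images are for.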

First, a few problems I ran into:

Problem 1: the relative sizes of the sample count and the feature count.
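
The script that follows decomposes X*X' (m-by-m) instead of X'*X (n-by-n). A short derivation of why that is enough, offered as a sketch rather than as the only way to do it; v, lambda and u are symbols introduced here for illustration, and X is the m*n normalized sample matrix:

```latex
X X^{\top} v = \lambda v
\;\Longrightarrow\;
X^{\top} X \left( X^{\top} v \right) = \lambda \left( X^{\top} v \right),
\qquad
u = \frac{X^{\top} v}{\lVert X^{\top} v \rVert},
\qquad
\lVert X^{\top} v \rVert^{2} = v^{\top} X X^{\top} v = \lambda
\quad (\lVert v \rVert = 1)
```

So every eigenvector v of the small matrix with lambda > 0 gives an eigenvector u of the large matrix with the same eigenvalue. With m = 980 and n = 1048576, X*X' has 980^2 = 960400 entries (about 7.7 MB in double precision), while X'*X would have 1048576^2, roughly 1.1*10^12 entries (about 8.8 TB).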

%% Initialization
clear; close all; clc

m = 1000; % number of training samples
fprintf('This script loads %d training images and runs PCA on them.\n', m);
fprintf('Two additional images are used to test the reduction matrix.\n');

trainset = zeros(m, 32*32); % one row per image; image size is 32 * 32
picturename = dir('E:\學習資料\MATLAB程序\pic_div\*.jpg');

for i = 1 : m
    roadname = strcat('E:\學習資料\MATLAB程序\pic_div\', picturename(i).name);
    img = imread(roadname);
    if size(img, 3) == 3
        img = rgb2gray(img); % convert colour images; grayscale ones pass through
    end
    img = double(img(:));
    trainset(i, :) = img';
end


%% Before training PCA, normalize the features
mu = mean(trainset);
trainset_norm = bsxfun(@minus, trainset, mu);
sigma = std(trainset_norm);
sigma(sigma == 0) = 1; % constant pixels: avoid division by zero
trainset_norm = bsxfun(@rdivide, trainset_norm, sigma);

%% Save the mean face mu as an image so we can take a look at it
imwrite(uint8(reshape(mu, 32, 32)), 'meanface.bmp');
% fprintf('mean face saved. paused\n');
% pause;

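%% Compute the reduction matrix
X = trainset_norm; % just for convenience
[m, n] = size(X);


%% Eigendecomposition
% m < n here, so decompose the m-by-m matrix X*X' instead of the n-by-n X'*X
Cov = X * X';
[U1, S1] = eig(Cov);
[S1, D] = sort(diag(S1), 'descend');
U = U1(:, D);   % reorder eigenvectors to match the sorted eigenvalues
S = diag(S1);   % eigenvalues (on the diagonal)
U = X' * U;     % map back to eigenvectors of X'*X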


%% Choose the target dimension k: the smallest k whose leading eigenvalues hold more than 95% of the total
k = find(cumsum(diag(S))./sum(diag(S)) > 0.95, 1)



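%% Normalize the eigenvectors to unit length
for i = 1:m
    U(:,i) = U(:,i)/norm(U(:,i));
end
sum(U.^2); % each column should now sum to 1 (remove the semicolon to inspect)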


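%% Test with two images and compute their K-dimensional feature vectors
test = zeros(2, 32 * 32);
for i = 1:1 % only the first test image is loaded here; row 2 of test stays all zeros
    roadname = strcat('E:\學習資料\MATLAB程序\測試\', '2_14.jpg');
    img = imread(roadname);
    imshow(img);
    if size(img, 3) == 3
        img = rgb2gray(img);
    end
    img = double(img);
    test(i, :) = img(:);
end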

% the test set needs the same normalization as the training set
test_norm = bsxfun(@minus, test, mu);
test_norm = bsxfun(@rdivide, test_norm, sigma);

% reduction
Uk = U(:, 1:k);
Z = test_norm * Uk   % two K-dimensional feature vectors
fprintf('reduce done.\n');


% save eigenfaces
for i = 1:m
    ef = U(:, i)';
    img = ef;
    minVal = min(img);
    img = img - minVal;
    max_val = max(abs(img));
    img = img / max_val;   % rescale to [0, 1]
    img = reshape(img, 32, 32);   % each eigenvector has 32*32 = 1024 entries
    imwrite(img, strcat('eigenface', int2str(i), '.bmp'));
end

%% The test images were mean-centered and divided by the standard deviation,
% so reconstruction multiplies by sigma first and then adds the mean face back
Xp = Z * Uk';
% save reconstructed faces
for i = 1:2
    face = Xp(i, :) .* sigma + mu;
    face = reshape((face), 32, 32);
    imwrite(uint8(face), strcat('E:\學習資料\MATLAB程序\reconstruct\', int2str(i+1000), '.jpg'));
end
%
% For the training set we subtracted the mean face and divided by the standard deviation,
% so reconstruction multiplies by the standard deviation first
% and then adds the mean face back.
% This step checks the reduction matrix Uk: if the images can be restored here, Uk is correct.
trainset_re = trainset_norm * Uk; % reduction
trainset_re = trainset_re * Uk'; % reconstruction
for i = 1:m
    train = trainset_re(i, :);
    train = train .* sigma;
    train = train + mu;
    train = reshape(train, 32, 32);
    imwrite(uint8(train), strcat('./reconstruct/', int2str(i), 'train.bmp'));
end


fprintf('job done.\n');
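
The eigendecomposition step in the script relies on eigenvectors of X*X' mapping to eigenvectors of X'*X. A quick way to convince yourself is to check it on small synthetic data. The sketch below is an assumption-laden illustration, not part of the original pipeline: the sizes (5 samples, 50 features) and the random data are made up, and the variable names G, V, lambda, r, C and residual are hypothetical.

```matlab
% Sanity check of the X*X' trick on synthetic data (sizes are illustrative).
m = 5; n = 50;                    % few samples, many features
X = randn(m, n);
X = bsxfun(@minus, X, mean(X));   % center each feature

% Solve the small m-by-m eigenproblem.
G = X * X';
G = (G + G') / 2;                 % enforce exact symmetry
[V, D] = eig(G);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);

% Map back to feature space and normalize each column to unit length.
r = m - 1;                        % centering leaves at most m-1 nonzero eigenvalues
U = X' * V(:, 1:r);
U = bsxfun(@rdivide, U, sqrt(sum(U.^2, 1)));

% Columns of U should be eigenvectors of the n-by-n matrix X'*X.
C = X' * X;
residual = norm(C * U - U * diag(lambda(1:r)), 'fro') / norm(C, 'fro');
fprintf('relative residual: %.2e\n', residual);
```

A residual near machine precision suggests the mapping is working. Only the first m-1 columns are checked because the remaining eigenvalue is numerically zero after centering, and normalizing its eigenvector would just amplify noise.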
