Applying PCA to High-Dimensional or Low-Dimensional Image Samples

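After reading many blog posts and a few papers on PCA, I finally have a reasonable understanding of principal component analysis. PCA is widely used in face recognition. Whether the face images are 32*32 or 64*64, I consider them low-dimensional: if each pixel is treated as one feature, a 32*32 image has 1024 features and a 64*64 image has 4096, which is not much for a computer. The images in this post are 1024*1024, so a single image has 1048576 features. Dimensionality reduction is really necessary here: without it, clustering a large number of such images would need far more memory than my computer has.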

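I used PCA on 980 solar cell (电池片) images of size 1024*1024. These 980 images are the training samples, and two more serve as test samples (testing is necessary; otherwise you cannot tell whether the reduction matrix Uk you obtained is correct). Suppose there are m samples with n features each, and the n features are reduced to k (k<n). The reduction matrix Uk has n rows and k columns; multiplying a 1*n test sample by Uk gives a 1*k feature vector Z.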
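The shapes in this paragraph can be checked with a small sketch on random data. Everything in it is made up for illustration: the sizes m, n and k, the random matrices, and the use of an arbitrary orthonormal basis as a stand-in for Uk (it is not a PCA result and not the data from this post).

```matlab
% Shape check only: random stand-in data, not the images from this post.
m = 5; n = 16; k = 3;          % hypothetical sizes with k < n
X = rand(m, n);                % m samples, n features each
[Q, ~] = qr(rand(n, k), 0);    % economy QR: n-by-k, orthonormal columns
Uk = Q;                        % stand-in for the reduction matrix
x = rand(1, n);                % one 1-by-n test sample
Z = x * Uk;                    % 1-by-k reduced feature vector
Zall = X * Uk;                 % all samples at once: m-by-k
disp(size(Z));
disp(size(Zall));
```

Multiplying the whole m-by-n sample matrix by Uk reduces every sample in one step, which is what the script does for the test set.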

First, a few problems I ran into:

Problem 1: the number of samples versus the number of features.
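One way to read this problem, judging from the `Cov = X * X'` and `U = X'*U` steps in the script that follows: with 980 samples and 1048576 features, the n-by-n matrix X'X is far too large to decompose, while the m-by-m matrix XX' is only 980*980. The relation between the two can be written out as below. The symbols v and λ are introduced here for illustration and do not appear in the original post.

```latex
% X is the m-by-n normalized data matrix (m samples, n features, m < n).
% If v is an eigenvector of the small matrix X X^T with eigenvalue \lambda,
% then X^T v is an eigenvector of the large matrix X^T X with the same eigenvalue.
\[
X X^{\top} v = \lambda v
\;\Longrightarrow\;
X^{\top} X \,(X^{\top} v) = \lambda \,(X^{\top} v),
\qquad
\lVert X^{\top} v \rVert^{2} = v^{\top} X X^{\top} v = \lambda \lVert v \rVert^{2}.
\]
```

The last identity shows that the mapped vector X'v is not unit length in general, which is why a normalization step is needed after mapping back.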

%% Initialization
clear ; close all; clc

fprintf('This script loads the training images and fits PCA on them.\n');
fprintf('Two additional images are held out to test the projection.\n');

m = 1000; % number of samples
trainset = zeros(m, 32*32); % image size is : 32 * 32
picturename = dir('E:\学习资料\MATLAB程序\pic_div\*.jpg');

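for i = 1 : m
    % build the full path of the i-th training image
    roadname = strcat('E:\学习资料\MATLAB程序\pic_div\', picturename(i).name);
    img = imread(roadname);
    if size(img, 3) == 3
        img = rgb2gray(img);   % convert colour images to grayscale
    end
    img = double(img(:));      % flatten to a column vector
    trainset(i, :) = img';     % store one sample per row
end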


%% before training PCA, do feature normalization
mu = mean(trainset);
trainset_norm = bsxfun(@minus, trainset, mu);
sigma = std(trainset_norm);
sigma(sigma == 0) = 1; % constant pixels would otherwise cause division by zero
trainset_norm = bsxfun(@rdivide, trainset_norm, sigma);

%% save the mean face mu so we can take a look at it
imwrite(uint8(reshape(mu, 32, 32)), 'meanface.bmp');
% fprintf('mean face saved. paused\n');
% pause;

%% compute the reduction matrix
X = trainset_norm; % just for convenience
[m, n] = size(X);


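%% Eigendecomposition
% X * X' is the m-by-m Gram matrix, much smaller than the n-by-n matrix X' * X when m < n
Cov = X * X';
[U1, S1] = eig(Cov);
[S1, D] = sort(diag(S1), 'descend');   % eigenvalues, largest first
U = U1(:, D);                          % reorder the eigenvectors to match
S = diag(S1);   % eigenvalues (as a diagonal matrix)
U = X' * U;     % map back to n-dimensional eigenvectors of X' * X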


%% Choose the target dimension k
% smallest k whose leading eigenvalues keep more than 95% of the total
k = find(cumsum(diag(S)) ./ sum(diag(S)) > 0.95, 1)



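%% Normalize the eigenvectors to unit length
for i = 1:m
    U(:,i) = U(:,i)/norm(U(:,i));
end
sum(U.^2);   % sanity check: each column should sum to 1 (drop the semicolon to print)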


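%% Test with two images and compute their k-dimensional feature vectors
test = zeros(2, 32 * 32);
% NOTE: only one file name appears here, so only row 1 is filled;
% row 2 stays all zeros until a second test image is loaded the same way.
for i = 1:1
    roadname = strcat('E:\学习资料\MATLAB程序\测试\', '2_14.jpg');
    img = imread(roadname);
    imshow(img);
    img = rgb2gray(img);
    img = double(img);
    test(i, :) = img(:);
end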

% the test set needs the same normalization as the training set
test_norm = bsxfun(@minus, test, mu);
test_norm = bsxfun(@rdivide, test_norm, sigma);

% reduction
Uk = U(:, 1:k);
Z = test_norm * Uk   % the two k-dimensional feature vectors
fprintf('reduction done.\n');


% save eigenfaces
for i = 1:m
    ef = U(:, i)';
    img = ef;
    minVal = min(img);
    img = img - minVal;            % shift so the minimum is 0
    max_val = max(abs(img));
    img = img / max_val;           % scale into [0, 1]
    img = reshape(img, 32, 32);    % same 32*32 size as the training images
    imwrite(img, strcat('eigenface', int2str(i), '.bmp'));
end

%% the test images were mean-centred and divided by the standard deviation,
% so reconstruction multiplies by the standard deviation and adds the mean face back
Xp = Z * Uk';
% save the reconstructed faces
for i = 1:2
    face = Xp(i, :) .* sigma + mu;
    face = reshape(face, 32, 32);
    imwrite(uint8(face), strcat('E:\学习资料\MATLAB程序\reconstruct\', int2str(i+1000), '.jpg'));
end
% for the training set, we subtracted the mean face and divided by the standard
% deviation during training, so in the reconstruction we multiply by the
% standard deviation first and then add the mean face back.
% This step checks the reduction matrix Uk: if the images can be restored this way, Uk is correct.
trainset_re = trainset_norm * Uk; % reduction
trainset_re = trainset_re * Uk'; % reconstruction
for i = 1:m
    train = trainset_re(i, :);
    train = train .* sigma;
    train = train + mu;
    train = reshape(train, 32, 32);   % same 32*32 size as the training images
    imwrite(uint8(train), strcat('./reconstruct/', int2str(i), 'train.bmp'));
end


fprintf('job done.\n');
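The step in the script that decomposes `X * X'` and then maps the eigenvectors back with `X'` can be checked on its own with small synthetic data. The sketch below is only an illustration under assumed sizes (m = 6, n = 40) and random data; the names `V`, `lambda`, `r`, `residual` and `orthoErr` are introduced here and are not part of the original script.

```matlab
% Synthetic check of the Gram-matrix trick; sizes and data are made up.
m = 6; n = 40;                        % fewer samples than features
X = randn(m, n);
X = bsxfun(@minus, X, mean(X));       % center each feature (column)

[V, D] = eig(X * X');                 % small m-by-m problem
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);

r = m - 1;                            % centering leaves at most m-1 nonzero eigenvalues
U = X' * V(:, 1:r);                   % map back to n-dimensional space
U = bsxfun(@rdivide, U, sqrt(sum(U.^2, 1)));   % unit-length columns

% Each column u of U should satisfy (X' * X) * u = lambda * u.
residual = norm((X' * X) * U - U * diag(lambda(1:r)));
orthoErr = norm(U' * U - eye(r));
fprintf('residual = %.2e, orthogonality error = %.2e\n', residual, orthoErr);
```

Only the first m-1 columns are kept because the remaining eigenvalue is numerically zero after centering, and normalizing the corresponding mapped vector would just amplify rounding noise. In the script above this does not matter as long as k stays below that limit.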
