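The EM algorithm: a method for maximum likelihood estimation of the parameters of probabilistic models that contain hidden (latent) variables.
Each iteration consists of two steps: the E-step, which computes an expectation, and the M-step, which maximizes it.
The EM algorithm is sensitive to the choice of initial values; different initial values can lead to different parameter estimates.
For a Gaussian mixture model, Yjk denotes the probability that sample j belongs to component k, and alpha(k) denotes the proportion (mixing weight) of component k in the overall mixture.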
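For reference, the standard EM updates for a Gaussian mixture with K components can be sketched as follows. The notation is an assumption made for illustration: $y_j$ is sample $j$ of $N$, $\gamma_{jk}$ plays the role of Yjk, $\alpha_k$ plays the role of alpha(k), and $\phi(y \mid \mu_k, \sigma_k^2)$ is the Gaussian density of component $k$.

$$
\begin{aligned}
\text{E-step:}\quad & \gamma_{jk} = \frac{\alpha_k\,\phi(y_j \mid \mu_k, \sigma_k^2)}{\sum_{l=1}^{K} \alpha_l\,\phi(y_j \mid \mu_l, \sigma_l^2)} \\
\text{M-step:}\quad & \mu_k = \frac{\sum_{j=1}^{N} \gamma_{jk}\, y_j}{\sum_{j=1}^{N} \gamma_{jk}}, \qquad
\sigma_k^2 = \frac{\sum_{j=1}^{N} \gamma_{jk}\,(y_j - \mu_k)^2}{\sum_{j=1}^{N} \gamma_{jk}}, \qquad
\alpha_k = \frac{\sum_{j=1}^{N} \gamma_{jk}}{N}
\end{aligned}
$$

The two steps are repeated until the parameters stop changing appreciably. Since $\sum_k \gamma_{jk} = 1$ for every sample, the weights $\alpha_k$ always sum to 1.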
The following MATLAB code, found online, estimates the parameters of a Gaussian mixture model made up of two Gaussian components:
% EM_GMM_2 -- Run EM for a GMM with two Gaussian components whose means are
% fixed at zero. The goal is to estimate the variances V1 and V2 of the two
% Gaussians from the samples in X.
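function [V1, V2] = EM_GMM_2(X)
    %% Initialize parameters
    X = X(:);        % force a column vector so that gamma'*XxX is a scalar
    XxX = X.*X;      % squared samples; constant across iterations
    V1 = 0.5;        % initial variance of component 1
    V2 = 0.0001;     % initial variance of component 2
    w = 0.5;         % mixing weight of component 2 (named w so it does not shadow the built-in pi)
    N = length(X);
    w_prev = 2*w;    % guarantees that the loop body runs at least once
    maxiter = 100;
    iter = 0;
    % Stop when the relative change in w drops below 1e-3 or after maxiter iterations
    while (abs(w_prev-w)/w_prev > 1e-3 && iter < maxiter)
        iter = iter+1;
        w_prev = w;
        %% Expectation Step
        % gamma(j): posterior probability that X(j) was drawn from component 2
        w_x_GPV2 = w*GaussianProb(XxX,V2);
        gamma = w_x_GPV2 ./ ((1-w)*GaussianProb(XxX,V1) + w_x_GPV2);
        %% Maximization Step
        % Responsibility-weighted variances (the means are fixed at zero)
        V1 = (1-gamma)'*(XxX) / sum(1-gamma);
        V2 = gamma'*(XxX) / sum(gamma);
        w = sum(gamma)/N;
    end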
%% Function to compute the probability density of a zero-mean Gaussian with variance V
% XxX holds the squared samples; pi here is MATLAB's built-in constant.
function [Y] = GaussianProb(XxX, V)
    Y = exp(-(XxX)/(2*V))/sqrt(2*pi*V);
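
As a reading aid, the updates performed inside the loop of EM_GMM_2 can be written out as below. This is a sketch derived from the code itself, with symbols chosen for illustration: $x_j$ is sample $j$ of $N$, $\pi$ is the mixing weight of the second component, $\gamma_j$ is the responsibility of the second component for $x_j$, and $\mathcal{N}(x \mid 0, V)$ is the zero-mean Gaussian density computed by GaussianProb.

$$
\begin{aligned}
\mathcal{N}(x \mid 0, V) &= \frac{1}{\sqrt{2\pi_{c} V}} \exp\!\left(-\frac{x^2}{2V}\right), \qquad \pi_{c} \approx 3.14159 \\
\gamma_j &= \frac{\pi\,\mathcal{N}(x_j \mid 0, V_2)}{(1-\pi)\,\mathcal{N}(x_j \mid 0, V_1) + \pi\,\mathcal{N}(x_j \mid 0, V_2)} \\
V_1 &= \frac{\sum_{j=1}^{N} (1-\gamma_j)\, x_j^2}{\sum_{j=1}^{N} (1-\gamma_j)}, \qquad
V_2 = \frac{\sum_{j=1}^{N} \gamma_j\, x_j^2}{\sum_{j=1}^{N} \gamma_j}, \qquad
\pi = \frac{1}{N}\sum_{j=1}^{N} \gamma_j
\end{aligned}
$$

Because both means are fixed at zero, the variance update reduces to a weighted average of the squared samples, which is why the code only ever needs X.*X. The circle constant is written $\pi_c$ here only to keep it distinct from the mixing weight.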