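EM algorithm: a method for maximum likelihood estimation of the parameters of a probabilistic model that contains latent (hidden) variables.
Each iteration consists of two steps: the E-step, which computes the expectation of the complete-data log-likelihood under the current parameters, and the M-step, which maximizes that expectation.
EM is sensitive to the choice of initial values: it converges only to a local maximum, so different initial values can give different parameter estimates.
In the Gaussian mixture case, Yjk denotes the probability that sample j belongs to component k, and alpha(k) denotes the weight (proportion) of component k in the overall mixture.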
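For reference, the standard EM updates for a K-component Gaussian mixture can be written with these two symbols as sketched below. The notation is an assumption made for illustration: Yjk is written as \hat{\gamma}_{jk}, alpha(k) as \alpha_k, y_j is sample j, N is the number of samples, and \phi is the Gaussian density with mean \mu_k and variance \sigma_k^2.

```latex
\begin{align}
% E-step: probability that sample j belongs to component k
\hat{\gamma}_{jk} &= \frac{\alpha_k \, \phi(y_j \mid \mu_k, \sigma_k^2)}
                         {\sum_{i=1}^{K} \alpha_i \, \phi(y_j \mid \mu_i, \sigma_i^2)} \\
% M-step: responsibility-weighted mean, variance, and mixing weight
\hat{\mu}_k &= \frac{\sum_{j=1}^{N} \hat{\gamma}_{jk} \, y_j}{\sum_{j=1}^{N} \hat{\gamma}_{jk}} \\
\hat{\sigma}_k^2 &= \frac{\sum_{j=1}^{N} \hat{\gamma}_{jk} \, (y_j - \hat{\mu}_k)^2}{\sum_{j=1}^{N} \hat{\gamma}_{jk}} \\
\hat{\alpha}_k &= \frac{1}{N} \sum_{j=1}^{N} \hat{\gamma}_{jk}
\end{align}
```

When the means are fixed at zero, the variance update reduces to a responsibility-weighted average of y_j^2, and the mean update is skipped.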
The following MATLAB code, found online, estimates the parameters of a Gaussian mixture model made up of two Gaussian components:
% EM_GMM_2 -- Run EM for a GMM with two Gaussian components whose means
% are fixed at zero. The goal is to estimate the variances V1 and V2 of the
% two Gaussians from the samples in X.
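function [V1, V2] = EM_GMM_2(X)
%% Initialize parameters
X = X(:);               % force a column vector so the inner products below are scalars
N = numel(X);
XxX = X.*X;             % squared samples; constant across iterations
V1 = 0.5;               % initial variance of component 1
V2 = 0.0001;            % initial variance of component 2
w = 0.5;                % mixing weight of component 2 (named w to avoid shadowing the built-in pi)
w_prev = 2*w;           % differs from w so the loop runs at least once
maxiter = 100;
iter = 0;
while (abs(w_prev-w)/w_prev > 1e-3 && iter < maxiter)
    iter = iter+1;
    w_prev = w;
    %% Expectation Step: probability that each sample belongs to component 2
    w_x_GPV2 = w*GaussianProb(XxX,V2);
    resp = w_x_GPV2 ./ ((1-w)*GaussianProb(XxX,V1) + w_x_GPV2);
    %% Maximization Step: weighted variances and updated mixing weight
    V1 = (1-resp)'*XxX / sum(1-resp);
    V2 = resp'*XxX / sum(resp);
    w = sum(resp)/N;
end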
%% Zero-mean Gaussian probability density; XxX holds the squared samples, V is the variance
function [Y] = GaussianProb(XxX, V)
Y = exp(-(XxX)/(2*V))/sqrt(2*pi*V);
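A minimal usage sketch follows, assuming the code above is saved as EM_GMM_2.m on the MATLAB path. The sample sizes, the true variances (1.0 and 0.01), and the seed are illustrative choices, not values from the original post. It draws zero-mean samples from two Gaussians, stacks them into one column vector, and runs the estimator.

```matlab
% Hypothetical test script; assumes EM_GMM_2.m is on the MATLAB path.
rng(0);                               % fix the seed so the run is repeatable
n1 = 5000;                            % samples from the wide component (illustrative)
n2 = 5000;                            % samples from the narrow component (illustrative)
X = [sqrt(1.0)*randn(n1,1); ...       % zero-mean, variance 1.0
     sqrt(0.01)*randn(n2,1)];         % zero-mean, variance 0.01
[V1, V2] = EM_GMM_2(X);               % X is a column vector of samples
fprintf('V1 = %.4f, V2 = %.4f\n', V1, V2);
```

Because EM depends on its starting point, the estimates depend on the initial values hard-coded in EM_GMM_2 (V1 = 0.5, V2 = 0.0001); comparing the printed values with the variances used to generate X is a quick sanity check.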