MATLAB Random Number Generation and MCMC Principles

1. MATLAB built-in random number generation functions
Note: only the function names are listed; use help to see the details of each function. A short usage sketch is given after the list and the reference below.
(1) Normal random numbers: randn(), normrnd(), mvnrnd(); the last of these draws from a multivariate (joint) normal distribution.
(2) Uniform random numbers: rand()
(3) Beta random numbers: betarnd()
(4) Binomial random numbers: binornd()
(5) Chi-square random numbers: chi2rnd()
(6) Exponential random numbers: exprnd()
(7) Extreme value random numbers: evrnd()
frnd - F random numbers.
gamrnd - Gamma random numbers.
geornd - Geometric random numbers.
gevrnd - Generalized extreme value random numbers.
gprnd - Generalized Pareto random numbers.
hygernd - Hypergeometric random numbers.
iwishrnd - Inverse Wishart random matrix.
johnsrnd - Random numbers from the Johnson system of distributions.
lognrnd - Lognormal random numbers.
mhsample - Metropolis-Hastings algorithm. mhsample() draws a Markov chain from a target distribution, i.e. this is the built-in function for MCMC sampling.
mnrnd - Multinomial random vectors.
mvnrnd - Multivariate normal random vectors.
mvtrnd - Multivariate t random vectors.
nbinrnd - Negative binomial random numbers.
ncfrnd - Noncentral F random numbers.
nctrnd - Noncentral t random numbers.
ncx2rnd - Noncentral Chi-square random numbers.
normrnd - Normal (Gaussian) random numbers.
pearsrnd - Random numbers from the Pearson system of distributions.
poissrnd - Poisson random numbers.
randg - Gamma random numbers (unit scale).
random - Random numbers from specified distribution.
randsample - Random sample from finite population.
raylrnd - Rayleigh random numbers.
slicesample - Slice sampling method (the slice sampling approach used in MCMC).
trnd - T random numbers.
unidrnd - Discrete uniform random numbers.
unifrnd - Uniform random numbers.
wblrnd - Weibull random numbers.
wishrnd - Wishart random matrix.
Reference: [Statistical analysis functions in MATLAB] http://wenku.baidu.com/link?url=fxtUOBzUiRwhPl0JD1H8gt_1Gce_YqTxAYWct-G_pehbkRIZYKTVo508rCKHi1OGvqq3M6QYSyRx43hZ5QCG3zSofx80o2wxLxzcfWsJcq7
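As a quick illustration of a few of these functions, here is a minimal sketch; all parameter values are arbitrary and chosen only for demonstration:

```matlab
% Draw random numbers from several common distributions
% (parameter values below are arbitrary examples)
x1 = randn(1000, 1);                      % standard normal N(0,1)
x2 = normrnd(2, 3, 1000, 1);              % normal with mean 2, std 3
x3 = mvnrnd([0 0], [1 0.8; 0.8 1], 1000); % bivariate normal, correlation 0.8
x4 = rand(1000, 1);                       % uniform on (0,1)
x5 = betarnd(2, 5, 1000, 1);              % Beta(2,5)
x6 = binornd(10, 0.3, 1000, 1);           % Binomial(10, 0.3)
x7 = exprnd(1.5, 1000, 1);                % exponential with mean 1.5

hist(x5, 30);                             % quick look at the Beta samples
```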
2. MCMC Principles
We mainly discuss two forms of MCMC: Metropolis-Hastings and Gibbs sampling.
First, we need to understand the two ideas underlying MCMC: Monte Carlo integration and Markov chains.
I. Monte Carlo Integration

Many problems in probabilistic and statistical inference require evaluating complex integrals or sums over large outcome spaces. Consider, for example, computing the expectation of a function $g(x)$, where $x$ is a random variable with density $p(x)$. If $x$ is a continuous random variable, then

$$E[g(x)] = \int g(x)\,p(x)\,dx$$

If $x$ is a discrete random variable, then

$$E[g(x)] = \sum_{x} g(x)\,p(x)$$

The general idea of Monte Carlo integration is to use samples to approximate the expectation of a complex distribution, i.e. we approximate the expectation by a sample average over draws from that distribution.

Let $x^{(t)}, t=1,2,\ldots,n$ be independent samples drawn from the distribution $p(x)$. We can then approximate the integral above by a finite sum:

$$E[g(x)] \approx \frac{1}{n}\sum_{t=1}^{n} g\bigl(x^{(t)}\bigr)$$

Generally, the approximation becomes more accurate as the sample size $n$ increases. Crucially, the accuracy of the approximation also depends on the correlation between samples: when the samples are correlated, the effective sample size decreases. This is not an issue with the rejection sampler but a potential problem with MCMC approaches. (My reading: for the Metropolis-Hastings algorithm, correlation is less of a concern because a rejection step is employed, whereas for Gibbs sampling one has to watch out for correlation, since every draw in Gibbs sampling is kept.)
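As a concrete MATLAB sketch of this idea, estimate $E[g(x)]$ with $g(x)=x^2$ and $x \sim N(0,1)$, whose exact value is 1, by averaging over independent draws (the choice of $g$ and the sample size are arbitrary illustration values):

```matlab
% Monte Carlo estimate of E[g(x)] with g(x) = x.^2 and x ~ N(0,1).
% The exact answer is Var(x) = 1, so we can check the approximation.
n = 1e5;                 % number of independent samples
x = randn(n, 1);         % draws from p(x) = N(0,1)
g = x.^2;                % evaluate g at each sample
estimate = mean(g);      % (1/n) * sum of g(x_t)
fprintf('MC estimate = %.4f (exact value = 1)\n', estimate);
```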

II. Markov Chains
A Markov chain is a stochastic process in which we transition from one state to another using a simple sequential procedure. Let the initial state be $x^{(1)}$ and the transition function be $p(x^{(t)} \mid x^{(t-1)})$, which determines the next state $x^{(2)}$ conditional on the previous state. We then keep iterating to create a sequence of states:

$$x^{(1)} \rightarrow x^{(2)} \rightarrow \cdots \rightarrow x^{(t)} \rightarrow \cdots$$

The steps for generating a Markov chain of $T$ states are as follows (a MATLAB sketch is given after the list):
1. Set $t=1$.
2. Generate an initial value $u$, and set $x^{(t)}=u$.
3. Repeat
   $t=t+1$
   Sample a new value $u$ from the transition function $p(x^{(t)} \mid x^{(t-1)})$
   Set $x^{(t)}=u$
4. Until $t=T$.
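A minimal MATLAB sketch of this procedure, assuming purely for illustration a Gaussian random-walk transition $p(x^{(t)} \mid x^{(t-1)}) = N(x^{(t-1)}, 1)$:

```matlab
% Generate a Markov chain of T states with a Gaussian random-walk transition.
T = 500;
x = zeros(T, 1);
t = 1;
x(t) = 0;                        % step 2: initial value
while t < T                      % steps 3-4
    t = t + 1;
    u = normrnd(x(t-1), 1);      % sample from p(x_t | x_{t-1}) = N(x_{t-1}, 1)
    x(t) = u;
end
plot(x);                         % trace plot of the chain
```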

We now focus on MCMC itself and discuss three methods: Metropolis, Metropolis-Hastings, and Gibbs sampling.
The two key distributions in MCMC are the target distribution and the proposal distribution; the goal of MCMC is to draw samples from the target distribution.

The Metropolis algorithm
  The Metropolis algorithm is the simplest of all MCMC methods. It is a special case of Metropolis-Hastings in which the proposal distribution must be symmetric, i.e. $q(\theta^{(t)} \mid \theta^{(t-1)}) = q(\theta^{(t-1)} \mid \theta^{(t)})$.
  Algorithm steps:
  1. Set $t=1$.
  2. Generate an initial value $u$, and set $\theta^{(t)}=u$.
  3. Repeat
     $t=t+1$
     Generate a proposal $\theta^{*}$ from $q(\theta^{*} \mid \theta^{(t-1)})$
     Evaluate the acceptance probability $\alpha=\min\left(1,\ \dfrac{p(\theta^{*})}{p(\theta^{(t-1)})}\right)$
     Generate $u$ from a Uniform(0,1) distribution
     If $u \le \alpha$, accept the proposal and set $\theta^{(t)}=\theta^{*}$; else set $\theta^{(t)}=\theta^{(t-1)}$.
  4. Until $t=T$.
Note that the proposal distribution here is actually a conditional distribution. As the acceptance ratio shows, the target distribution may be unnormalized. A minimal MATLAB sketch of the algorithm follows.
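The sketch below assumes, purely for illustration, an unnormalized target $p(\theta) \propto \exp(-\theta^2/2)$ (a standard normal) and a symmetric normal random-walk proposal with an arbitrary step size of 0.5:

```matlab
% Metropolis sampler with a symmetric (random-walk) proposal.
T = 5000;
p = @(theta) exp(-theta.^2 / 2);          % unnormalized target density
theta = zeros(T, 1);
theta(1) = 0;                             % initial value
for t = 2:T
    prop  = normrnd(theta(t-1), 0.5);     % symmetric proposal q
    alpha = min(1, p(prop) / p(theta(t-1)));   % acceptance probability
    if rand < alpha
        theta(t) = prop;                  % accept the proposal
    else
        theta(t) = theta(t-1);            % reject: keep the previous value
    end
end
hist(theta(1000:end), 50);                % discard burn-in, inspect samples
```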

The Metropolis-Hastings algorithm
The Metropolis-Hastings (MH) algorithm is a generalized version of the Metropolis algorithm. The steps are identical, but the acceptance probability becomes

$$\alpha = \min\left(1,\ \frac{p(\theta^{*})}{p(\theta^{(t-1)})}\,\frac{q(\theta^{(t-1)} \mid \theta^{*})}{q(\theta^{*} \mid \theta^{(t-1)})}\right)$$

  Choosing the proposal distribution
  As these formulas show, the proposal distribution plays a very important role in both the Metropolis and the MH algorithm. In principle it can be chosen freely; two simple choices are common. One is the random-walk chain: the new value $y$ is the current value $x$ plus a random increment $z$, i.e. $y = x + z$, so that $q(y \mid x) = q(y - x)$, where $q$ is an arbitrary probability density. The other is the independence chain: the new value $y$ is independent of the current value $x$, i.e. $q(y \mid x) = q(y)$, where $q$ is again an arbitrary probability density. For bounded random variables, care must be taken to construct an appropriate proposal distribution. Generally, a good rule is to use a proposal distribution that has positive density on the same support as the target distribution. For example, if the target distribution has support over $0 \le \theta < \infty$, the proposal distribution should have the same support.
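The built-in mhsample() listed in Section 1 implements this algorithm. A minimal usage sketch, assuming a standard normal target and a symmetric uniform random-walk proposal purely for illustration; with the 'symmetric' option set, the proposal-density ratio is omitted (the Metropolis special case), while an asymmetric proposal would additionally supply a 'proppdf' handle:

```matlab
% Metropolis-Hastings via the Statistics Toolbox function mhsample().
% Target: standard normal; proposal: symmetric uniform random walk.
delta    = 0.5;                                  % random-walk step size (arbitrary)
pdf      = @(x) normpdf(x);                      % target density
proprnd  = @(x) x + (2*rand - 1)*delta;          % draw a candidate given current x
nsamples = 10000;
smpl = mhsample(0, nsamples, 'pdf', pdf, ...
                'proprnd', proprnd, 'symmetric', 1, 'burnin', 1000);
hist(smpl, 50);                                  % inspect the sampled chain
```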

MH for multivariate sampling
  There are two strategies: blockwise updating and componentwise updating. This article focuses on the latter, because with blockwise updating it is difficult to find a suitable high-dimensional proposal distribution, and the rejection rate tends to be high.
  The steps of a two-dimensional componentwise MH sampler are as follows (a MATLAB sketch is given after the list):
  1. Set $t=1$.
  2. Generate an initial value $u=(u_1,u_2)$, and set $\theta^{(t)}=u$.
  3. Repeat
     $t=t+1$
     Generate a proposal $\theta_1^{*}$ from $q(\theta_1 \mid \theta_1^{(t-1)})$
     Evaluate the acceptance probability $\alpha=\min\left(1,\ \dfrac{p(\theta_1^{*},\,\theta_2^{(t-1)})}{p(\theta_1^{(t-1)},\,\theta_2^{(t-1)})}\,\dfrac{q(\theta_1^{(t-1)} \mid \theta_1^{*})}{q(\theta_1^{*} \mid \theta_1^{(t-1)})}\right)$
     Generate $u$ from a Uniform(0,1) distribution
     If $u \le \alpha$, accept the proposal and set $\theta_1^{(t)}=\theta_1^{*}$; else set $\theta_1^{(t)}=\theta_1^{(t-1)}$.
     Generate a proposal $\theta_2^{*}$ from $q(\theta_2 \mid \theta_2^{(t-1)})$
     Evaluate the acceptance probability $\alpha=\min\left(1,\ \dfrac{p(\theta_1^{(t)},\,\theta_2^{*})}{p(\theta_1^{(t)},\,\theta_2^{(t-1)})}\,\dfrac{q(\theta_2^{(t-1)} \mid \theta_2^{*})}{q(\theta_2^{*} \mid \theta_2^{(t-1)})}\right)$
     Generate $u$ from a Uniform(0,1) distribution
     If $u \le \alpha$, accept the proposal and set $\theta_2^{(t)}=\theta_2^{*}$; else set $\theta_2^{(t)}=\theta_2^{(t-1)}$.
  4. Until $t=T$.
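A minimal MATLAB sketch of this componentwise sampler, assuming for illustration an unnormalized bivariate normal target with correlation 0.7 and symmetric normal random-walk proposals for each component (so the $q$ ratios cancel); all parameter values are arbitrary:

```matlab
% Componentwise Metropolis-Hastings for a bivariate target.
rho = 0.7;                                           % target correlation (arbitrary)
p   = @(t1, t2) exp(-(t1.^2 - 2*rho*t1.*t2 + t2.^2) / (2*(1 - rho^2)));  % unnormalized
T   = 5000;
sd  = 1;                                             % proposal std (arbitrary)
theta = zeros(T, 2);
theta(1, :) = [0 0];                                 % initial value
for t = 2:T
    % update component 1, holding component 2 at its previous value
    prop1 = normrnd(theta(t-1, 1), sd);
    alpha = min(1, p(prop1, theta(t-1, 2)) / p(theta(t-1, 1), theta(t-1, 2)));
    if rand < alpha, theta(t, 1) = prop1; else, theta(t, 1) = theta(t-1, 1); end
    % update component 2, using the freshly updated component 1
    prop2 = normrnd(theta(t-1, 2), sd);
    alpha = min(1, p(theta(t, 1), prop2) / p(theta(t, 1), theta(t-1, 2)));
    if rand < alpha, theta(t, 2) = prop2; else, theta(t, 2) = theta(t-1, 2); end
end
plot(theta(:, 1), theta(:, 2), '.');                 % scatter of the sampled pairs
```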
  
Gibbs sampling
 In Gibbs sampling there is no rejection step, which improves computational efficiency. Another advantage is that there is no need to find a suitable proposal distribution. However, we do need to know the conditional distributions of the multivariate distribution: the Gibbs sampler can only be applied in situations where we know the full conditional distribution of each component of the multivariate distribution, conditioned on all other components.
 The Gibbs sampling steps for the bivariate case are as follows (a MATLAB sketch is given after the list):
 1. Set $t=1$.
 2. Generate an initial value $u=(u_1,u_2)$, and set $\theta^{(t)}=u$.
 3. Repeat
    $t=t+1$
    Sample $\theta_1^{(t)}$ from the conditional distribution $f(\theta_1 \mid \theta_2 = \theta_2^{(t-1)})$
    Sample $\theta_2^{(t)}$ from the conditional distribution $f(\theta_2 \mid \theta_1 = \theta_1^{(t)})$
 4. Until $t=T$.
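A minimal MATLAB sketch of the bivariate case, assuming for illustration a bivariate standard normal target with correlation $\rho$, whose full conditionals are known in closed form: $\theta_1 \mid \theta_2 \sim N(\rho\,\theta_2,\ 1-\rho^2)$, and symmetrically for $\theta_2$ (the value of $\rho$ is arbitrary):

```matlab
% Gibbs sampler for a bivariate standard normal with correlation rho.
rho = 0.8;                                    % arbitrary illustration value
T   = 5000;
theta = zeros(T, 2);
theta(1, :) = [0 0];                          % initial value
condStd = sqrt(1 - rho^2);                    % std of each full conditional
for t = 2:T
    % sample theta_1 | theta_2 = theta_2^(t-1)
    theta(t, 1) = normrnd(rho * theta(t-1, 2), condStd);
    % sample theta_2 | theta_1 = theta_1^(t)
    theta(t, 2) = normrnd(rho * theta(t, 1), condStd);
end
plot(theta(:, 1), theta(:, 2), '.');          % samples trace out the joint distribution
```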
 
Reference: [Computational Statistics with MATLAB] http://psiexp.ss.uci.edu/research/teachingP205C/205C.pdf
