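Improved Iterative Scaling (IIS) is a widely used optimization algorithm. Both the maximum entropy model and the conditional random field (CRF) use IIS to estimate their parameters, which makes training more efficient.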
The model is given by:
$$P_\lambda(y \mid x) = \frac{1}{Z_\lambda(x)} \exp\left(\sum_{i=1}^{n} \lambda_i f_i(x,y)\right)$$
where $f_i(x,y)$ is a binary feature function, $\lambda$ is the parameter vector, and $Z_\lambda(x)$ is the normalization factor:
$$Z_\lambda(x) = \sum_{y} \exp\left(\sum_{i=1}^{n} \lambda_i f_i(x,y)\right)$$
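As a minimal sketch of the two formulas above, the snippet below evaluates $Z_\lambda(x)$ and $P_\lambda(y \mid x)$ on a toy problem. The array layout `feats[x, y, i]` for $f_i(x,y)$, the function names, and all numbers are assumptions made for illustration only.

```python
import numpy as np

def partition(lam, feats):
    """Z_lambda(x) for every context x; feats[x, y, i] holds f_i(x, y)."""
    scores = feats @ lam                 # sum_i lambda_i * f_i(x, y), shape (X, Y)
    return np.exp(scores).sum(axis=1)    # sum over labels y, shape (X,)

def conditional_probs(lam, feats):
    """P_lambda(y | x) as an array of shape (X, Y)."""
    scores = feats @ lam
    return np.exp(scores) / partition(lam, feats)[:, None]

# Hypothetical toy data: 2 contexts, 3 labels, 2 binary features.
feats = np.array([
    [[1, 0], [0, 1], [1, 1]],
    [[0, 0], [1, 0], [0, 1]],
], dtype=float)
lam = np.array([0.5, -0.3])

Z = partition(lam, feats)
P = conditional_probs(lam, feats)
print(Z)
print(P.sum(axis=1))   # each row of P sums to 1 by construction
```

For large scores, subtracting the row-wise maximum before calling `np.exp` would avoid overflow; it is left out here so the code mirrors the formulas directly.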
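From $P_\lambda(y \mid x)$, the log-likelihood with respect to the empirical distribution $\tilde{P}(x,y)$ is:
$$L(\lambda) = \sum_{x,y} \tilde{P}(x,y) \log P_\lambda(y \mid x)$$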
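where $\tilde{P}(x,y)$ is the empirical frequency of the sample $(x,y)$. When the parameters change from $\lambda$ to $\lambda+\delta$, the log-likelihood changes by: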
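$$
\begin{aligned}
L(\lambda+\delta) - L(\lambda)
&= \sum_{x,y} \tilde{P}(x,y) \log P_{\lambda+\delta}(y \mid x) - \sum_{x,y} \tilde{P}(x,y) \log P_\lambda(y \mid x) \\
&= \sum_{x,y} \tilde{P}(x,y) \sum_{i} \delta_i f_i(x,y) - \sum_{x} \tilde{P}(x) \log \frac{Z_{\lambda+\delta}(x)}{Z_\lambda(x)}
\end{aligned}
$$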
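The second equality skips a step. A sketch of the missing algebra follows, using only the definitions of $P_\lambda$ and $Z_\lambda$ together with the marginal $\tilde{P}(x) = \sum_y \tilde{P}(x,y)$ (assumed here, since the text does not define $\tilde{P}(x)$ explicitly):

$$
\begin{aligned}
\log P_{\lambda+\delta}(y \mid x) - \log P_\lambda(y \mid x)
&= \sum_{i} (\lambda_i + \delta_i) f_i(x,y) - \log Z_{\lambda+\delta}(x) - \sum_{i} \lambda_i f_i(x,y) + \log Z_\lambda(x) \\
&= \sum_{i} \delta_i f_i(x,y) - \log \frac{Z_{\lambda+\delta}(x)}{Z_\lambda(x)}
\end{aligned}
$$

Because the log-ratio does not depend on $y$, summing it against $\tilde{P}(x,y)$ collapses the sum over $y$:

$$\sum_{x,y} \tilde{P}(x,y) \log \frac{Z_{\lambda+\delta}(x)}{Z_\lambda(x)} = \sum_{x} \tilde{P}(x) \log \frac{Z_{\lambda+\delta}(x)}{Z_\lambda(x)}$$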
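Using the inequality $-\log\alpha \ge 1 - \alpha$, which holds for all $\alpha > 0$ and can be proved by differentiation, we obtain a lower bound on the change in the log-likelihood: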
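$$
\begin{aligned}
L(\lambda+\delta) - L(\lambda)
&\ge \sum_{x,y} \tilde{P}(x,y) \sum_{i} \delta_i f_i(x,y) + 1 - \sum_{x} \tilde{P}(x) \frac{Z_{\lambda+\delta}(x)}{Z_\lambda(x)} \\
&= \sum_{x,y} \tilde{P}(x,y) \sum_{i} \delta_i f_i(x,y) + 1 - \sum_{x} \tilde{P}(x) \sum_{y} P_\lambda(y \mid x) \exp\left(\sum_{i} \delta_i f_i(x,y)\right)
\end{aligned}
$$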
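Two steps in this bound are stated without proof. The following is a sketch of both, where the helper function $g$ is introduced only for this argument. For the inequality, let $g(\alpha) = \alpha - 1 - \log\alpha$ on $\alpha > 0$:

$$g'(\alpha) = 1 - \frac{1}{\alpha}, \qquad g''(\alpha) = \frac{1}{\alpha^2} > 0$$

so $g$ is convex with its unique minimum at $\alpha = 1$, where $g(1) = 0$. Hence $g(\alpha) \ge 0$, that is, $-\log\alpha \ge 1 - \alpha$. Applying this with $\alpha = Z_{\lambda+\delta}(x)/Z_\lambda(x)$ and using $\sum_x \tilde{P}(x) = 1$ gives the first line of the bound. For the second line, the ratio of normalization factors expands as:

$$
\frac{Z_{\lambda+\delta}(x)}{Z_\lambda(x)}
= \sum_{y} \frac{\exp\left(\sum_i \lambda_i f_i(x,y)\right)}{Z_\lambda(x)} \exp\left(\sum_i \delta_i f_i(x,y)\right)
= \sum_{y} P_\lambda(y \mid x) \exp\left(\sum_i \delta_i f_i(x,y)\right)
$$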
Introduce $f^\#(x,y)$, the total count of active features for $(x,y)$:
$$f^\#(x,y) = \sum_{i} f_i(x,y)$$
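Denote the lower bound obtained above by $A(\delta \mid \lambda)$, so that $L(\lambda+\delta) - L(\lambda) \ge A(\delta \mid \lambda)$. Multiplying and dividing the exponent by $f^\#(x,y)$ gives: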
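$$A(\delta \mid \lambda) = \sum_{x,y} \tilde{P}(x,y) \sum_{i} \delta_i f_i(x,y) + 1 - \sum_{x} \tilde{P}(x) \sum_{y} P_\lambda(y \mid x) \exp\left(f^\#(x,y) \sum_{i} \frac{\delta_i f_i(x,y)}{f^\#(x,y)}\right)$$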
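Apply Jensen's inequality, $\exp\left(\sum_x p(x) q(x)\right) \le \sum_x p(x) \exp q(x)$, which holds for any probability distribution $p$. Here the weights $f_i(x,y)/f^\#(x,y)$ are nonnegative and sum to 1 over $i$, so: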
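$$A(\delta \mid \lambda) \ge \sum_{x,y} \tilde{P}(x,y) \sum_{i} \delta_i f_i(x,y) + 1 - \sum_{x} \tilde{P}(x) \sum_{y} P_\lambda(y \mid x) \sum_{i} \frac{f_i(x,y)}{f^\#(x,y)} \exp\left(\delta_i f^\#(x,y)\right)$$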
Denote the right-hand side of this inequality by:
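$$B(\delta \mid \lambda) = \sum_{x,y} \tilde{P}(x,y) \sum_{i} \delta_i f_i(x,y) + 1 - \sum_{x} \tilde{P}(x) \sum_{y} P_\lambda(y \mid x) \sum_{i} \frac{f_i(x,y)}{f^\#(x,y)} \exp\left(\delta_i f^\#(x,y)\right)$$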
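The chain of bounds $L(\lambda+\delta) - L(\lambda) \ge A(\delta \mid \lambda) \ge B(\delta \mid \lambda)$ can be checked numerically. The sketch below does this on a hypothetical toy problem. The array layout `feats[x, y, i]`, the function names, and all numbers are assumptions, and every pair $(x,y)$ is given at least one active feature so that $f^\#(x,y) > 0$ and the division is well defined.

```python
import numpy as np

def cond_probs(lam, feats):
    """P_lambda(y | x), shape (X, Y); feats[x, y, i] holds f_i(x, y)."""
    s = np.exp(feats @ lam)
    return s / s.sum(axis=1, keepdims=True)

def log_lik(lam, feats, p_xy):
    """L(lambda) = sum_{x,y} P~(x,y) log P_lambda(y | x)."""
    return np.sum(p_xy * np.log(cond_probs(lam, feats)))

def bound_A(delta, lam, feats, p_xy):
    p_x = p_xy.sum(axis=1)                        # marginal P~(x)
    linear = np.sum(p_xy * (feats @ delta))       # sum P~(x,y) sum_i delta_i f_i
    expect = np.sum(cond_probs(lam, feats) * np.exp(feats @ delta), axis=1)
    return linear + 1.0 - np.sum(p_x * expect)

def bound_B(delta, lam, feats, p_xy):
    p_x = p_xy.sum(axis=1)
    f_sharp = feats.sum(axis=2)[..., None]        # f#(x,y), shape (X, Y, 1)
    linear = np.sum(p_xy * (feats @ delta))
    # sum_i (f_i / f#) * exp(delta_i * f#), shape (X, Y)
    inner = np.sum(feats / f_sharp * np.exp(delta * f_sharp), axis=2)
    expect = np.sum(cond_probs(lam, feats) * inner, axis=1)
    return linear + 1.0 - np.sum(p_x * expect)

# Hypothetical toy data: 2 contexts, 3 labels, 2 binary features, f# >= 1.
feats = np.array([
    [[1, 0], [0, 1], [1, 1]],
    [[0, 1], [1, 0], [1, 1]],
], dtype=float)
p_xy = np.array([[0.2, 0.1, 0.1],
                 [0.1, 0.3, 0.2]])                # empirical P~(x,y), sums to 1
lam = np.array([0.5, -0.3])
delta = np.array([0.2, 0.4])

gain = log_lik(lam + delta, feats, p_xy) - log_lik(lam, feats, p_xy)
A = bound_A(delta, lam, feats, p_xy)
B = bound_B(delta, lam, feats, p_xy)
print(gain, A, B)    # expected ordering: gain >= A >= B
```

Both bounds equal 0 at $\delta = 0$, so any $\delta$ with $B(\delta \mid \lambda) > 0$ is guaranteed to increase the log-likelihood.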
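Taking the partial derivative with respect to $\delta_i$ (only the $i$-th term of each inner sum depends on $\delta_i$) gives:
$$\frac{\partial B(\delta \mid \lambda)}{\partial \delta_i} = \sum_{x,y} \tilde{P}(x,y) f_i(x,y) - \sum_{x} \tilde{P}(x) \sum_{y} P_\lambda(y \mid x) f_i(x,y) \exp\left(\delta_i f^\#(x,y)\right)$$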
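Setting $\frac{\partial B(\delta \mid \lambda)}{\partial \delta_i} = 0$ yields an equation that can be solved for each $\delta_i$. Update $\lambda \leftarrow \lambda + \delta$ and repeat until $\lambda$ converges.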
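The full procedure can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: the array layout `feats[x, y, i]`, the function names, the toy data, the use of Newton's method to solve for each $\delta_i$, and the stopping tolerances are all choices made here. It also assumes every feature occurs in the data with positive empirical expectation, otherwise the equation for $\delta_i$ has no finite solution.

```python
import numpy as np

def cond_probs(lam, feats):
    """P_lambda(y | x), shape (X, Y); feats[x, y, i] holds f_i(x, y)."""
    s = np.exp(feats @ lam)
    return s / s.sum(axis=1, keepdims=True)

def iis(feats, p_xy, n_iter=500, newton_steps=50, tol=1e-10):
    """Improved Iterative Scaling with a 1-D Newton solve per feature."""
    n = feats.shape[2]
    p_x = p_xy.sum(axis=1)                          # marginal P~(x)
    f_sharp = feats.sum(axis=2)                     # f#(x, y)
    emp = np.einsum("xy,xyi->i", p_xy, feats)       # E_{P~}[f_i]
    lam = np.zeros(n)
    for _ in range(n_iter):
        w = p_x[:, None] * cond_probs(lam, feats)   # P~(x) * P_lambda(y | x)
        delta = np.zeros(n)
        for i in range(n):
            d = 0.0
            for _step in range(newton_steps):
                e = w * feats[:, :, i] * np.exp(d * f_sharp)
                g = e.sum() - emp[i]                # minus dB/d(delta_i)
                dg = (e * f_sharp).sum()            # derivative of g w.r.t. delta_i
                step = g / dg
                d -= step
                if abs(step) < tol:
                    break
            delta[i] = d
        lam += delta                                # lambda <- lambda + delta
        if np.max(np.abs(delta)) < tol:
            break
    return lam

# Hypothetical toy data: 2 contexts, 3 labels, 2 binary features.
feats = np.array([
    [[1, 0], [0, 1], [1, 1]],
    [[0, 1], [1, 0], [1, 1]],
], dtype=float)
p_xy = np.array([[0.2, 0.1, 0.1],
                 [0.1, 0.3, 0.2]])

lam_hat = iis(feats, p_xy)
w_hat = p_xy.sum(axis=1)[:, None] * cond_probs(lam_hat, feats)
model_exp = np.einsum("xy,xyi->i", w_hat, feats)    # expectation of f_i under the model
emp_exp = np.einsum("xy,xyi->i", p_xy, feats)       # expectation of f_i under P~
print(lam_hat, model_exp, emp_exp)
```

At convergence $\delta = 0$ solves the equation, which means the model expectation of each feature matches its empirical expectation. When $f^\#(x,y)$ is a constant $M$ for all $(x,y)$, no Newton iteration is needed, because the equation has the closed-form solution $\delta_i = \frac{1}{M} \log \frac{E_{\tilde{P}}(f_i)}{E_{P_\lambda}(f_i)}$.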