Algorithm Reinforcement: Boosting Tree Algorithms (Part 3)

Binary classification

For binary classification, the original paper uses the log loss:
$$L(y,F) = \log\left(1+\exp(-2yF)\right), \quad y \in \{-1, 1\}$$
where
$$F(x) = \frac{1}{2}\log \left[\frac{\Pr(y=1 \mid x)}{\Pr(y=-1 \mid x)} \right]$$
Following the algorithm described earlier step by step, we first compute the negative gradient:
$$\tilde{y}_{i}=-\left[\frac{\partial L\left(y_{i}, F\left(x_{i}\right)\right)}{\partial F\left(x_{i}\right)}\right]_{F(x)=F_{m-1}(x)}=\frac{2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)}$$
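The negative gradient above can be cross-checked numerically. The following is a minimal sketch, assuming plain Python with only the standard library; the function names (`log_loss`, `negative_gradient`, `numeric_negative_gradient`) are illustrative and do not come from the original paper.

```python
import math

def log_loss(y, f):
    """L(y, F) = log(1 + exp(-2 y F)) for a label y in {-1, +1}."""
    return math.log1p(math.exp(-2.0 * y * f))

def negative_gradient(y, f):
    """Closed-form pseudo-residual: 2 y / (1 + exp(2 y F))."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f))

def numeric_negative_gradient(y, f, eps=1e-6):
    """Central finite difference of -dL/dF, used only as a cross-check."""
    return -(log_loss(y, f + eps) - log_loss(y, f - eps)) / (2.0 * eps)

# Compare the two on a few arbitrary (label, score) pairs.
checks = [(y, f, negative_gradient(y, f), numeric_negative_gradient(y, f))
          for y in (-1, 1) for f in (-1.5, 0.0, 0.7)]
for y, f, analytic, numeric in checks:
    print(f"y={y:+d} F={f:+.1f} analytic={analytic:+.6f} numeric={numeric:+.6f}")
```

The two columns should agree to several decimal places, which is a quick way to catch a sign error in the derivative.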

Next, we estimate the value of each leaf node:
$$\gamma_{jm}=\operatorname{argmin}_{\gamma} \sum_{x_{i} \in R_{jm}} \log \left(1+\exp \left(-2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right)$$
The original paper approximates the solution directly with the Newton-Raphson method:
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)}$$
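As a small illustration of the leaf-value formula, here is a sketch in plain Python. The helper names and the three-sample leaf are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def pseudo_residual(y, f):
    """Negative gradient of the log loss: 2 y / (1 + exp(2 y F))."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f))

def leaf_value(labels, scores):
    """Approximate leaf value: sum(r) / sum(|r| * (2 - |r|)) over the leaf.

    labels: y_i in {-1, +1} for the samples that fall in this leaf
    scores: the current model output F_{m-1}(x_i) for the same samples
    """
    residuals = [pseudo_residual(y, f) for y, f in zip(labels, scores)]
    numerator = sum(residuals)
    denominator = sum(abs(r) * (2.0 - abs(r)) for r in residuals)
    return numerator / denominator

# Hypothetical leaf: two positives and one negative, all with score 0.
leaf_labels = [1, 1, -1]
leaf_scores = [0.0, 0.0, 0.0]
gamma = leaf_value(leaf_labels, leaf_scores)
print(gamma)
```

With every score at 0, each residual equals the label, so the numerator is 1 and the denominator is 3. Since $|\tilde{y}_i|$ lies strictly between 0 and 2, the denominator is positive for any non-empty leaf.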

How to set the initial value

In the gradient boosting tree algorithm, the initial value is the constant $F$ that minimizes the total loss:
$$F_0(x) = \operatorname{argmin}_{F} \sum_{i=1}^N L(y_i, F)$$
We differentiate the total loss with respect to $F$ and set the derivative to zero to find the extremum:
$$\begin{aligned} &\frac{\partial \sum_{i=1}^{N} L\left(y_{i}, F\right)}{\partial F}=0\\ &\sum_{i=1}^{N} \frac{\left(-2 y_{i}\right) e^{-2 y_{i} F}}{e^{-2 y_{i} F}+1}=0 \end{aligned}$$
Because this is binary classification, $y_i$ is either $1$ or $-1$. Splitting the sum by label and multiplying both sides by $-1$ gives
$$\sum_{i:y_i=1} \frac{2e^{-2F}}{e^{-2F}+1} + \sum_{i:y_i=-1} \frac{-2e^{2F}}{e^{2F}+1} = 0$$
Multiplying the numerator and denominator of the first term by $e^{2F}$ makes the denominators match:
$$\sum_{i:y_i=1} \frac{2}{e^{2F}+1} + \sum_{i:y_i=-1} \frac{-2e^{2F}}{e^{2F}+1} = 0$$
Let $m$ be the number of positive samples and $n$ the number of negative samples. Then
$$m-ne^{2F} = 0$$
$$e^{2F} = \frac{m}{n} = \frac{1+\frac{m-n}{m+n}}{1-\frac{m-n}{m+n}} = \frac{1+\bar{y}}{1-\bar{y}}$$
Here $m+n$ is the total number of samples and $m-n$ is the sum of the $y_i$, so $\bar{y} = \frac{m-n}{m+n}$ is the mean label.
This finally gives
$$F_0(x) = \frac{1}{2}\log \frac{1+\bar{y}}{1-\bar{y}}$$
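The closed form for the initial value is easy to check on a toy label set. This is a sketch with illustrative names (`initial_score`, `total_loss`), and it assumes both classes are present, since otherwise $\bar{y} = \pm 1$ and the logarithm is undefined.

```python
import math

def initial_score(labels):
    """F_0 = 0.5 * log((1 + y_bar) / (1 - y_bar)), y_bar = mean of labels.

    Assumes labels are in {-1, +1} and that both classes occur.
    """
    y_bar = sum(labels) / len(labels)
    return 0.5 * math.log((1.0 + y_bar) / (1.0 - y_bar))

def total_loss(labels, f):
    """Sum of log(1 + exp(-2 y F)) for a constant score F."""
    return sum(math.log1p(math.exp(-2.0 * y * f)) for y in labels)

labels = [1, 1, 1, -1]          # m = 3 positives, n = 1 negative
f0 = initial_score(labels)

print(f0, 0.5 * math.log(3 / 1))                 # same value: 0.5 * log(m / n)
print(total_loss(labels, f0 - 0.01),             # slightly larger
      total_loss(labels, f0),                    # the minimum
      total_loss(labels, f0 + 0.01))             # slightly larger
```

Nudging the constant in either direction increases the total loss, consistent with $F_0$ being the minimizer.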

Solving with Newton's approximation

How do we get from Equation 1 to Equation 2?
$$\gamma_{jm}=\operatorname{argmin}_{\gamma} \sum_{x_{i} \in R_{jm}} \log \left(1+\exp \left(-2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right) \tag{1}$$
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)} \tag{2}$$
First, Newton's method is an iterative solver, and the paper takes a single step of it. Define
$$g(\gamma) = \sum_{x_i \in R_{jm}} \log \left(1+\exp\left(-2y_i\left(F_{m-1}(x_i)+\gamma\right)\right)\right)$$
Then apply Newton's method, starting the iteration from $\gamma_0 = 0$:
$$\gamma_{jm}=\gamma_{0}-\frac{g^{\prime}\left(\gamma_{0}\right)}{g^{\prime \prime}\left(\gamma_{0}\right)}=-\frac{g^{\prime}\left(0\right)}{g^{\prime \prime}\left(0\right)}$$
Next, take the first and second derivatives of $g$ with respect to $\gamma$:
$$g^{\prime}(\gamma)=\sum_{x_{i} \in R_{jm}} \frac{-2 y_{i}}{1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)}$$
$$g^{\prime \prime}(\gamma)=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2} \exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)}{\left[1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right]^{2}}=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2}\left(\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)+1\right)-4 y_{i}^{2}}{\left[1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right]^{2}}$$
Recall that
$$\tilde{y}_{i}=-\left[\frac{\partial L\left(y_{i}, F\left(x_{i}\right)\right)}{\partial F\left(x_{i}\right)}\right]_{F(x)=F_{m-1}(x)}=\frac{2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)}$$
Evaluating the derivatives at the starting point $\gamma = 0$, the first derivative becomes
$$g^{\prime}(0)=\sum_{x_{i} \in R_{jm}} \frac{-2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)} = -\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}$$
and the second derivative becomes
$$g^{\prime \prime}(0)=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2} \left(\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)+1\right)-4y_i^2}{\left[1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)\right]^{2}}$$
$$=\sum_{x_{i} \in R_{jm}}\left[ \frac{2 \cdot 2y_i^2}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)} -\tilde{y}_i^{2}\right]$$
Since $y_i$ is either $+1$ or $-1$, we have $y_i^2 = |y_i| = 1$, so $\frac{2y_i^2}{1+\exp(2 y_i F_{m-1}(x_i))} = |\tilde{y}_i|$ and $\tilde{y}_i^2 = |\tilde{y}_i|^2$. Therefore
$$g^{\prime \prime}(0) = \sum_{x_{i} \in R_{jm}} |\tilde{y}_i|\left(2-|\tilde{y}_i|\right)$$
Substituting both into the Newton step $\gamma_{jm} = -g^{\prime}(0)/g^{\prime\prime}(0)$ gives
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)}$$
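The two derivative identities used in this derivation can be verified with finite differences. This is a sketch only; the labels and scores are made-up numbers for a hypothetical leaf, and the names `g` and `newton_terms` are illustrative.

```python
import math

def g(gamma, labels, scores):
    """Leaf objective: sum of log(1 + exp(-2 y (F + gamma)))."""
    return sum(math.log1p(math.exp(-2.0 * y * (f + gamma)))
               for y, f in zip(labels, scores))

def newton_terms(labels, scores):
    """Closed-form g'(0) and g''(0) expressed through the pseudo-residuals."""
    residuals = [2.0 * y / (1.0 + math.exp(2.0 * y * f))
                 for y, f in zip(labels, scores)]
    g1 = -sum(residuals)                                  # g'(0)
    g2 = sum(abs(r) * (2.0 - abs(r)) for r in residuals)  # g''(0)
    return g1, g2

# Made-up leaf contents: labels y_i and current scores F_{m-1}(x_i).
labels = [1, -1, 1, 1]
scores = [0.3, -0.2, -0.5, 1.0]
g1, g2 = newton_terms(labels, scores)

# Finite-difference estimates of the same derivatives at gamma = 0.
eps = 1e-4
g1_numeric = (g(eps, labels, scores) - g(-eps, labels, scores)) / (2.0 * eps)
g2_numeric = (g(eps, labels, scores) - 2.0 * g(0.0, labels, scores)
              + g(-eps, labels, scores)) / eps ** 2

gamma = -g1 / g2   # the single Newton step from gamma_0 = 0
print(g1, g1_numeric)
print(g2, g2_numeric)
print(g(0.0, labels, scores), g(gamma, labels, scores))
```

For this leaf the Newton step lowers the objective compared with $\gamma = 0$. A single step is an approximation, so it generally does not land exactly on the minimizer.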

Binary classification: prediction

Once we have the final $F(x)$, how do we use it to classify? Writing $p = \Pr(y=1 \mid x)$, the definition of $F$ is
$$F(x) = \frac{1}{2}\log \left(\frac{p}{1-p} \right)$$
Rearranging slightly gives
$$e^{2F(x)} = \frac{p}{1-p}$$
and solving for $p$ gives
$$P_{+}(x) = p = \frac{e^{2F(x)}}{1+e^{2F(x)}} = \frac{1}{1+e^{-2F(x)}}$$
$$P_{-}(x) = 1-p = \frac{1}{1+e^{2F(x)}}$$
These two probabilities give the binary classification.
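To make the last step concrete, here is a short sketch that turns a score $F(x)$ into a probability and a label. The 0.5 decision threshold is an assumption of this example, not something stated above, and the function names are illustrative.

```python
import math

def positive_probability(f):
    """P_+(x) = 1 / (1 + exp(-2 F(x)))."""
    return 1.0 / (1.0 + math.exp(-2.0 * f))

def predict_label(f, threshold=0.5):
    """Return +1 if P_+(x) >= threshold, else -1 (threshold is an assumption)."""
    return 1 if positive_probability(f) >= threshold else -1

for score in (-0.3, 0.0, 1.2):
    p = positive_probability(score)
    print(f"F={score:+.1f}  P+={p:.4f}  P-={1.0 - p:.4f}  label={predict_label(score):+d}")
```

With a threshold of 0.5, the predicted label is simply the sign of $F(x)$, because $P_{+}(x) \ge 0.5$ exactly when $F(x) \ge 0$.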
