Support vector machines for classification: hard-margin and soft-margin SVMs, which try to classify as many training points as possible correctly.
Support vector regression: we want \(f(x)\) to be as close to \(y\) as possible.
Basic idea of support vector regression
English name: support vector regression
Abbreviation: SVR
The standard linear support vector regression model
The model to be learned:
\[f(x)=w^Tx+b\]
Suppose we can tolerate an absolute deviation of up to \(\varepsilon\) between \(f(x)\) and \(y\). This forms a band of width \(2\varepsilon\) around \(f(x)=w^Tx+b\), giving the model
\[
\min \frac{1}{2}w^Tw\\
\text{s.t. } -\varepsilon \le f(x_i)-y_i \le \varepsilon
\]
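As a quick illustration, a point incurs no loss as long as it lies inside the \(2\varepsilon\) band. A minimal Python sketch (the weights, bias, tolerance, and sample points below are made up for illustration):

```python
# Sketch: the epsilon-insensitive band around f(x) = w*x + b (1-D case).
# All numbers (w, b, eps, the sample points) are illustrative assumptions.

def in_band(x, y, w, b, eps):
    """True if the point (x, y) lies inside the 2*eps band around f(x) = w*x + b."""
    return abs((w * x + b) - y) <= eps

w, b, eps = 2.0, 1.0, 0.5
print(in_band(1.0, 3.2, w, b, eps))   # |3.0 - 3.2| = 0.2 <= 0.5 -> True
print(in_band(1.0, 4.0, w, b, eps))   # |3.0 - 4.0| = 1.0 >  0.5 -> False
```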
But this constraint is too strict and may be infeasible, so we introduce slack variables \(\xi_i,\xi_i'\) and a penalty term:
\[
\min \frac{1}{2}w^Tw+C\sum_{i=1}^{N}(\xi_i+\xi_i')\\
\text{s.t. } \begin{cases}y_i-f(x_i)\le\varepsilon+\xi_i\\
f(x_i)-y_i\le\varepsilon+\xi_i'\\
\xi_i\ge 0,\ \xi_i'\ge 0
\end{cases}
\]
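Minimizing over the slacks gives the equivalent unconstrained form \(\frac{1}{2}w^Tw + C\sum_i\max\bigl(0,\,|f(x_i)-y_i|-\varepsilon\bigr)\), which can be attacked directly by subgradient descent. A one-dimensional sketch, with made-up data and hyperparameters:

```python
# Subgradient-descent sketch for the 1-D soft primal
#   min (1/2) w^2 + C * sum_i max(0, |w*x_i + b - y_i| - eps).
# Data, C, eps, learning-rate schedule are all illustrative assumptions.

def fit_svr_1d(xs, ys, C=10.0, eps=0.1, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    for t in range(steps):
        step = lr / (1.0 + 0.01 * t)   # diminishing step size for convergence
        gw, gb = w, 0.0                # gradient of the (1/2) w^2 term
        for x, y in zip(xs, ys):
            r = w * x + b - y          # signed residual
            if abs(r) > eps:           # outside the eps-band: hinge is active
                s = 1.0 if r > 0 else -1.0
                gw += C * s * x
                gb += C * s
        w -= step * gw
        b -= step * gb
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]              # roughly y = 2x + 1
w, b = fit_svr_1d(xs, ys)
print(w, b)
```

With these settings the fit should land near the generating line \(y=2x+1\); the exact values depend on the step schedule, since the \(\frac{1}{2}w^2\) term shrinks \(w\) slightly within the \(\varepsilon\)-band.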
Construct the Lagrangian
\[
\begin{aligned} L :=\frac{1}{2}\|w\|^{2} &+C \sum_{i=1}^{N}\left(\xi_{i}+\xi_{i}^{\prime}\right)-\sum_{i=1}^{N}\left(\eta_{i} \xi_{i}+\eta_{i}^{\prime} \xi_{i}^{\prime}\right) \\ &+\sum_{i=1}^{N} \alpha_{i}\left(y_{i}-w^{T} x_{i}-b-\varepsilon-\xi_{i}\right) \\ &+\sum_{i=1}^{N} \alpha_{i}^{\prime}\left(w^{T} x_{i}+b-y_{i}-\varepsilon-\xi_{i}^{\prime}\right) \end{aligned}\tag{1}
\]
Taking partial derivatives:
\[
\frac{\partial L}{\partial w}=w-\sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right) x_{i}=0 \Rightarrow w=\sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right) x_{i}\tag{2}
\]
\[ \frac{\partial L}{\partial b}=\sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right)=0 \tag{3} \]
\[ \frac{\partial L}{\partial \xi_{i}^{\prime}}=C-\alpha_{i}^{'}-\eta_{i}^{\prime}=0 \tag{4} \]
\[ \frac{\partial L}{\partial \xi_{i}}=C-\alpha_{i}-\eta_{i}=0 \tag{5} \]
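Since the multipliers \(\eta_i,\eta_i^{\prime}\) for the constraints \(\xi_i\ge 0,\ \xi_i^{\prime}\ge 0\) are nonnegative, equations (4) and (5) also give box constraints on \(\alpha_i,\alpha_i^{\prime}\):
\[
0\le\alpha_i\le C,\qquad 0\le\alpha_i^{\prime}\le C
\]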
Substituting (2)–(5) back into (1) yields the dual problem
\[
\begin{aligned} \min_{\boldsymbol{\alpha},\boldsymbol{\alpha}^{\prime}} L=& \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right)\left(\alpha_{j}-\alpha_{j}^{\prime}\right)\left\langle x_{i}, x_{j}\right\rangle \\ &+\varepsilon \sum_{i=1}^{N}\left(\alpha_{i}+\alpha_{i}^{\prime}\right)-\sum_{i=1}^{N} y_{i}\left(\alpha_{i}-\alpha_{i}^{\prime}\right) \\ \text { s.t. } & \sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right)=0,\quad 0\le\alpha_{i},\alpha_{i}^{\prime}\le C \end{aligned}
\]
Substituting (2) back into \(f(x)=w^Tx+b\) yields the linear regression model
\[
f(x)=\sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right) x_{i}^{T} x+b
\]
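Since the dual is a small quadratic program, a toy instance can be solved by brute force. The sketch below grid-searches the multipliers for two made-up points, enforces the equality constraint, and recovers \(w\) via equation (2); the data, \(C\), \(\varepsilon\), and grid step are all illustrative assumptions:

```python
# Brute-force sketch of the SVR dual on a toy 2-point data set.
# xs, ys, C, eps, and the 0.1 grid step are illustrative assumptions.
from itertools import product

xs = [1.0, 2.0]
ys = [1.0, 2.0]
C, eps = 1.0, 0.1
grid = [i / 10 for i in range(11)]          # candidate multiplier values in [0, C]

def dual_obj(a1, a1p, a2, a2p):
    b1, b2 = a1 - a1p, a2 - a2p             # beta_i = alpha_i - alpha_i'
    quad = 0.5 * (b1*b1*xs[0]*xs[0] + 2*b1*b2*xs[0]*xs[1] + b2*b2*xs[1]*xs[1])
    return quad + eps*(a1 + a1p + a2 + a2p) - (ys[0]*b1 + ys[1]*b2)

best = None
for a1, a1p, a2, a2p in product(grid, repeat=4):
    if abs((a1 - a1p) + (a2 - a2p)) > 1e-12:  # equality constraint: sum beta_i = 0
        continue
    val = dual_obj(a1, a1p, a2, a2p)
    if best is None or val < best[0]:
        best = (val, a1, a1p, a2, a2p)

val, a1, a1p, a2, a2p = best
w = (a1 - a1p)*xs[0] + (a2 - a2p)*xs[1]     # equation (2): w = sum beta_i * x_i
print(w)                                    # recovered slope, pulled below 1 by the 1/2 w^2 term
```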
Nonlinear support vector regression
Consider the model
\[
y=f(x)+b
\]
where \(f(x)\) is a nonlinear function. Suppose there exists a mapping \(\varphi\) from the input space of \(X\) to a Hilbert space such that
\[
f(x)=w^T\varphi(x)
\]
We then set up the following optimization problem:
\[
\min \frac{1}{2}\|w\|^{2}+C \sum_{i=1}^{N}\left(\xi_{i}+\xi_{i}^{\prime}\right)\\
\text{s.t. }\begin{cases} y_{i}-w^{T} \varphi\left(x_{i}\right)-b \le \varepsilon+\xi_{i} \\ w^{T} \varphi\left(x_{i}\right)+b-y_{i} \le \varepsilon+\xi_{i}^{\prime} \\ \xi_{i} \ge 0 \\ \xi_{i}^{\prime} \ge 0 \end{cases}
\]
Construct the Lagrangian
\[
\begin{aligned} L :=\frac{1}{2}\|w\|^{2} &+C \sum_{i=1}^{N}\left(\xi_{i}+\xi_{i}^{\prime}\right)-\sum_{i=1}^{N}\left(\eta_{i} \xi_{i}+\eta_{i}^{\prime} \xi_{i}^{\prime}\right) \\ &+\sum_{i=1}^{N} \alpha_{i}\left(y_{i}-w^{T} \varphi\left(x_{i}\right)-b-\varepsilon-\xi_{i}\right) \\ &+\sum_{i=1}^{N} \alpha_{i}^{\prime}\left(w^{T} \varphi\left(x_{i}\right)+b-y_{i}-\varepsilon-\xi_{i}^{\prime}\right) \end{aligned}
\]
Taking partial derivatives:
\[
\begin{cases}\frac{\partial L}{\partial w}=w-\sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right) \varphi\left(x_{i}\right)=0\\
\frac{\partial L}{\partial b} =\sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right)=0 \\ \frac{\partial L}{\partial \xi_{i}^{\prime}} =C-\alpha_{i}^{\prime}-\eta_{i}^{\prime}=0 \\ \frac{\partial L}{\partial \xi_{i}} =C-\alpha_{i}-\eta_{i}=0 \end{cases}
\]
Substituting these back into the Lagrangian yields the dual problem
\[
\max -\frac{1}{2} \sum_{i=1}^{N}\sum_{j=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right)\left(\alpha_{j}-\alpha_{j}^{\prime}\right) \varphi\left(x_{i}\right)^{T} \varphi\left(x_{j}\right)-\varepsilon \sum_{i=1}^{N}\left(\alpha_{i}+\alpha_{i}^{\prime}\right)+\sum_{i=1}^{N} y_{i}\left(\alpha_{i}-\alpha_{i}^{\prime}\right)\\
\text{s.t. } \sum_{i=1}^{N}\left(\alpha_{i}-\alpha_{i}^{\prime}\right)=0,\quad 0\le\alpha_{i},\alpha_{i}^{\prime}\le C
\]
Substituting \(w\) back into the model again gives
\[
y=\sum\left(\alpha_{i}-\alpha_{i}^{'}\right) \varphi\left(x_{i}\right)^{T} \varphi(x)+b
\]
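Note that \(\varphi\) only ever appears through inner products \(\varphi(x_i)^T\varphi(x)\), so it suffices to compute a kernel \(k(x_i,x)=\varphi(x_i)^T\varphi(x)\) without ever forming \(\varphi\) explicitly. A sketch with the one-dimensional polynomial kernel \(k(x,z)=(xz+1)^2\), whose explicit feature map is \(\varphi(x)=(x^2,\sqrt{2}\,x,1)\) (both chosen here purely for illustration):

```python
# Sketch of the kernel trick: phi(x_i)^T phi(x) computed without forming phi.
# The 1-D polynomial kernel k(x, z) = (x*z + 1)^2 has the explicit feature map
# phi(x) = (x^2, sqrt(2)*x, 1); both are illustrative choices.
import math

def phi(x):
    return (x * x, math.sqrt(2.0) * x, 1.0)

def k(x, z):
    return (x * z + 1.0) ** 2

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, z = 0.5, 3.0
print(dot(phi(x), phi(z)))   # explicit feature-space inner product
print(k(x, z))               # the same value via the kernel alone
```

The dual objective and the final model above thus depend on the data only through \(k\), which is what allows nonlinear SVR to work in high- or infinite-dimensional feature spaces.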