Understanding softmax

For the output layer $L$, the input $z_{i}^{L}$ to neuron $i$ is the weighted sum of the outputs of all neurons in layer $L-1$, plus a bias. Softmax then maps the input of neuron $j$ to its exponential divided by the sum of the exponentials of all neurons in layer $L$:

$$
\begin{aligned}
& z_{i}^{L}=\sum\nolimits_{k}{w_{ki}^{L}a_{k}^{L-1}}+b_{i}^{L} \\
& y_{j}^{L}=\mathrm{softmax}(z_{j}^{L})=\frac{e^{z_{j}^{L}}}{\sum\nolimits_{i}{e^{z_{i}^{L}}}}
\end{aligned}
$$
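As a minimal sketch of the softmax definition above, the following Python function computes $y_j$ from a list of inputs $z_i$. The function name `softmax` and the sample values are illustrative assumptions. Subtracting the maximum input before exponentiating is a common numerical-stability trick that is not part of the original formula; it leaves the result unchanged because the numerator and denominator are scaled by the same factor.

```python
import math

def softmax(z):
    """Return the softmax of a list of floats z (the inputs z_i of layer L)."""
    # Assumption (not in the original formula): shift by max(z) to avoid
    # overflow in exp; the ratio below is mathematically unchanged.
    m = max(z)
    exps = [math.exp(v - m) for v in z]   # exponential of each neuron's input
    total = sum(exps)                     # sum of exponentials over the layer
    return [e / total for e in exps]

z = [1.0, 2.0, 3.0]   # illustrative inputs
y = softmax(z)
print(y)        # a probability vector: larger inputs get larger outputs
print(sum(y))   # sums to 1 up to floating-point rounding
```

The outputs are all positive and sum to 1, which is why softmax is typically used to turn the last layer's values into class probabilities.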

$$
\left\{
\begin{aligned}
& \text{if } j=i:\quad \frac{\partial y_{j}}{\partial z_{i}}
= \frac{\partial}{\partial {\color{red} z_{i}}}\left( \frac{e^{z_{j}^{L}}}{\sum\nolimits_{k}{e^{z_{k}^{L}}}} \right)
= \frac{{\color{red}\left( e^{z_{j}^{L}} \right)'}\cdot \sum\nolimits_{k}{e^{z_{k}^{L}}}-e^{z_{j}^{L}}\cdot e^{z_{i}^{L}}}{\left( \sum\nolimits_{k}{e^{z_{k}^{L}}} \right)^{2}}
= \frac{e^{z_{j}^{L}}}{\sum\nolimits_{k}{e^{z_{k}^{L}}}}-\left( \frac{e^{z_{j}^{L}}}{\sum\nolimits_{k}{e^{z_{k}^{L}}}} \right)^{2}
= {\color{red} y_{j}(1-y_{j})} \\
& \text{if } j\ne i:\quad \frac{\partial y_{j}}{\partial z_{i}}
= \frac{\partial}{\partial {\color{red} z_{i}}}\left( \frac{e^{z_{j}^{L}}}{\sum\nolimits_{k}{e^{z_{k}^{L}}}} \right)
= \frac{{\color{red}\frac{\partial e^{z_{j}^{L}}}{\partial z_{i}}}\cdot \sum\nolimits_{k}{e^{z_{k}^{L}}}-e^{z_{j}^{L}}\cdot e^{z_{i}^{L}}}{\left( \sum\nolimits_{k}{e^{z_{k}^{L}}} \right)^{2}}
= \frac{{\color{red}0}\cdot \sum\nolimits_{k}{e^{z_{k}^{L}}}-e^{z_{j}^{L}}\cdot e^{z_{i}^{L}}}{\left( \sum\nolimits_{k}{e^{z_{k}^{L}}} \right)^{2}}
= {\color{red} -y_{j}y_{i}}
\end{aligned}
\right.
$$

Finally, the derivative of the softmax function along the backpropagation path from $y_{j}$ to $z_{i}$ is:
$$
{\color{red}
\frac{\partial y_{j}}{\partial z_{i}}=\left\{ \begin{matrix} y_{j}(1-y_{j}) & j=i \\ -y_{j}y_{i} & j\ne i \\ \end{matrix} \right.}
$$

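Note: the only difference between the two cases is that when $j \ne i$, the derivative of the numerator $e^{z_{j}^{L}}$ with respect to $z_{i}$ is zero.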
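The two-case result can be checked numerically. The sketch below builds the matrix of derivatives $\partial y_j / \partial z_i$ from the closed form ($y_j(1-y_j)$ on the diagonal, $-y_j y_i$ elsewhere) and compares it with a central finite-difference approximation. The helper names `softmax_jacobian` and `numerical_jacobian`, the step size `eps`, and the sample inputs are assumptions made for illustration, and the max-shift inside `softmax` is a stability trick that is not part of the derivation.

```python
import math

def softmax(z):
    m = max(z)                            # assumption: shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(z):
    """J[j][i] = dy_j/dz_i: y_j(1 - y_j) if j == i, else -y_j * y_i."""
    y = softmax(z)
    n = len(y)
    return [[y[j] * (1.0 - y[j]) if j == i else -y[j] * y[i]
             for i in range(n)] for j in range(n)]

def numerical_jacobian(z, eps=1e-6):
    """Approximate dy_j/dz_i with central finite differences."""
    n = len(z)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        z_plus = list(z)
        z_plus[i] += eps                  # perturb only z_i upward
        z_minus = list(z)
        z_minus[i] -= eps                 # perturb only z_i downward
        y_plus, y_minus = softmax(z_plus), softmax(z_minus)
        for j in range(n):
            J[j][i] = (y_plus[j] - y_minus[j]) / (2 * eps)
    return J

z = [1.0, 2.0, 3.0]   # illustrative inputs
J = softmax_jacobian(z)
J_num = numerical_jacobian(z)
n = len(z)
max_err = max(abs(J[j][i] - J_num[j][i]) for j in range(n) for i in range(n))
print(max_err)   # should be tiny if the closed form is right
```

Two properties follow directly from the formula and are easy to confirm on `J`: the matrix is symmetric, since $-y_j y_i = -y_i y_j$, and each row sums to zero, since $y_j(1-y_j) - y_j\sum_{i \ne j} y_i = y_j\left(1-\sum_i y_i\right) = 0$.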

