Why neural networks generally do not regularize the bias term b

——————Adapted from Deep Learning, Chapter 7.1.

When regularizing a neural network, we typically apply a parameter norm penalty only to the weights of the affine transformation at each layer, leaving the bias terms unregularized.
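Following the chapter's notation, if θ denotes all of the parameters and w denotes only the weights that the norm penalty should affect, the regularized objective can be written as

$$ \tilde{J}(\theta; X, y) = J(\theta; X, y) + \alpha \, \Omega(w) $$

where α ≥ 0 weighs the penalty Ω relative to the standard objective J, and the biases contained in θ are excluded from Ω.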

Compared with weights, biases require less data to fit accurately. Each weight specifies how two variables interact, so fitting a weight well requires observing both variables under a variety of conditions. Each bias controls only a single variable. This means that leaving the biases unregularized does not introduce much variance. Moreover, regularizing the bias parameters can introduce a significant amount of underfitting.
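In practice this is usually implemented by giving weights and biases separate penalty settings. Below is a minimal sketch assuming PyTorch; the network, layer sizes, learning rate, and the weight_decay value 1e-4 are all hypothetical placeholders, not something prescribed by the book:

```python
import torch
import torch.nn as nn

# A hypothetical two-layer network; the sizes are placeholders.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Separate the weight matrices from the bias vectors.
decay, no_decay = [], []
for name, param in model.named_parameters():
    (no_decay if name.endswith("bias") else decay).append(param)

# weight_decay applies the L2 penalty; biases get a decay of 0,
# i.e. they are left unregularized.
optimizer = torch.optim.SGD(
    [
        {"params": decay, "weight_decay": 1e-4},    # regularize weights
        {"params": no_decay, "weight_decay": 0.0},  # biases unregularized
    ],
    lr=0.1,
)
```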

~~~~~~~~~~Original excerpt~~~~~~~~~~~~
Before delving into the regularization behavior of different norms, we note that for neural networks, we typically choose to use a parameter norm penalty Ω that penalizes only the weights of the affine transformation at each layer and leaves the biases unregularized. The biases typically require less data to fit accurately than the weights. Each weight specifies how two variables interact. Fitting the weight well requires observing both variables in a variety of conditions. Each bias controls only a single variable. This means that we do not induce too much variance by leaving the biases unregularized. Also, regularizing the bias parameters can introduce a significant amount of underfitting.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
