Why neural networks generally do not regularize the bias term b

——————Adapted from Deep Learning, Chapter 7.1.

When regularizing a neural network, we typically apply a parameter norm penalty only to the weights of the affine transformation at each layer, leaving the bias terms unregularized.
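Following the chapter's notation, if θ denotes all of the parameters and w denotes only the weights that the norm penalty should affect, the regularized objective can be written as

$$ \tilde{J}(\theta; X, y) = J(\theta; X, y) + \alpha \, \Omega(w) $$

where α ≥ 0 weighs the penalty Ω relative to the standard objective J, and the biases contained in θ are excluded from Ω.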

Compared with weights, biases require less data to fit accurately. Each weight specifies how two variables interact, so fitting a weight well requires observing both variables under a variety of conditions. Each bias controls only a single variable. This means that leaving the biases unregularized does not introduce much variance. Moreover, regularizing the bias parameters can introduce a significant amount of underfitting.
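In practice this is usually implemented by giving weights and biases separate penalty settings. Below is a minimal sketch assuming PyTorch; the network, layer sizes, learning rate, and the weight_decay value 1e-4 are all hypothetical placeholders, not something prescribed by the book:

```python
import torch
import torch.nn as nn

# A hypothetical two-layer network; the sizes are placeholders.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Separate the weight matrices from the bias vectors.
decay, no_decay = [], []
for name, param in model.named_parameters():
    (no_decay if name.endswith("bias") else decay).append(param)

# weight_decay applies the L2 penalty; biases get a decay of 0,
# i.e. they are left unregularized.
optimizer = torch.optim.SGD(
    [
        {"params": decay, "weight_decay": 1e-4},    # regularize weights
        {"params": no_decay, "weight_decay": 0.0},  # biases unregularized
    ],
    lr=0.1,
)
```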

~~~~~~~~~~Original excerpt~~~~~~~~~~~~
Before delving into the regularization behavior of different norms, we note that for neural networks, we typically choose to use a parameter norm penalty Ω that penalizes only the weights of the affine transformation at each layer and leaves the biases unregularized. The biases typically require less data to fit accurately than the weights. Each weight specifies how two variables interact. Fitting the weight well requires observing both variables in a variety of conditions. Each bias controls only a single variable. This means that we do not induce too much variance by leaving the biases unregularized. Also, regularizing the bias parameters can introduce a significant amount of underfitting.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
