Why Neural Networks Generally Do Not Regularize the Bias Term b

Adapted from Deep Learning, Chapter 7.1.

When regularizing a neural network, we typically apply the parameter norm penalty only to the weights of the affine transformation at each layer, leaving the bias terms unregularized.
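In the book's notation from Chapter 7.1, this convention means the penalty term Ω depends only on the weights w (the biases do not appear in it), e.g. for L² weight decay:

$$\tilde{J}(\boldsymbol{w};\boldsymbol{X},\boldsymbol{y}) = J(\boldsymbol{w};\boldsymbol{X},\boldsymbol{y}) + \alpha\,\Omega(\boldsymbol{w}), \qquad \Omega(\boldsymbol{w}) = \tfrac{1}{2}\lVert \boldsymbol{w} \rVert_2^2$$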

Compared with the weights, the biases require less data to fit accurately. Each weight specifies how two variables interact, so fitting a weight accurately requires observing both variables together under many different conditions. Each bias controls only a single variable, which means leaving the biases unregularized does not introduce much variance. Regularizing the biases, on the other hand, can easily cause significant underfitting.
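As a concrete illustration (not from the book), here is a minimal PyTorch sketch of this convention: the optimizer's weight_decay (an L² penalty) is applied only to the weight matrices, while the bias parameters go into a separate group with zero decay. The architecture and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

# A small MLP; the specific sizes are just for illustration.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Split parameters: weight matrices get the L2 penalty (weight_decay),
# bias vectors are left unregularized.
decay, no_decay = [], []
for name, param in model.named_parameters():
    if name.endswith("bias"):
        no_decay.append(param)
    else:
        decay.append(param)

optimizer = torch.optim.SGD(
    [
        {"params": decay, "weight_decay": 1e-4},   # penalize weights only
        {"params": no_decay, "weight_decay": 0.0}, # biases unregularized
    ],
    lr=0.1,
)
```

Frameworks commonly expose this split directly; for example, Keras layers take separate kernel_regularizer and bias_regularizer arguments, and the bias one is usually left unset.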

~~~~~~~~~~ Original excerpt ~~~~~~~~~~~~
Before delving into the regularization behavior of different norms, we note that for neural networks, we typically choose to use a parameter norm penalty Ω that penalizes only the weights of the affine transformation at each layer and leaves the biases unregularized. The biases typically require less data to fit accurately than the weights. Each weight specifies how two variables interact. Fitting the weight well requires observing both variables in a variety of conditions. Each bias controls only a single variable. This means that we do not induce too much variance by leaving the biases unregularized. Also, regularizing the bias parameters can introduce a significant amount of underfitting.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
