Why Neural Networks Generally Do Not Regularize the Bias Term b

Adapted from Deep Learning, Chapter 7.1.

When regularizing a neural network, we typically apply the parameter norm penalty only to the weights of the affine transformation at each layer, leaving the bias terms unregularized.
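In the book's notation from Chapter 7.1, this convention means the penalty term Ω depends only on the weights w (the biases do not appear in it), e.g. for L² weight decay:

$$\tilde{J}(\boldsymbol{w};\boldsymbol{X},\boldsymbol{y}) = J(\boldsymbol{w};\boldsymbol{X},\boldsymbol{y}) + \alpha\,\Omega(\boldsymbol{w}), \qquad \Omega(\boldsymbol{w}) = \tfrac{1}{2}\lVert \boldsymbol{w} \rVert_2^2$$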

Compared with the weights, the biases require less data to fit accurately. Each weight specifies how two variables interact, so fitting a weight accurately requires observing both variables together under many different conditions. Each bias controls only a single variable, which means leaving the biases unregularized does not introduce much variance. Regularizing the biases, on the other hand, can easily cause significant underfitting.
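As a concrete illustration (not from the book), here is a minimal PyTorch sketch of this convention: the optimizer's weight_decay (an L² penalty) is applied only to the weight matrices, while the bias parameters go into a separate group with zero decay. The architecture and hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

# A small MLP; the specific sizes are just for illustration.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Split parameters: weight matrices get the L2 penalty (weight_decay),
# bias vectors are left unregularized.
decay, no_decay = [], []
for name, param in model.named_parameters():
    if name.endswith("bias"):
        no_decay.append(param)
    else:
        decay.append(param)

optimizer = torch.optim.SGD(
    [
        {"params": decay, "weight_decay": 1e-4},   # penalize weights only
        {"params": no_decay, "weight_decay": 0.0}, # biases unregularized
    ],
    lr=0.1,
)
```

Frameworks commonly expose this split directly; for example, Keras layers take separate kernel_regularizer and bias_regularizer arguments, and the bias one is usually left unset.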

~~~~~~~~~~ Original excerpt ~~~~~~~~~~~~
Before delving into the regularization behavior of different norms, we note that for neural networks, we typically choose to use a parameter norm penalty Ω that penalizes only the weights of the affine transformation at each layer and leaves the biases unregularized. The biases typically require less data to fit accurately than the weights. Each weight specifies how two variables interact. Fitting the weight well requires observing both variables in a variety of conditions. Each bias controls only a single variable. This means that we do not induce too much variance by leaving the biases unregularized. Also, regularizing the bias parameters can introduce a significant amount of underfitting.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
