Big picture on why we need randomness in stochastic algorithms
- Randomness during initialization: the structure of the search space is unknown, so the starting point must be chosen at random.
- Randomness during the progression of the search: helps the search avoid getting stuck in local optima; this is what SGD and mini-batch gradient descent do.
Note: examples of stochastic algorithms include stochastic gradient descent, genetic algorithms, and simulated annealing.
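The two uses of randomness above can be sketched with a minimal SGD loop on a toy least-squares problem. The data, learning rate, and epoch count are illustrative assumptions, not from the article; the point is that the starting weight and the sample order are both drawn at random.

```python
import random

# Minimal sketch of stochastic gradient descent fitting y = w * x.
# Randomness appears twice: a random starting point (initialization)
# and a randomly shuffled sample order in every epoch (progression).

def sgd_fit(data, lr=0.1, epochs=50, seed=0):
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)           # random initialization
    for _ in range(epochs):
        rng.shuffle(data)                # random sample order each epoch
        for x, y in data:
            grad = 2 * (w * x - y) * x   # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

data = [(x, 3.0 * x) for x in [0.5, 1.0, 1.5, 2.0]]
w = sgd_fit(data)
# w converges close to the true slope 3.0
```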
Principle and methods for random initialization
Principle: initialize the weights of a neural network to small random values close to zero, e.g. in [0, 0.1].
Methods: see https://keras.io/initializers/.
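As a concrete sketch of the principle, the snippet below draws each weight of a layer uniformly from [0, 0.1]. The layer sizes and helper name are illustrative assumptions; in practice a library initializer from the Keras page above would be used instead.

```python
import random

# Sketch: initialize a dense layer's weights to small random values
# in [0, 0.1], so units start close to zero but all distinct.

def init_layer(n_in, n_out, low=0.0, high=0.1, seed=42):
    rng = random.Random(seed)
    # one weight per (input, output) connection
    return [[rng.uniform(low, high) for _ in range(n_out)]
            for _ in range(n_in)]

weights = init_layer(n_in=4, n_out=3)
flat = [w for row in weights for w in row]
```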
Reason for effectiveness of random initialization
If two hidden units with the same activation function are connected to the same inputs and have the same initial parameters, then a deterministic learning algorithm applied to a deterministic cost and model will constantly update both of these units in the same way. These units must have different initial parameters to “break symmetry”.
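The symmetry argument can be demonstrated numerically. The tiny two-unit model below is an illustration of my own, not the article's code: both units see the same input and loss, so they receive identical gradients and remain clones unless their initial weights differ.

```python
import random

# Two "hidden units" w1, w2 feeding the same input into a squared loss.
# With identical initial parameters, their gradients are identical at
# every step, so learning can never differentiate them.

def step(w1, w2, x=1.0, y=1.0, lr=0.1):
    pred = w1 * x + w2 * x
    grad = 2 * (pred - y) * x        # same gradient for both units
    return w1 - lr * grad, w2 - lr * grad

# identical start: the units remain equal after every update
a, b = 0.5, 0.5
for _ in range(10):
    a, b = step(a, b)
same_after = (a == b)

# random start: the units stay distinct and can take on different roles
rng = random.Random(0)
c, d = rng.uniform(0, 0.1), rng.uniform(0, 0.1)
for _ in range(10):
    c, d = step(c, d)
diff_after = (c != d)
```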
Evaluation of neural networks with random initialization
The most effective way to evaluate the performance of a neural network configuration is to repeat the search process multiple times and report the model's average performance over those repeats. This gives the configuration the best chance to search the space from multiple different sets of initial conditions, and is sometimes called a multiple-restart (or random-restart) search.
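The evaluation protocol above can be sketched as follows. Here `train_and_score` is a hypothetical stand-in for a full training run that simply simulates seed-dependent variation; a real evaluation would train the network from scratch under each seed.

```python
import random
import statistics

# Repeat the same configuration with different random seeds and report
# the mean (and spread) of the final scores, rather than a single run.

def train_and_score(seed):
    # stand-in for training a network from a seed-dependent random
    # initialization; returns a simulated test accuracy
    rng = random.Random(seed)
    return 0.90 + rng.uniform(-0.02, 0.02)

def evaluate(n_repeats=30):
    scores = [train_and_score(seed) for seed in range(n_repeats)]
    return statistics.mean(scores), statistics.stdev(scores)

mean, std = evaluate()
# mean summarizes the configuration; std shows run-to-run variation
```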
References
- Why Initialize a Neural Network with Random Weights?
https://machinelearningmastery.com/why-initialize-a-neural-network-with-random-weights/