Learning Rate Annealing
Original
银云风
2018-08-27 16:14
Learning rate annealing is controlled by the following settings:
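- "learning_rate": the base learning rate
- "learning_rate_decay_a" and "learning_rate_decay_b": decay parameters for the learning rate; the exact decay formula is determined by learning_rate_schedule
- "learning_rate_schedule": selects how the learning rate decreases over time. The options are: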
  - "constant": lr = learning_rate
  - "poly": lr = learning_rate * pow(1 + learning_rate_decay_a * num_samples_processed, -learning_rate_decay_b)
  - "exp": lr = learning_rate * pow(learning_rate_decay_a, num_samples_processed / learning_rate_decay_b)
  - "discexp": lr = learning_rate * pow(learning_rate_decay_a, floor(num_samples_processed / learning_rate_decay_b))
  - "linear": lr = max(learning_rate - learning_rate_decay_a * num_samples_processed, learning_rate_decay_b)
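
To make the formulas above concrete, here is a minimal Python sketch that evaluates each schedule for a given number of processed samples. The function name `annealed_lr` and its argument names are hypothetical, chosen only for illustration; this is not the framework's own code, just a direct transcription of the five formulas.

```python
import math


def annealed_lr(schedule, learning_rate, decay_a, decay_b, num_samples_processed):
    """Evaluate one of the schedules listed above (illustrative helper, not a library API).

    decay_a and decay_b stand in for learning_rate_decay_a and learning_rate_decay_b.
    """
    n = num_samples_processed
    if schedule == "constant":
        # No decay at all.
        return learning_rate
    if schedule == "poly":
        # Polynomial decay: shrinks as (1 + a * n) ** (-b).
        return learning_rate * math.pow(1 + decay_a * n, -decay_b)
    if schedule == "exp":
        # Smooth exponential decay: multiplied by a once every b samples.
        return learning_rate * math.pow(decay_a, n / decay_b)
    if schedule == "discexp":
        # Stepwise exponential decay: only changes at multiples of b samples.
        return learning_rate * math.pow(decay_a, math.floor(n / decay_b))
    if schedule == "linear":
        # Linear decay with decay_b acting as the lower bound.
        return max(learning_rate - decay_a * n, decay_b)
    raise ValueError("unknown learning_rate_schedule: %r" % (schedule,))


if __name__ == "__main__":
    # Compare the smooth and stepwise exponential schedules at a few points.
    for n in (0, 500, 1000, 1500, 2000):
        smooth = annealed_lr("exp", 0.1, 0.5, 1000, n)
        stepped = annealed_lr("discexp", 0.1, 0.5, 1000, n)
        print(n, smooth, stepped)
```

Note the difference between "exp" and "discexp": with the same parameters they agree at exact multiples of learning_rate_decay_b, but "discexp" holds the rate flat between those points while "exp" keeps shrinking it continuously. In "linear", learning_rate_decay_b is a floor rather than a rate, so the learning rate never drops below it.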