TensorFlow Neural Network Optimization: Exponentially Decaying Learning Rate, Moving Average, Regularization

1. Exponentially decaying learning rate

tf.train.exponential_decay

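Start with a relatively large learning rate to reach a reasonably good solution quickly, then reduce it gradually as the iterations proceed so that the model is more stable in the later stages of training. The decayed rate is: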
$decayed\_learning\_rate = learning\_rate * decay\_rate ^{\frac{global\_step}{ decay\_steps}}$

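# TensorFlow 1.x API; learning_rate on the right-hand side is the initial (base) rate.
import tensorflow as tf
global_step = tf.Variable(0, trainable=False)  # step counter, incremented by minimize()
learning_rate = tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False)
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost, global_step=global_step)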
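
To get a feel for the schedule the formula produces, here is a minimal sketch that evaluates it in plain Python, without TensorFlow. The helper name exponential_decay and the numeric settings (base rate 0.1, decay_steps 100, decay_rate 0.96) are assumptions chosen for illustration only. It also shows the staircase=True variant, where the exponent is truncated to an integer so the rate drops in discrete steps.

```python
def exponential_decay(base_lr, global_step, decay_steps, decay_rate, staircase=False):
    """Evaluate the decay formula in plain Python (illustrative helper, not a TensorFlow API)."""
    exponent = global_step / decay_steps
    if staircase:
        # Truncate the exponent to an integer: the rate only changes
        # once every decay_steps iterations.
        exponent = global_step // decay_steps
    return base_lr * decay_rate ** exponent


# Hypothetical settings, chosen only to make the behaviour visible.
for step in (0, 50, 100, 200):
    smooth = exponential_decay(0.1, step, decay_steps=100, decay_rate=0.96)
    stepped = exponential_decay(0.1, step, decay_steps=100, decay_rate=0.96, staircase=True)
    print(step, round(smooth, 6), round(stepped, 6))
```

With staircase=False the rate shrinks a little at every step; with staircase=True it stays flat within each block of decay_steps iterations.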

2. Moving average

tf.train.ExponentialMovingAverage

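A moving average tracks, over a span of training steps, the average value of each model parameter $w$ and $b$. Using the averaged values can improve the model's ability to generalize.
$shadow\_variable = decay \times shadow\_variable + (1-decay) \times variable$ where $shadow\_variable$ is the shadow variable, $variable$ is the variable being tracked, and $decay$ is the decay rate. $decay$ controls how quickly the shadow variable follows the variable: the larger $decay$ is, the more stable the averaged model. In practice it is usually set close to 1 (for example 0.999 or 0.9999). So that the averages can update faster early in training, ExponentialMovingAverage also accepts a num_updates argument, in which case the decay actually used is:
$\min \{ decay, \frac{1+num\_updates}{10+num\_updates} \}$ Usage: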

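# Pass global_step as num_updates so the effective decay starts small and grows.
variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
variable_averages_op = variable_averages.apply(tf.trainable_variables())
# Group the optimizer step and the moving-average update into a single training op.
with tf.control_dependencies([optimizer, variable_averages_op]):
    train_op = tf.no_op(name='train')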
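
The update rule and the effect of num_updates can be sketched in plain Python. The helper name ema_update, the starting shadow value of 0.0, and the sample values are all assumptions made for illustration; this mirrors the formulas in this section rather than TensorFlow's internal implementation.

```python
def ema_update(shadow, variable, decay, num_updates=None):
    """One moving-average step in plain Python (illustrative helper, not a TensorFlow API)."""
    if num_updates is not None:
        # Early in training the effective decay is small,
        # so the shadow value catches up with the variable quickly.
        decay = min(decay, (1 + num_updates) / (10 + num_updates))
    return decay * shadow + (1 - decay) * variable


shadow = 0.0  # hypothetical initial shadow value
for step, value in enumerate([5.0, 5.0, 5.0]):
    shadow = ema_update(shadow, value, decay=0.99, num_updates=step)
    print(step, shadow)
```

Without num_updates, a decay of 0.99 would move the shadow value only 1% of the way toward the variable per step; with num_updates the first few steps use a much smaller decay, so the shadow value approaches 5.0 almost immediately.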

3. Regularization

TensorFlow provides

tf.contrib.layers.l1_regularizer
tf.contrib.layers.l2_regularizer

to compute the value of the L1/L2 regularization term for a given parameter.

Usage:

# Each regularizer(W) term is stored in the 'losses' collection.
regularizer = tf.contrib.layers.l2_regularizer(REGULARIZER_RATE)
tf.add_to_collection('losses', regularizer(W1))
tf.add_to_collection('losses', regularizer(W2))
cem = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=Y_, labels=Y))
# Total cost = mean cross-entropy + sum of all collected regularization terms.
cost = cem + tf.add_n(tf.get_collection('losses'))
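
As a rough sketch of what these regularization terms amount to numerically, the following plain-Python helpers compute an L1 and an L2 penalty for a small weight matrix. The names l1_penalty and l2_penalty and the sample weights are hypothetical. The factor 1/2 in the L2 term reflects my understanding that tf.contrib.layers.l2_regularizer builds on tf.nn.l2_loss; treat that as an assumption to check against the TensorFlow version you use.

```python
def l1_penalty(scale, weights):
    """scale * sum(|w|): illustrative stand-in for an L1 regularization term."""
    return scale * sum(abs(w) for row in weights for w in row)


def l2_penalty(scale, weights):
    """scale * sum(w^2) / 2: illustrative stand-in for an L2 regularization term."""
    return scale * sum(w * w for row in weights for w in row) / 2


# Hypothetical 2x2 weight matrix, used only for illustration.
W = [[1.0, -2.0], [-3.0, 4.0]]
print(l1_penalty(0.5, W))
print(l2_penalty(0.5, W))
```

The L1 term grows linearly with the magnitude of each weight, while the L2 term grows quadratically, so large weights are penalized much more heavily under L2.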
