cs231n assignment2 dropout

Code: https://github.com/LiuZhe6/CS231N

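To keep a neural network from overfitting the training data, we can use dropout. The main idea is to randomly set some of the hidden-layer outputs (or weights) to 0 during training.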

Dropout

forward pass

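The assignment asks for inverted dropout. The main idea is to divide the mask by p (the keep probability) during the training phase, so that nothing has to be changed at test time.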
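
To see why no test-time change is needed, write each mask entry as m_i / p, where m_i is 1 with probability p and 0 otherwise (the symbols m_i and x_i are notation introduced here for illustration only). Taking the expectation over the mask gives:

```latex
\mathbb{E}\left[\frac{m_i}{p}\, x_i\right]
  = \frac{\mathbb{E}[m_i]}{p}\, x_i
  = \frac{p}{p}\, x_i
  = x_i,
\qquad m_i \sim \mathrm{Bernoulli}(p)
```

So the expected training-time output already equals the unmodified input, which is what the layer returns at test time.
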
dropout_forward() in layers.py:

    if mode == 'train':
        #######################################################################
        #       Implement training phase forward pass for inverted dropout.   #
        # Store the dropout mask in the mask variable.                        #
        #######################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
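        # Keep each unit with probability p, then scale the survivors by 1/p
        # (inverted dropout) so the expected output equals x.
        mask = (np.random.rand(*x.shape) < p) / p
        out = x * mask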

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        #######################################################################
        #                           END OF YOUR CODE                          #
        #######################################################################
    elif mode == 'test':
        #######################################################################
        #       Implement the test phase forward pass for inverted dropout.   #
        #######################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        # Inverted dropout is the identity at test time.
        out = x

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        #######################################################################
        #                            END OF YOUR CODE                         #
        #######################################################################
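
The same forward pass can be tried outside the assignment scaffolding. The following is only a minimal sketch: the function name `inverted_dropout_forward` and the `rng` argument are assumptions made for this example and are not part of `layers.py`. As in the code above, p is the probability of keeping a unit.

```python
import numpy as np


def inverted_dropout_forward(x, p, mode, rng=None):
    """Inverted dropout forward pass; p is the probability of KEEPING a unit.

    Hypothetical standalone helper, not the assignment's dropout_forward().
    """
    if mode == "train":
        rng = np.random.default_rng() if rng is None else rng
        # Keep each unit with probability p and scale the survivors by 1/p.
        mask = (rng.random(x.shape) < p) / p
        return x * mask, mask
    if mode == "test":
        # No masking and no scaling at test time.
        return x, None
    raise ValueError(f"unknown mode: {mode!r}")


x = np.full((500, 500), 10.0)
p = 0.75
out_train, mask = inverted_dropout_forward(x, p, "train", np.random.default_rng(0))
out_test, _ = inverted_dropout_forward(x, p, "test")

print("mean of input:       ", x.mean())
print("mean of train output:", out_train.mean())           # close to the input mean
print("mean of test output: ", out_test.mean())
print("fraction kept:       ", (out_train != 0).mean())    # close to p
```

The mean of the training output stays close to the mean of the input even though roughly a quarter of the entries are zeroed, because the surviving entries are scaled up by 1/p.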

backward pass

The mask was saved during the forward pass, so we use it directly to compute the gradient.
dropout_backward() in layers.py:

    if mode == 'train':
        #######################################################################
        #       Implement training phase backward pass for inverted dropout   #
        #######################################################################
        # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
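        # Reuse the cached mask: dropped units get zero gradient, kept units
        # get their upstream gradient scaled by 1/p.
        dx = mask * dout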

        # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
        #######################################################################
        #                          END OF YOUR CODE                           #
        #######################################################################
    elif mode == 'test':
        dx = dout
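
One way to convince yourself that `dx = mask * dout` is right is a numerical gradient check with the mask held fixed. The sketch below is an assumption-laden illustration: `dropout_backward_sketch` is a hypothetical helper, and the scalar loss `sum(dout * out)` is chosen only so that the upstream gradient equals `dout`.

```python
import numpy as np


def dropout_backward_sketch(dout, mask, mode):
    """Backward pass matching out = x * mask (train) or out = x (test)."""
    if mode == "train":
        return dout * mask
    return dout


rng = np.random.default_rng(1)
p = 0.6
x = rng.standard_normal((4, 5))
dout = rng.standard_normal((4, 5))
mask = (rng.random(x.shape) < p) / p  # fixed mask, as cached by the forward pass

dx = dropout_backward_sketch(dout, mask, "train")

# Centered finite differences of L(x) = sum(dout * (x * mask)) for each x[i, j].
h = 1e-5
dx_num = np.zeros_like(x)
for idx in np.ndindex(*x.shape):
    x_plus, x_minus = x.copy(), x.copy()
    x_plus[idx] += h
    x_minus[idx] -= h
    loss_plus = np.sum(dout * x_plus * mask)
    loss_minus = np.sum(dout * x_minus * mask)
    dx_num[idx] = (loss_plus - loss_minus) / (2 * h)

print("max abs difference:", np.max(np.abs(dx - dx_num)))
```

Because the layer is linear in x once the mask is fixed, the analytic and numerical gradients should agree up to floating-point rounding.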

Inline Question

Inline Question 1
What happens if we do not divide the values being passed through inverse dropout by p in the dropout layer? Why does that happen?
Answer
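If we do not divide by p during training, the expected output of the layer during training is p times its input, while at test time the layer passes the full input through. The following layers then see activations that are on average 1/p times larger than the ones they were trained on. To compensate, the test-time output would have to be multiplied by p, which is exactly the extra step that inverted dropout avoids.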
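
The scale mismatch can be observed numerically. This is a small illustrative simulation, not part of the assignment; the all-ones input and the variable names are assumptions chosen to make the means easy to read.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5                      # keep probability
x = np.ones(1_000_000)       # toy activations, all equal to 1

keep = rng.random(x.shape) < p
train_unscaled = x * keep        # dropout WITHOUT dividing by p
train_inverted = x * keep / p    # inverted dropout
test_out = x                     # test time: identity

print("train mean, unscaled:", train_unscaled.mean())   # about p
print("train mean, inverted:", train_inverted.mean())   # about 1
print("test mean:           ", test_out.mean())
print("test mean * p:       ", (test_out * p).mean())   # matches the unscaled train mean
```

Without the division, the training-time mean sits near p while the test-time mean is 1, so the two only agree again if the test output is multiplied by p.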


Inline Question 2
Compare the validation and training accuracies with and without dropout – what do your results suggest about dropout as a regularizer?

Answer
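The network trained with dropout has a clearly lower training accuracy, that is, it fits the training set less aggressively. This suggests that dropout acts as an effective regularizer against overfitting.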


Inline Question 3
Suppose we are training a deep fully-connected network for image classification, with dropout after hidden layers (parameterized by keep probability p). If we are concerned about overfitting, how should we modify p (if at all) when we decide to decrease the size of the hidden layers (that is, the number of nodes in each layer)?

Answer
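When we shrink the hidden layers, the network has less capacity and is less prone to overfitting, so we should reduce the amount of dropout as far as possible, i.e. increase the keep probability p.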
