2020-6-21 Andrew Ng - NN&DL - Week 4: Deep Neural Networks (Quiz)

Reference: https://zhuanlan.zhihu.com/p/31272216

1. What is the "cache" used for in our implementation of forward propagation and backward propagation?

  • It is used to cache the intermediate values of the cost function during training.

  • We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives. (Correct; see the sketch after this list.)

  • It is used to keep track of the hyperparameters that we are searching over, to speed up computation.

  • We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.
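The role of the cache is easiest to see in code. Below is a minimal sketch, not the course's exact implementation, of one layer's forward step that stores its inputs and pre-activation in a cache, and the matching backward step that reads them back to compute derivatives; the function names and the ReLU activation are illustrative assumptions.

import numpy as np

def linear_activation_forward(A_prev, W, b):
    # Forward step for one layer with a ReLU activation (illustrative).
    Z = W @ A_prev + b
    A = np.maximum(0, Z)
    cache = (A_prev, W, b, Z)      # forward-pass values the backward step will need
    return A, cache

def linear_activation_backward(dA, cache):
    # The cache carries forward-pass values into the backward step.
    A_prev, W, b, Z = cache
    m = A_prev.shape[1]
    dZ = dA * (Z > 0)              # derivative of ReLU
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db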

================================================================================

2. Among the following, which ones are "hyperparameters"? (Check all that apply.)

  • number of iterations (Yes)
  • bias vectors b[l]
  • size of the hidden layers n[l] (Yes)
  • weight matrices W[l]
  • learning rate α (Yes)
  • number of layers L in the neural network (Yes)
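The distinction being tested: hyperparameters are values chosen before training that control how learning proceeds, while W[l] and b[l] are the parameters that training itself produces. A small illustrative sketch (the grouping and the commented-out training call are assumptions, not course code):

# Hyperparameters: chosen by hand (or by search) before training starts.
hyperparameters = {
    'learning_rate': 0.0075,
    'num_iterations': 2500,
    'layer_dims': [12288, 20, 7, 5, 1],   # layer sizes and, implicitly, the number of layers L
}

# Parameters: the W[l] and b[l] arrays, which are learned during training,
# e.g. parameters = L_layer_model(X, Y, **hyperparameters)  (hypothetical helper)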

================================================================================

3. Which of the following statements is true?

  • The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers. (Correct)
  • The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.

In a deep network, the later layers compute progressively more complex features of the input.

================================================================================

4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?

False.

Forward propagation still needs an explicit for-loop to iterate over the layers; vectorization removes the loop over training examples within each layer, not the loop over layers. See the sketch below.
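A hedged sketch of what that loop looks like: each layer's computation is fully vectorized over the m training examples, yet the pass over layers 1..L is still an ordinary Python for-loop (the helper name and the ReLU/sigmoid choices below are assumptions, not the course's exact code).

import numpy as np

def L_model_forward(X, parameters):
    # Vectorization removes the loop over examples, not the loop over layers.
    L = len(parameters) // 2                 # parameters holds W1, b1, ..., WL, bL
    A = X
    for l in range(1, L + 1):                # explicit loop over the layers
        W = parameters['W' + str(l)]
        b = parameters['b' + str(l)]
        Z = W @ A + b                        # vectorized over all m examples at once
        A = 1 / (1 + np.exp(-Z)) if l == L else np.maximum(0, Z)
    return A                                 # final activation, i.e. the prediction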

================================================================================

5. Assume we store the values for n[l] in an array called layers, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has 4 hidden units, layer 2 has 3 hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?

  • A
for i in range(1, len(layer_dims)/2):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01
  • B
for i in range(1, len(layer_dims)/2):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i - 1], 1) * 0.01
  • C
for i in range(1, len(layer_dims)):
    parameter['W' + str(i)] = np.random.randn(layers[i - 1], layers[i]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01
  • D (Correct)
for i in range(1, len(layer_dims)):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01

For more on parameter initialization, see https://blog.csdn.net/weixin_42555985/article/details/106041046
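As a sketch (not the grader's reference code), here is option D written out so it runs end to end; the question leaves n_x unspecified, so the value 5 below is an arbitrary assumption, and note that in practice the bias vectors are usually initialized to zeros rather than small random values.

import numpy as np

n_x = 5                                  # assumed input size, for illustration only
layer_dims = [n_x, 4, 3, 2, 1]
layers = layer_dims                      # the question calls this array "layers"

parameter = {}
for i in range(1, len(layer_dims)):
    # W[i] has shape (layers[i], layers[i-1]); b[i] has shape (layers[i], 1)
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01
    print('W' + str(i), parameter['W' + str(i)].shape,
          'b' + str(i), parameter['b' + str(i)].shape)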

================================================================================

6. Consider the following neural network.
(figure: the network diagram referenced by this question)
How many layers does this network have?

  • The total number of layers L is 4, with 3 hidden layers. (Correct)
  • The total number of layers L is 3, with 3 hidden layers.
  • The total number of layers L is 4, with 4 hidden layers.
  • The total number of layers L is 5, with 3 hidden layers.

As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.

As we learned, the total number of layers is the number of hidden layers + 1: the input and output layers are not hidden layers, the input layer is not counted in L, and the output layer is.

================================================================================

7. During forward propagation, in the forward function for a layer l you need to know what is the activation function in a layer (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what is the activation function for layer l, since the gradient depends on it. True/False?

True.

During backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
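A minimal sketch of why: the gradient dZ = dA * g'(Z) depends on which nonlinearity g was applied in the forward pass, so the backward helper has to branch on it (the function and its activation argument below are illustrative assumptions, not the course's exact code).

import numpy as np

def activation_backward(dA, Z, activation):
    # dZ = dA * g'(Z); g' differs per activation, so backward propagation
    # must know which activation the forward pass used for this layer.
    if activation == 'relu':
        return dA * (Z > 0)
    if activation == 'sigmoid':
        s = 1 / (1 + np.exp(-Z))
        return dA * s * (1 - s)
    if activation == 'tanh':
        return dA * (1 - np.tanh(Z) ** 2)
    raise ValueError('unknown activation: ' + activation)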

================================================================================

8. There are certain functions with the following properties:

(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) to compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

True.
With fewer layers, the number of units (gates) needed grows exponentially, so the shallow circuit has to be much larger.

================================================================================

9. Consider the following 2-hidden-layer neural network:
(figure: the network diagram referenced by this question)

Which of the following statements are True? (Check all that apply.)

Note: only the correct answers are shown here; a quick shape check follows the list.

  • W[1] will have shape (4, 4)
  • b[1] will have shape (4, 1)
  • W[2] will have shape (3, 4)
  • b[2] will have shape (3, 1)
  • b[3] will have shape (1, 1)
  • W[3] will have shape (1, 3)
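These shapes all follow the general rule W[l]: (n[l], n[l-1]) and b[l]: (n[l], 1). A quick check, assuming the figure's layer sizes are [4, 4, 3, 1] (4 inputs, hidden layers of 4 and 3 units, 1 output unit), which is what the listed shapes imply:

layer_dims = [4, 4, 3, 1]   # assumed sizes: n_x = 4, n[1] = 4, n[2] = 3, n[3] = 1
for l in range(1, len(layer_dims)):
    W_shape = (layer_dims[l], layer_dims[l - 1])   # (n[l], n[l-1])
    b_shape = (layer_dims[l], 1)                   # (n[l], 1)
    print('W[%d] %s  b[%d] %s' % (l, W_shape, l, b_shape))
# Prints: W[1] (4, 4)  b[1] (4, 1)
#         W[2] (3, 4)  b[2] (3, 1)
#         W[3] (1, 3)  b[3] (1, 1)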

================================================================================

10. Whereas the previous question used a specific network, in the general case what is the dimension of W[l], the weight matrix associated with layer l?

  • The dimension of W[l] is (n[l], n[l-1]). (Correct)
  • The dimension of W[l] is (n[l-1], n[l])
  • The dimension of W[l] is (n[l+1], n[l])
  • The dimension of W[l] is (n[l], n[l+1])