2020-6-21 Andrew Ng NN&DL Week 4: Deep Neural Networks (Quiz)

Reference: https://zhuanlan.zhihu.com/p/31272216

1. What is the “cache” used for in our implementation of forward propagation and backward propagation?

  • It is used to cache the intermediate values of the cost function during training.

  • We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives. (Correct; see the sketch after these options.)

  • It is used to keep track of the hyperparameters that we are searching over, to speed up computation.

  • We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.
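
A minimal sketch of how the cache is typically used, with function and variable names assumed here for illustration (they loosely follow the course's programming assignment): the forward step stores the values that the matching backward step will need to compute derivatives.

import numpy as np

def linear_forward(A_prev, W, b):
    # Forward step for one layer; store ("cache") the inputs backprop will need.
    Z = W @ A_prev + b
    cache = (A_prev, W, b)
    return Z, cache

def linear_backward(dZ, cache):
    # Backward step for the same layer; reuse the cached forward values.
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db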

================================================================================

2. Among the following, which ones are “hyperparameters”? (Check all that apply.) A short sketch contrasting hyperparameters with learned parameters follows the options.

  • number of iterations: yes
  • bias vectors b^{[l]}: no (these are parameters learned during training)
  • size of the hidden layers n^{[l]}: yes
  • weight matrices W^{[l]}: no (these are parameters learned during training)
  • learning rate α: yes
  • number of layers L in the neural network: yes
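
A small illustrative sketch of the distinction (the specific values and layer sizes below are made up): hyperparameters are settings we choose before training, while W^{[l]} and b^{[l]} are the parameters that gradient descent learns.

import numpy as np

# Hyperparameters: chosen by us and tuned by experiment.
learning_rate = 0.0075               # α
num_iterations = 2500
layer_dims = [12288, 20, 7, 5, 1]    # sizes n^[l]; their count fixes L

# Parameters: learned during training, not hyperparameters.
parameters = {
    "W1": np.random.randn(layer_dims[1], layer_dims[0]) * 0.01,
    "b1": np.zeros((layer_dims[1], 1)),
}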

================================================================================

3. Which of the following statements is true?

  • The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers. (Correct)
  • The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.

In a deep network, later layers build increasingly complex features out of the simpler ones computed by earlier layers (e.g. edges, then face parts, then whole faces).

================================================================================

4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?

False.

Forward propagation still needs an explicit for-loop over the layers l = 1, 2, …, L; vectorization only removes the explicit loop over the m training examples within each layer.
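
A minimal sketch showing this (the function and parameter names are assumptions loosely following the programming assignment): each layer's computation is fully vectorized over the examples, but the layers themselves are still visited one by one.

import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def L_model_forward(X, parameters):
    # X has shape (n_x, m): all m examples are processed at once (vectorized),
    # but an explicit loop over the layers is still required.
    A = X
    L = len(parameters) // 2            # parameters holds W1, b1, ..., WL, bL
    for l in range(1, L + 1):
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]
        A = relu(Z)                     # (the last layer would normally use sigmoid)
    return A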

================================================================================

5. Assume we store the values for n^{[l]} in an array called layers, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has four hidden units, layer 2 has 3 hidden units and so on. Which of the following for-loops will allow you to initialize the parameters for the model?

  • A
for(i in range(1, len(layer_dims)/2)):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01
  • B
for(i in range(1, len(layer_dims)/2)):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i-1], 1) * 0.01
  • C
for(i in range(1, len(layer_dims))):
    parameter['W' + str(i)] = np.random.randn(layers[i-1], layers[i]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01
  • D (Correct)
for(i in range(1, len(layer_dims))):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01

For more on parameter initialization, see https://blog.csdn.net/weixin_42555985/article/details/106041046
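
A runnable version of option D as a minimal sketch (the function name and the example value n_x = 5 are assumptions for illustration):

import numpy as np

def initialize_parameters_deep(layer_dims):
    # W[l] has shape (n[l], n[l-1]); b[l] has shape (n[l], 1).
    parameters = {}
    for i in range(1, len(layer_dims)):
        parameters['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
        parameters['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01  # zeros are also common for b
    return parameters

params = initialize_parameters_deep([5, 4, 3, 2, 1])   # n_x = 5 assumed
print(params['W1'].shape)   # (4, 5)
print(params['b1'].shape)   # (4, 1)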

================================================================================

6. Consider the following neural network.
[Figure: the neural network in question, with 3 hidden layers and an output layer]
How many layers does this network have?

  • The total number of layers L is 4, with 3 hidden layers. (Correct)
  • The total number of layers L is 3, with 3 hidden layers.
  • The total number of layers L is 4, with 4 hidden layers.
  • The total number of layers L is 5, with 3 hidden layers.

As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.

That is, the input layer is not counted toward L, while the output layer is.

================================================================================

7. During forward propagation, in the forward function for a layer l you need to know what is the activation function in a layer (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what is the activation function for layer l, since the gradient depends on it. True/False?

True.

During backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
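
A minimal sketch of why (the helper names are assumed, mirroring but not taken verbatim from the programming assignment): since dZ^{[l]} = dA^{[l]} * g^{[l]}'(Z^{[l]}), the backward step has to apply the derivative of whichever activation g^{[l]} was used in the forward pass.

import numpy as np

def sigmoid_backward(dA, Z):
    # g'(Z) for sigmoid: s * (1 - s)
    s = 1 / (1 + np.exp(-Z))
    return dA * s * (1 - s)

def relu_backward(dA, Z):
    # g'(Z) for ReLU: 1 where Z > 0, else 0
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

def activation_backward(dA, Z, activation):
    # The choice must match the activation used in the forward pass for this layer.
    if activation == "sigmoid":
        return sigmoid_backward(dA, Z)
    elif activation == "relu":
        return relu_backward(dA, Z)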

================================================================================

8. There are certain functions with the following properties:

(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

True.

With only a few layers, the number of units (logic gates) required grows exponentially with the input size. The lecture's example is computing the parity (XOR) of n input bits: a deep tree of XOR gates needs only O(log n) depth and O(n) gates, whereas a circuit with a single hidden layer needs roughly 2^(n-1) gates.

================================================================================

9. Consider the following 2-hidden-layer neural network:
[Figure: a network with n_x = 4 inputs, hidden layers of 4 and 3 units, and a single output unit]

Which of the following statements are True? (Check all that apply.)

Note: only the correct options are shown here; a quick shape check follows the list.

  • W[1] will have shape (4, 4)
  • b[1] will have shape (4, 1)
  • W[2] will have shape (3, 4)
  • b[2] will have shape (3, 1)
  • b[3] will have shape (1, 1)
  • W[3] will have shape (1, 3)
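
A quick shape check, assuming (as the answers imply) n_x = 4 and hidden layers of 4 and 3 units:

import numpy as np

layer_dims = [4, 4, 3, 1]   # n_x = 4, n1 = 4, n2 = 3, n3 = 1 (assumed from the figure)
parameters = {}
for l in range(1, len(layer_dims)):
    parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))

for l in range(1, len(layer_dims)):
    print('W%d' % l, parameters['W' + str(l)].shape, 'b%d' % l, parameters['b' + str(l)].shape)
# W1 (4, 4) b1 (4, 1)
# W2 (3, 4) b2 (3, 1)
# W3 (1, 3) b3 (1, 1)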

================================================================================

10. Whereas the previous question used a specific network, in the general case what is the dimension of W[l], the weight matrix associated with layer l?

  • W[l] has shape (n[l], n[l-1]). (Correct)
  • W[l] has shape (n[l-1], n[l])
  • W[l] has shape (n[l+1], n[l])
  • W[l] has shape (n[l], n[l+1])