Reference: https://zhuanlan.zhihu.com/p/31272216
1. What is the “cache” used for in our implementation of forward propagation and backward propagation?
- It is used to cache the intermediate values of the cost function during training.
- We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives. (Correct)
- It is used to keep track of the hyperparameters that we are searching over, to speed up computation.
- We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.
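A minimal sketch of the cache idea for a single linear layer Z = W·A_prev + b. The function names and the tuple layout of the cache are assumptions of this sketch, not the quiz's reference implementation:

```python
import numpy as np

def linear_forward(A_prev, W, b):
    """Forward step: compute Z and cache the inputs needed later by backprop."""
    Z = W @ A_prev + b
    cache = (A_prev, W, b)  # passed to the corresponding backward step
    return Z, cache

def linear_backward(dZ, cache):
    """Backward step: reuse the cached forward values to compute derivatives."""
    A_prev, W, b = cache
    m = A_prev.shape[1]                          # number of examples
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db
```

Without the cache, the backward step would have to recompute (or could not recover) A_prev and W.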
================================================================================
2. Among the following, which ones are “hyperparameters”? (Check all that apply.)
- number of iterations (Correct)
- bias vectors b (a parameter, not a hyperparameter)
- size of the hidden layers n[l] (Correct)
- weight matrices W (a parameter, not a hyperparameter)
- learning rate α (Correct)
- number of layers L in the neural network (Correct)
================================================================================
3. Which of the following statements is true?
- The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers. (Correct)
- The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.
In a deep network, the later layers build increasingly complex features of the input.
================================================================================
4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?
False.
Forward propagation still needs an explicit loop over the layers; vectorization removes the loop over the training examples within each layer, not the loop across layers.
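A sketch of why the layer loop stays explicit. Each iteration processes the whole batch in one vectorized step, but the layers are still visited one by one (the parameter-dict layout and the ReLU choice are assumptions of this sketch):

```python
import numpy as np

def L_forward(X, parameters, L):
    """Forward pass: vectorized over examples (columns of X),
    but with an unavoidable explicit loop over the L layers."""
    A = X
    for l in range(1, L + 1):           # this loop over layers cannot be vectorized away
        W = parameters['W' + str(l)]
        b = parameters['b' + str(l)]
        Z = W @ A + b                   # one vectorized step for the entire batch
        A = np.maximum(0, Z)            # ReLU, as an illustrative activation
    return A
```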
================================================================================
5. Assume we store the layer sizes in an array called layer_dims, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has 4 hidden units, layer 2 has 3 hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?
- A
for i in range(1, len(layer_dims)/2):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01
- B
for i in range(1, len(layer_dims)/2):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i - 1], 1) * 0.01
- C
for i in range(1, len(layer_dims)):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i - 1], layer_dims[i]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01
- D (Correct)
for i in range(1, len(layer_dims)):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01
On parameter initialization, see https://blog.csdn.net/weixin_42555985/article/details/106041046
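A runnable version of option D. The concrete sizes [5, 4, 3, 2, 1] stand in for the unspecified n_x and are only an example:

```python
import numpy as np

def initialize_parameters(layer_dims):
    """Option D: W[l] has shape (layer_dims[l], layer_dims[l-1]),
    b[l] has shape (layer_dims[l], 1)."""
    parameter = {}
    for i in range(1, len(layer_dims)):
        parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
        parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01
    return parameter

params = initialize_parameters([5, 4, 3, 2, 1])  # assuming n_x = 5
```

Options A and B would also fail in Python 3, because `len(layer_dims)/2` is a float and `range` requires integers.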
================================================================================
6. Consider the following neural network. How many layers does it have?
- The number of layers L is 4, with 3 hidden layers. (Correct)
- The number of layers L is 3, with 3 hidden layers.
- The number of layers L is 4, with 4 hidden layers.
- The number of layers L is 5, with 3 hidden layers.
As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers; the input layer is not counted toward L, while the output layer is.
================================================================================
7. During forward propagation, in the forward function for a layer l you need to know what the activation function in that layer is (sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what the activation function for layer l is, since the gradient depends on it. True/False?
True.
During backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
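A sketch of activation-aware backprop: the backward function dispatches on which activation was recorded during the forward pass. Tagging the cache with a string is an assumption of this sketch, not the course's implementation:

```python
import numpy as np

def activation_forward(Z, kind):
    """Forward: apply the activation and record which one was used."""
    if kind == 'relu':
        A = np.maximum(0, Z)
    elif kind == 'sigmoid':
        A = 1 / (1 + np.exp(-Z))
    else:
        raise ValueError(kind)
    return A, (Z, kind)

def activation_backward(dA, cache):
    """Backward: the derivative depends on which activation ran forward."""
    Z, kind = cache
    if kind == 'relu':
        return dA * (Z > 0)          # ReLU': 1 where Z > 0, else 0
    elif kind == 'sigmoid':
        s = 1 / (1 + np.exp(-Z))
        return dA * s * (1 - s)      # sigmoid': s(1 - s)
    raise ValueError(kind)
```

Using the wrong branch here (e.g. a sigmoid derivative for a ReLU layer) would produce an incorrect gradient, which is exactly why the backward function must know the activation.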
================================================================================
8. There are certain functions with the following properties:
(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) to compute it using a deep network circuit, you need only an exponentially smaller network. True/False?
True.
With fewer layers, the number of gates needed to compute such functions grows exponentially.
================================================================================
9. Consider the following neural network with 2 hidden layers. Which of the following statements are true? (Check all that apply.)
Note: only the correct answers are shown here.
- W[1] will have shape (4, 4)
- b[1] will have shape (4, 1)
- W[2] will have shape (3, 4)
- b[2] will have shape (3, 1)
- b[3] will have shape (1, 1)
- W[3] will have shape (1, 3)
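All six shapes follow one rule: W[l] is (n[l], n[l−1]) and b[l] is (n[l], 1). A quick check, assuming this network's layer sizes are [4, 4, 3, 1] (input 4, hidden 4 and 3, output 1):

```python
# Assumed layer sizes for the network in this question: n[0]=4, n[1]=4, n[2]=3, n[3]=1
layer_dims = [4, 4, 3, 1]

shapes = {}
for l in range(1, len(layer_dims)):
    shapes['W' + str(l)] = (layer_dims[l], layer_dims[l - 1])  # (n[l], n[l-1])
    shapes['b' + str(l)] = (layer_dims[l], 1)                  # (n[l], 1)

print(shapes)
```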
================================================================================
10. Whereas the previous question used a specific network, in the general case what is the dimension of W[l], the weight matrix associated with layer l?
- W[l] has shape (n[l], n[l−1]). (Correct)
- W[l] has shape (n[l−1], n[l])
- W[l] has shape (n[l+1], n[l])
- W[l] has shape (n[l], n[l+1])