2020-6-21 Andrew Ng NN&DL Week 4: Deep Neural Networks (Quiz)

Reference: https://zhuanlan.zhihu.com/p/31272216

1. What is the “cache” used for in our implementation of forward propagation and backward propagation?

  • It is used to cache the intermediate values of the cost function during training.

  • We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives. (Correct; see the sketch after these options.)

  • It is used to keep track of the hyperparameters that we are searching over, to speed up computation.

  • We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.
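
A minimal sketch of how the cache is typically used, with function and variable names assumed here for illustration (they loosely follow the course's programming assignment): the forward step stores the values that the matching backward step will need to compute derivatives.

import numpy as np

def linear_forward(A_prev, W, b):
    # Forward step for one layer; store ("cache") the inputs backprop will need.
    Z = W @ A_prev + b
    cache = (A_prev, W, b)
    return Z, cache

def linear_backward(dZ, cache):
    # Backward step for the same layer; reuse the cached forward values.
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = (dZ @ A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db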

================================================================================

2. Among the following, which ones are “hyperparameters”? (Check all that apply.) A short sketch contrasting hyperparameters with learned parameters follows the options.

  • number of iterations: yes
  • bias vectors b^{[l]}: no (these are parameters learned during training)
  • size of the hidden layers n^{[l]}: yes
  • weight matrices W^{[l]}: no (these are parameters learned during training)
  • learning rate α: yes
  • number of layers L in the neural network: yes
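
A small illustrative sketch of the distinction (the specific values and layer sizes below are made up): hyperparameters are settings we choose before training, while W^{[l]} and b^{[l]} are the parameters that gradient descent learns.

import numpy as np

# Hyperparameters: chosen by us and tuned by experiment.
learning_rate = 0.0075               # α
num_iterations = 2500
layer_dims = [12288, 20, 7, 5, 1]    # sizes n^[l]; their count fixes L

# Parameters: learned during training, not hyperparameters.
parameters = {
    "W1": np.random.randn(layer_dims[1], layer_dims[0]) * 0.01,
    "b1": np.zeros((layer_dims[1], 1)),
}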

================================================================================

3. Which of the following statements is true?

  • The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers. (Correct)
  • The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.

In a deep network, later layers build increasingly complex features out of the simpler ones computed by earlier layers (e.g. edges, then face parts, then whole faces).

================================================================================

4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?

False.

Forward propagation still needs an explicit for-loop over the layers l = 1, 2, …, L; vectorization only removes the explicit loop over the m training examples within each layer.
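
A minimal sketch showing this (the function and parameter names are assumptions loosely following the programming assignment): each layer's computation is fully vectorized over the examples, but the layers themselves are still visited one by one.

import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def L_model_forward(X, parameters):
    # X has shape (n_x, m): all m examples are processed at once (vectorized),
    # but an explicit loop over the layers is still required.
    A = X
    L = len(parameters) // 2            # parameters holds W1, b1, ..., WL, bL
    for l in range(1, L + 1):
        Z = parameters["W" + str(l)] @ A + parameters["b" + str(l)]
        A = relu(Z)                     # (the last layer would normally use sigmoid)
    return A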

================================================================================

5. Assume we store the values for n^{[l]} in an array called layers, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has four hidden units, layer 2 has 3 hidden units and so on. Which of the following for-loops will allow you to initialize the parameters for the model?

  • A
for(i in range(1, len(layer_dims)/2)):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01
  • B
for(i in range(1, len(layer_dims)/2)):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i-1], 1) * 0.01
  • C
for(i in range(1, len(layer_dims))):
    parameter['W' + str(i)] = np.random.randn(layers[i-1], layers[i]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01
  • D (Correct)
for(i in range(1, len(layer_dims))):
    parameter['W' + str(i)] = np.random.randn(layers[i], layers[i - 1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01

For more on parameter initialization, see https://blog.csdn.net/weixin_42555985/article/details/106041046
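
A runnable version of option D as a minimal sketch (the function name and the example value n_x = 5 are assumptions for illustration):

import numpy as np

def initialize_parameters_deep(layer_dims):
    # W[l] has shape (n[l], n[l-1]); b[l] has shape (n[l], 1).
    parameters = {}
    for i in range(1, len(layer_dims)):
        parameters['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
        parameters['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01  # zeros are also common for b
    return parameters

params = initialize_parameters_deep([5, 4, 3, 2, 1])   # n_x = 5 assumed
print(params['W1'].shape)   # (4, 5)
print(params['b1'].shape)   # (4, 1)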

================================================================================

6. Consider the following neural network.
[Figure: the neural network in question, with 3 hidden layers and an output layer]
How many layers does this network have?

  • The total number of layers L is 4, with 3 hidden layers. (Correct)
  • The total number of layers L is 3, with 3 hidden layers.
  • The total number of layers L is 4, with 4 hidden layers.
  • The total number of layers L is 5, with 3 hidden layers.

As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.

That is, the input layer is not counted toward L, while the output layer is.

================================================================================

7. During forward propagation, in the forward function for a layer l you need to know what is the activation function in a layer (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what is the activation function for layer l, since the gradient depends on it. True/False?

True.

During backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
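
A minimal sketch of why (the helper names are assumed, mirroring but not taken verbatim from the programming assignment): since dZ^{[l]} = dA^{[l]} * g^{[l]}'(Z^{[l]}), the backward step has to apply the derivative of whichever activation g^{[l]} was used in the forward pass.

import numpy as np

def sigmoid_backward(dA, Z):
    # g'(Z) for sigmoid: s * (1 - s)
    s = 1 / (1 + np.exp(-Z))
    return dA * s * (1 - s)

def relu_backward(dA, Z):
    # g'(Z) for ReLU: 1 where Z > 0, else 0
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

def activation_backward(dA, Z, activation):
    # The choice must match the activation used in the forward pass for this layer.
    if activation == "sigmoid":
        return sigmoid_backward(dA, Z)
    elif activation == "relu":
        return relu_backward(dA, Z)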

================================================================================

8. There are certain functions with the following properties:

(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

True.

With only a few layers, the number of units (logic gates) required grows exponentially with the input size. The lecture's example is computing the parity (XOR) of n input bits: a deep tree of XOR gates needs only O(log n) depth and O(n) gates, whereas a circuit with a single hidden layer needs roughly 2^(n-1) gates.

================================================================================

9. Consider the following 2-hidden-layer neural network:
[Figure: a network with n_x = 4 inputs, hidden layers of 4 and 3 units, and a single output unit]

Which of the following statements are True? (Check all that apply.)

Note: only the correct options are shown here; a quick shape check follows the list.

  • W[1] will have shape (4, 4)
  • b[1] will have shape (4, 1)
  • W[2] will have shape (3, 4)
  • b[2] will have shape (3, 1)
  • b[3] will have shape (1, 1)
  • W[3] will have shape (1, 3)
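
A quick shape check, assuming (as the answers imply) n_x = 4 and hidden layers of 4 and 3 units:

import numpy as np

layer_dims = [4, 4, 3, 1]   # n_x = 4, n1 = 4, n2 = 3, n3 = 1 (assumed from the figure)
parameters = {}
for l in range(1, len(layer_dims)):
    parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))

for l in range(1, len(layer_dims)):
    print('W%d' % l, parameters['W' + str(l)].shape, 'b%d' % l, parameters['b' + str(l)].shape)
# W1 (4, 4) b1 (4, 1)
# W2 (3, 4) b2 (3, 1)
# W3 (1, 3) b3 (1, 1)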

================================================================================

10. Whereas the previous question used a specific network, in the general case what is the dimension of W[l], the weight matrix associated with layer l?

  • W[l] has shape (n[l], n[l-1]). (Correct)
  • W[l] has shape (n[l-1], n[l])
  • W[l] has shape (n[l+1], n[l])
  • W[l] has shape (n[l], n[l+1])