DeepLearning-L8-ResNet

ResNet


In 2015, Kaiming He et al. proposed ResNet in "Deep Residual Learning for Image Recognition", pushing network depth to 152 layers and winning ILSVRC 2015.

1. The problem with deep networks

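Deep networks can represent very complex functions, but during backpropagation the gradient gradually vanishes. With a sigmoid activation, for example, the derivative is at most 0.25 (reached at an input of 0), so every layer the gradient passes back through scales it by 0.25 or less from the activation alone, and the more layers there are, the stronger the decay. As a result, the weights of the early layers cannot be adjusted effectively.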
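The decay described above can be checked numerically. The following is a minimal sketch that looks only at the sigmoid derivative and ignores the weight matrices, which also scale the gradient in a real network; the function names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s(x) * (1 - s(x)).
    s = sigmoid(x)
    return s * (1.0 - s)

# The derivative peaks at x = 0, where it equals 0.25.
max_grad = sigmoid_grad(0.0)

def upper_bound_after(n_layers):
    """Best-case factor contributed by n stacked sigmoid activations."""
    return max_grad ** n_layers

for n in (1, 5, 10, 20):
    print(n, upper_bound_after(n))
```

Even in this best case the factor falls below one millionth after ten layers, which is why the early layers of a plain deep sigmoid network receive almost no gradient.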

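As layers are added, model accuracy at first keeps improving, but once the depth passes a certain point, both training accuracy and test accuracy drop quickly. This shows that very deep networks become harder to train, so deeper is not always better.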

2. Basic ResNet building blocks

ResNets use a "shortcut", also called a "skip connection", which lets the gradient propagate directly to earlier layers.
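To see why a skip connection helps, compare the derivative of a stack of layers x -> f(x) with that of a stack of layers x -> x + f(x). This is a scalar toy sketch, not a real network: the branch `f` (a scaled tanh) and both helper functions are hypothetical and chosen only so the numbers are easy to follow.

```python
import math

def f(x):
    # Hypothetical residual branch with a small slope (at most 0.1).
    return 0.1 * math.tanh(x)

def f_grad(x):
    return 0.1 * (1.0 - math.tanh(x) ** 2)

def plain_chain_grad(x, depth):
    """d(output)/d(input) for `depth` stacked layers x -> f(x)."""
    grad = 1.0
    for _ in range(depth):
        grad *= f_grad(x)
        x = f(x)
    return grad

def residual_chain_grad(x, depth):
    """Same quantity when every layer is x -> x + f(x)."""
    grad = 1.0
    for _ in range(depth):
        grad *= 1.0 + f_grad(x)  # the "1" comes from the skip connection
        x = x + f(x)
    return grad

print(plain_chain_grad(0.5, 20))
print(residual_chain_grad(0.5, 20))
```

In the plain stack every factor is at most 0.1, so the product collapses towards zero. In the residual stack every factor carries an extra 1 from the shortcut, so the gradient reaches the first layer without being crushed.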

（1）The identity block

The input and output dimensions are the same.

Identity block. Skip connection "skips over" 2 layers.

Identity block. Skip connection "skips over" 3 layers.

First component of main path:

The first CONV2D has $F_1$ filters of shape (1,1) and a stride of (1,1). Its padding is “valid” and its name should be conv_name_base + '2a'. Use 0 as the seed for the random initialization.
The first BatchNorm is normalizing the channels axis. Its name should be bn_name_base + '2a'.
Then apply the ReLU activation function. This has no name and no hyperparameters.

Second component of main path:

The second CONV2D has $F_2$ filters of shape $(f,f)$ and a stride of (1,1). Its padding is “same” and its name should be conv_name_base + '2b'. Use 0 as the seed for the random initialization.
The second BatchNorm is normalizing the channels axis. Its name should be bn_name_base + '2b'.
Then apply the ReLU activation function. This has no name and no hyperparameters.

Third component of main path:

The third CONV2D has $F_3$ filters of shape (1,1) and a stride of (1,1). Its padding is “valid” and its name should be conv_name_base + '2c'. Use 0 as the seed for the random initialization.
The third BatchNorm is normalizing the channels axis. Its name should be bn_name_base + '2c'. Note that there is no ReLU activation function in this component.

Final step:

The shortcut (the block's unchanged input) and the output of the main path are added together.
Then apply the ReLU activation function. This has no name and no hyperparameters.
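Because the shortcut is added element-wise, the main path of an identity block has to return a tensor with exactly the input's shape. The sketch below traces only shapes through the three convolutions listed above, assuming Keras-style "valid" and "same" padding rules. BatchNorm and ReLU do not change shapes and are omitted, and both helper names are hypothetical.

```python
def conv2d_output_shape(h, w, kernel, stride, padding, filters):
    """Output (height, width, channels) of a 2D convolution."""
    if padding == "same":
        out_h = -(-h // stride)  # ceil(h / stride)
        out_w = -(-w // stride)
    elif padding == "valid":
        out_h = (h - kernel) // stride + 1
        out_w = (w - kernel) // stride + 1
    else:
        raise ValueError(f"unknown padding: {padding}")
    return out_h, out_w, filters

def identity_block_shape(input_shape, f, filters):
    """Trace shapes through components 2a, 2b, 2c of the identity block."""
    h, w, _ = input_shape
    F1, F2, F3 = filters
    h, w, c = conv2d_output_shape(h, w, 1, 1, "valid", F1)  # 2a: 1x1, valid
    h, w, c = conv2d_output_shape(h, w, f, 1, "same", F2)   # 2b: fxf, same
    h, w, c = conv2d_output_shape(h, w, 1, 1, "valid", F3)  # 2c: 1x1, valid
    main_path = (h, w, c)
    if main_path != tuple(input_shape):
        raise ValueError(f"cannot add shortcut {input_shape} to {main_path}")
    return main_path

print(identity_block_shape((15, 15, 256), f=3, filters=(64, 64, 256)))
```

The check fails whenever $F_3$ differs from the number of input channels, which is the case the convolutional block is meant to handle.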

（2）The convolutional block

The input and output dimensions differ.

First component of main path:

The first CONV2D has $F_1$ filters of shape (1,1) and a stride of (s,s). Its padding is “valid” and its name should be conv_name_base + '2a'.
The first BatchNorm is normalizing the channels axis. Its name should be bn_name_base + '2a'.
Then apply the ReLU activation function. This has no name and no hyperparameters.

Second component of main path:

The second CONV2D has $F_2$ filters of shape (f,f) and a stride of (1,1). Its padding is “same” and its name should be conv_name_base + '2b'.
The second BatchNorm is normalizing the channels axis. Its name should be bn_name_base + '2b'.
Then apply the ReLU activation function. This has no name and no hyperparameters.

Third component of main path:

The third CONV2D has $F_3$ filters of shape (1,1) and a stride of (1,1). Its padding is “valid” and its name should be conv_name_base + '2c'.
The third BatchNorm is normalizing the channels axis. Its name should be bn_name_base + '2c'. Note that there is no ReLU activation function in this component.

Shortcut path:

The CONV2D has $F_3$ filters of shape (1,1) and a stride of (s,s). Its padding is “valid” and its name should be conv_name_base + '1'.
The BatchNorm is normalizing the channels axis. Its name should be bn_name_base + '1'.

Final step:

The shortcut and the main path values are added together.
Then apply the ReLU activation function. This has no name and no hyperparameters.
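In the convolutional block the shortcut gets its own 1x1 convolution with stride (s,s) and $F_3$ filters, so that it ends up with the same shape as the main path. The following shape-only sketch makes that explicit, assuming Keras-style padding rules; the helper names are hypothetical, and BatchNorm and ReLU are left out because they do not change shapes.

```python
def conv2d_output_shape(h, w, kernel, stride, padding, filters):
    """Output (height, width, channels) of a 2D convolution."""
    if padding == "same":
        return -(-h // stride), -(-w // stride), filters
    if padding == "valid":
        return (h - kernel) // stride + 1, (w - kernel) // stride + 1, filters
    raise ValueError(f"unknown padding: {padding}")

def convolutional_block_shape(input_shape, f, filters, s):
    """Trace shapes through the main path and the shortcut path."""
    in_h, in_w, _ = input_shape
    F1, F2, F3 = filters

    # Main path: 2a (1x1, stride s), 2b (fxf, stride 1), 2c (1x1, stride 1).
    h, w, c = conv2d_output_shape(in_h, in_w, 1, s, "valid", F1)
    h, w, c = conv2d_output_shape(h, w, f, 1, "same", F2)
    main_path = conv2d_output_shape(h, w, 1, 1, "valid", F3)

    # Shortcut path: a single 1x1 convolution with stride s and F3 filters.
    shortcut = conv2d_output_shape(in_h, in_w, 1, s, "valid", F3)

    assert main_path == shortcut, "the two paths must match before the addition"
    return main_path

print(convolutional_block_shape((15, 15, 256), f=3, filters=(128, 128, 512), s=2))
```

With s = 2 the spatial size is roughly halved while the channel count changes from 256 to 512, and both paths agree, so the final addition is well defined.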

3. ResNet network architecture

ResNet-50 model

Zero-padding pads the input with a pad of (3,3)
Stage 1:
- The 2D Convolution has 64 filters of shape (7,7) and uses a stride of (2,2). Its name is “conv1”.
- BatchNorm is applied to the channels axis of the input.
- MaxPooling uses a (3,3) window and a (2,2) stride.
Stage 2:
- The convolutional block uses three sets of filters of size [64,64,256], “f” is 3, “s” is 1 and the block is “a”.
- The 2 identity blocks use three sets of filters of size [64,64,256], “f” is 3 and the blocks are “b” and “c”.
Stage 3:
- The convolutional block uses three sets of filters of size [128,128,512], “f” is 3, “s” is 2 and the block is “a”.
- The 3 identity blocks use three sets of filters of size [128,128,512], “f” is 3 and the blocks are “b”, “c” and “d”.
Stage 4:
- The convolutional block uses three sets of filters of size [256, 256, 1024], “f” is 3, “s” is 2 and the block is “a”.
- The 5 identity blocks use three sets of filters of size [256, 256, 1024], “f” is 3 and the blocks are “b”, “c”, “d”, “e” and “f”.
Stage 5:
- The convolutional block uses three sets of filters of size [512, 512, 2048], “f” is 3, “s” is 2 and the block is “a”.
- The 2 identity blocks use three sets of filters of size [512, 512, 2048], “f” is 3 and the blocks are “b” and “c”.
The 2D Average Pooling uses a window of shape (2,2) and its name is “avg_pool”.
The flatten doesn’t have any hyperparameters or name.
The Fully Connected (Dense) layer reduces its input to the number of classes using a softmax activation. Its name should be 'fc' + str(classes).
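The stage list above can be turned into a quick sanity check of where the "50" comes from and how the feature map shrinks. This is a sketch under stated assumptions: the 64x64 input size is hypothetical, the max pooling is assumed to use "valid" padding, the average pooling stride is assumed to equal its window, and the layer count follows the usual convention of not counting the shortcut convolutions.

```python
STAGES = [
    # (filters, stride s of the convolutional block, number of identity blocks)
    ((64, 64, 256), 1, 2),      # stage 2
    ((128, 128, 512), 2, 3),    # stage 3
    ((256, 256, 1024), 2, 5),   # stage 4
    ((512, 512, 2048), 2, 2),   # stage 5
]

def valid_out(size, window, stride):
    """Output size of a 'valid' convolution or pooling along one axis."""
    return (size - window) // stride + 1

def resnet50_trace(height, width):
    """Return (name, (h, w, channels)) after each stage and the average pooling."""
    h, w = height + 2 * 3, width + 2 * 3             # zero-padding (3,3)
    h, w = valid_out(h, 7, 2), valid_out(w, 7, 2)    # conv1: 7x7, stride 2
    h, w = valid_out(h, 3, 2), valid_out(w, 3, 2)    # max pool: 3x3, stride 2
    trace = [("stage1", (h, w, 64))]
    for i, (filters, s, _) in enumerate(STAGES, start=2):
        # Only the convolutional block's stride changes the spatial size.
        h, w = valid_out(h, 1, s), valid_out(w, 1, s)
        trace.append((f"stage{i}", (h, w, filters[2])))
    h, w = valid_out(h, 2, 2), valid_out(w, 2, 2)    # avg_pool: 2x2 window
    trace.append(("avg_pool", (h, w, STAGES[-1][0][2])))
    return trace

def weighted_layer_count():
    """conv1 + three convolutions per block + the final Dense layer."""
    blocks = sum(1 + n_identity for _, _, n_identity in STAGES)
    return 1 + 3 * blocks + 1

for name, shape in resnet50_trace(64, 64):
    print(name, shape)
print(weighted_layer_count())
```

Under these assumptions the 16 blocks contribute 48 convolutions, which together with conv1 and the Dense layer gives 50, and flattening the final 1x1x2048 feature map yields the 2048 values fed to the Dense layer.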

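"Identity Mappings in Deep Residual Networks" proposed ResNet V2. By studying the propagation formula of the ResNet residual learning unit, the authors found that the forward and backward signals can be passed directly from one unit to another, so the nonlinear activation (such as ReLU) that sits on the "shortcut connection" path after the addition is replaced with an identity mapping. ResNet V2 also uses Batch Normalization in every layer. With these changes, the new residual learning unit is easier to train than the original and generalizes better.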
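The propagation formula mentioned here can be written out as a short derivation. The notation is an assumption, not taken from this post: $x_l$ is the input of the $l$-th residual unit, $\mathcal{F}$ its residual branch with weights $W_l$, and $\mathcal{E}$ the loss. It also assumes that both the shortcut and the step after the addition are identity mappings.

$$
x_{l+1} = x_l + \mathcal{F}(x_l, W_l)
$$

$$
x_L = x_l + \sum_{i=l}^{L-1} \mathcal{F}(x_i, W_i)
$$

$$
\frac{\partial \mathcal{E}}{\partial x_l} = \frac{\partial \mathcal{E}}{\partial x_L}\left(1 + \frac{\partial}{\partial x_l}\sum_{i=l}^{L-1} \mathcal{F}(x_i, W_i)\right)
$$

The term 1 in the last line is the part of the gradient that travels from the deep unit $L$ straight back to the shallow unit $l$ without passing through any weight layer, which is the sense in which the signal is "passed directly".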
