CS231n
Lecture 9: CNN Architectures
Case Studies
LeNet-5
AlexNet
ZFNet
VGGNet
Smaller filters: a stack of three 3x3 conv layers (stride 1) has the same effective receptive field as one 7x7 conv layer, but is deeper, has more non-linearities, and uses fewer parameters
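Rough parameter-count check of that claim (the channel count C = 64 is an arbitrary illustration, not from the slides):

```python
# One 7x7 conv vs three stacked 3x3 convs, C input and C output
# channels each, biases ignored. Same 7x7 effective receptive field.
C = 64                                   # illustrative channel count
params_7x7 = 7 * 7 * C * C               # 49*C^2 = 200704
params_three_3x3 = 3 * (3 * 3 * C * C)   # 27*C^2 = 110592
print(params_7x7, params_three_3x3)
```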
GoogLeNet
- “Inception” module: a good local network topology (network within a network)
- No FC layers
- 12x fewer parameters than AlexNet
- 1x1 conv “bottleneck” layers reduce the computational cost
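A minimal PyTorch sketch of such a module; the per-branch channel sizes are illustrative (they happen to match a common early-stage configuration), and ReLUs are omitted for brevity:

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Inception module sketch: four parallel branches whose outputs
    are concatenated depth-wise; 1x1 convs shrink the channel count
    before the expensive 3x3/5x5 convs."""
    def __init__(self, in_ch=192):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, 64, kernel_size=1)
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, 96, kernel_size=1),              # bottleneck
            nn.Conv2d(96, 128, kernel_size=3, padding=1))
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, 16, kernel_size=1),              # bottleneck
            nn.Conv2d(16, 32, kernel_size=5, padding=2))
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, 32, kernel_size=1))              # 1x1 after pool

    def forward(self, x):
        # concatenate branch outputs along the channel dimension
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

# 64 + 128 + 32 + 32 = 256 output channels, spatial size unchanged
out = InceptionModule()(torch.randn(1, 192, 28, 28))
print(out.shape)  # torch.Size([1, 256, 28, 28])
```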
ResNet
degradation problem: deeper “plain” networks get higher training error than shallower ones, so extra layers struggle even to learn the identity; residual blocks instead fit a residual F(x) = H(x) - x and output F(x) + x
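A minimal sketch of the basic two-conv residual block, assuming stride 1 and matching channel counts (real ResNets use a projection shortcut when dimensions change):

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Two 3x3 convs compute the residual F(x); the identity
    shortcut adds x, so the block outputs F(x) + x."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # identity shortcut
```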
Network in Network (NiN): micro-network (“mlpconv”, i.e. 1x1 convs) within each conv layer; precursor to GoogLeNet and bottleneck layers
Identity Mappings in Deep Residual Networks
Moves activation (BN, ReLU) to the residual mapping pathway (“pre-activation”), creating a more direct path for propagating information throughout the network
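A sketch of the resulting pre-activation block (channel counts assumed to match across the shortcut):

```python
import torch
import torch.nn as nn

class PreActResidualBlock(nn.Module):
    """Pre-activation block: BN and ReLU come before each conv, and
    nothing follows the addition, so the shortcut path is a pure
    identity from block input to block output."""
    def __init__(self, ch):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(torch.relu(self.bn1(x)))
        out = self.conv2(torch.relu(self.bn2(out)))
        return out + x  # no ReLU after the addition
```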
Wide Residual Networks
- residuals are the important factor, not depth
- 50-layer wide ResNet outperforms 152-layer original ResNet
- Increasing width instead of depth is more computationally efficient (more parallelizable); see the arithmetic below
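Rough arithmetic behind the width-vs-depth tradeoff (C = 64 and widening factor k = 2 are arbitrary choices):

```python
# A 3x3 conv layer with C in/out channels costs 9*C^2 parameters.
# Widening by factor k scales both in and out channels, so one
# layer's cost grows by k^2, while an extra layer of the original
# width adds only 9*C^2: at equal parameter budget, the wide net
# does its work in fewer, larger (more GPU-friendly) layers.
C, k = 64, 2
narrow_layer = 9 * C * C
print(narrow_layer * k * k)  # one layer widened by k=2: 147456
print(narrow_layer * 4)      # four layers at original width: 147456
```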
ResNeXt
ResNet + Inception? Widens the residual block with multiple parallel pathways; “cardinality” (number of branches) is implemented as a grouped convolution
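A sketch of a ResNeXt-style block; the 256-channel, 32x4 configuration is just one common choice, and BN is omitted for brevity:

```python
import torch
import torch.nn as nn

class ResNeXtBlock(nn.Module):
    """Bottleneck residual block whose 3x3 conv is split into
    `cardinality` parallel pathways via grouped convolution."""
    def __init__(self, ch=256, cardinality=32, group_width=4):
        super().__init__()
        d = cardinality * group_width          # 32*4 = 128 internal channels
        self.reduce = nn.Conv2d(ch, d, 1, bias=False)
        self.grouped = nn.Conv2d(d, d, 3, padding=1,
                                 groups=cardinality, bias=False)
        self.expand = nn.Conv2d(d, ch, 1, bias=False)

    def forward(self, x):
        out = torch.relu(self.reduce(x))
        out = torch.relu(self.grouped(out))    # parallel pathways
        return torch.relu(self.expand(out) + x)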
Deep Networks with Stochastic Depth
- Reduces vanishing gradients and training time: the network is effectively shorter during training
- Randomly drop a subset of layers during each training pass
- Bypass with identity function
- Use full deep network at test time
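A minimal sketch of the train/test behavior (a fixed survival probability p is an assumption; the paper lets p decay linearly with depth):

```python
import torch
import torch.nn as nn

class StochasticDepthWrapper(nn.Module):
    """Wraps a residual function F: during training, F is skipped
    with probability 1 - p (the shortcut acts as identity); at test
    time the full depth runs with F scaled by its survival prob p."""
    def __init__(self, block, p=0.8):
        super().__init__()
        self.block, self.p = block, p

    def forward(self, x):
        if self.training:
            if torch.rand(1).item() < self.p:
                return x + self.block(x)    # block survives this pass
            return x                        # block dropped: identity
        return x + self.p * self.block(x)   # expectation at test time
```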
FractalNet
Somewhat like a scattering network?
- Argues that the key is transitioning effectively from shallow to deep, and that residual representations are not necessary
- Fractal architecture with both shallow and deep paths to output
- Trained with dropping out sub-paths
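A recursive sketch of the fractal expansion rule (the drop-path regularization from the bullet above is not shown):

```python
import torch
import torch.nn as nn

class FractalBlock(nn.Module):
    """Fractal expansion rule: f_1(x) = conv(x), and
    f_{c+1}(x) = mean(conv(x), f_c(f_c(x))), interleaving a shallow
    path with a twice-as-deep path, with no residual connections."""
    def __init__(self, ch, depth):
        super().__init__()
        self.conv = nn.Conv2d(ch, ch, 3, padding=1)
        self.deep = (nn.Sequential(FractalBlock(ch, depth - 1),
                                   FractalBlock(ch, depth - 1))
                     if depth > 1 else None)

    def forward(self, x):
        shallow = torch.relu(self.conv(x))
        if self.deep is None:
            return shallow
        return (shallow + self.deep(x)) / 2  # join = elementwise mean
```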
DenseNet
- Dense blocks: each layer is connected to every other layer in feedforward fashion (feature maps concatenated)
- Alleviates vanishing gradient, strengthens feature propagation, encourages feature reuse
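A sketch of a single dense block; growth_rate and num_layers are illustrative hyperparameters:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer consumes the concatenation of all earlier feature
    maps and contributes `growth_rate` new channels, giving every
    layer a short path to the input and the loss."""
    def __init__(self, in_ch, growth_rate=12, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth_rate, 3, padding=1, bias=False)))
            ch += growth_rate

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # each layer sees all previous feature maps, concatenated
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)
```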
SqueezeNet
Similar in spirit to the bottleneck idea
- Fire modules consisting of a ‘squeeze’ layer with 1x1 filters feeding an ‘expand’ layer with 1x1 and 3x3 filters
- AlexNet-level accuracy on ImageNet with 50x fewer parameters
- With model compression, fits in under 0.5 MB
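A minimal sketch of a Fire module; the squeeze/expand channel sizes are illustrative:

```python
import torch
import torch.nn as nn

class FireModule(nn.Module):
    """A 1x1 'squeeze' layer cuts the channel count, then parallel
    1x1 and 3x3 'expand' layers are concatenated; the squeeze keeps
    the 3x3 conv's input (and parameter count) small."""
    def __init__(self, in_ch, squeeze_ch=16, expand_ch=64):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, 1)
        self.expand1 = nn.Conv2d(squeeze_ch, expand_ch, 1)
        self.expand3 = nn.Conv2d(squeeze_ch, expand_ch, 3, padding=1)

    def forward(self, x):
        s = torch.relu(self.squeeze(x))
        return torch.cat([torch.relu(self.expand1(s)),
                          torch.relu(self.expand3(s))], dim=1)
```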