This article briefly traces the development of convolutional neural networks and the key papers along the way. Of these papers, some stand out for their ingenious ideas, some for their excellent results, and some for their rigorous methods and elegant reasoning. I hope to find time to write up detailed readings of individual papers.
GitHub repository for the paper collection: CNN-Papers
Precursors and early development of convolutional neural networks:
- 1980: Japanese researcher Kunihiko Fukushima proposed the Neocognitron model
Fukushima K, Miyake S. Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition[M]. Competition and cooperation in neural nets. Springer, Berlin, Heidelberg, 1982: 267-285.
- 1989: Yann LeCun proposed the first CNN in the modern sense: LeNet 1989
LeCun Y, Boser B, Denker J S, et al. Backpropagation applied to handwritten zip code recognition[J]. Neural Computation, 1989, 1(4): 541-551.
- 1998: Yann LeCun described LeNet (also known as LeNet-5) in detail in a journal paper that proved hugely influential
LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
Since 2012, convolutional neural networks have entered a period of explosive development:
- 2012 ILSVRC (classification) winner: AlexNet, which ignited the deep learning frenzy in computer vision
Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]. Advances in Neural Information Processing Systems. 2012: 1097-1105.
- 2013 ILSVRC (classification) winner: ZFNet
Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]. European Conference on Computer Vision. Springer, Cham, 2014: 818-833.
- 2014 ILSVRC (classification) winner: GoogLeNet, which introduced the Inception structure
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]. CVPR, 2015.
- 2014 ILSVRC (classification) runner-up: VGGNet, notable for its systematic study of network depth
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.
- 2015 ILSVRC (classification) winner: ResNet, which introduced the Residual structure
He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.
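The core idea behind the Residual structure — learn a residual function F(x) and add the input back through an identity shortcut — fits in a few lines of NumPy. This is a toy sketch, not the paper's implementation; the `residual_block` name and the two-matrix form of F are illustrative assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual block: y = relu(F(x) + x), where F is two linear
    layers with a ReLU in between. The identity shortcut lets signal
    (and gradients) bypass F, which is what makes very deep networks
    trainable."""
    f = relu(x @ w1) @ w2   # the residual function F(x)
    return relu(f + x)      # shortcut: add the input back, then activate

# If F learns to output zeros, the block degenerates to an identity
# mapping, so extra depth can never hurt a deeper network in principle.
x = np.array([1.0, 2.0, 3.0])
w_zero = np.zeros((3, 3))
y = residual_block(x, w_zero, w_zero)
print(y)  # [1. 2. 3.] — F(x) = 0, so the block passes x through
```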
Combined-refinement and bottleneck phase of convolutional neural networks:
Networks that sensibly combine the Inception structure with the Residual structure can already achieve satisfactory feature extraction, but interpretability saw no comparable progress.
- 2016: A Google team combined the Inception structure with the Residual structure, proposing Inception-ResNet
Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, Inception-ResNet and the impact of residual connections on learning[C]. AAAI. 2017, 4: 12.
- 2016: Kaiming He proposed a new perspective on ResNet: identity mappings
He K, Zhang X, Ren S, et al. Identity mappings in deep residual networks[C]. European Conference on Computer Vision. Springer, Cham, 2016: 630-645.
- 2017: DenseNet
Huang G, Liu Z, Weinberger K Q, et al. Densely connected convolutional networks[C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 1(2): 3.
- 2017 ILSVRC (classification) winner: SENet (Squeeze-and-Excitation Networks), which introduced the Squeeze-and-Excitation Block; the network combines SE Blocks with Res Blocks
Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 7132-7141.
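The Squeeze-and-Excitation Block is simple enough to sketch in NumPy: "squeeze" each channel to a scalar by global average pooling, "excite" through a small two-layer bottleneck, and rescale the channels with the resulting sigmoid gates. A schematic sketch only, not the paper's code; the `se_block` name, random weights, and reduction ratio value are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation on a (C, H, W) feature map.
    w1: (C, C//r) reduction weights; w2: (C//r, C) expansion weights."""
    # Squeeze: global average pooling -> one descriptor per channel
    z = feat.mean(axis=(1, 2))                  # shape (C,)
    # Excite: FC -> ReLU -> FC -> sigmoid yields per-channel gates in (0, 1)
    s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)   # shape (C,)
    # Rescale: reweight each channel of the input feature map
    return feat * s[:, None, None]

rng = np.random.default_rng(0)
C, r = 8, 4
feat = rng.standard_normal((C, 6, 6))
out = se_block(feat,
               rng.standard_normal((C, C // r)),
               rng.standard_normal((C // r, C)))
print(out.shape)  # (8, 6, 6): same shape, channels rescaled
```

The block is cheap (two small FC layers per stage) and drops into a Res Block unchanged, which is how SENet combines the two.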
Lightweight convolutional neural network phase:
Since 2016, convolutional neural networks have moved toward lightweight designs, paving the way for deep vision models to run on mobile devices.
- 2016: MobileNet
Howard A G, Zhu M, Chen B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.
- 2016: ShuffleNet
Zhang X, Zhou X, Lin M, et al. ShuffleNet: An extremely efficient convolutional neural network for mobile devices[J]. arXiv preprint arXiv:1707.01083, 2017.
- 2016: Xception [Note: Xception's goal is not to make CNNs lightweight, but to improve performance without increasing network complexity; the depthwise convolution idea it employs, however, is key to MobileNet and other lightweight CNNs, so it is listed here]
Chollet F. Xception: Deep learning with depthwise separable convolutions[J]. arXiv preprint arXiv:1610.02357, 2017.
- 2016: ResNeXt [Note: ResNeXt likewise aims to improve performance without increasing network complexity; it is listed here for the same reason as Xception]
Xie S, Girshick R, Dollár P, et al. Aggregated residual transformations for deep neural networks[C]. Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on. IEEE, 2017: 5987-5995.
- 2018: MobileNet V2
Sandler M, Howard A, Zhu M, et al. MobileNetV2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.
- 2018: ESPNet [Note: the ESPNet paper is not purely about a CNN architecture; it targets semantic segmentation, but its CNN is also lightweight]
Mehta S, Rastegari M, Caspi A, et al. ESPNet: Efficient spatial pyramid of dilated convolutions for semantic segmentation[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 552-568.
- 2018: ShuffleNet V2
Ma N, Zhang X, Zheng H T, et al. ShuffleNet V2: Practical guidelines for efficient CNN architecture design[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 116-131.
- 2018: ESPNetV2
Mehta S, Rastegari M, Shapiro L, et al. ESPNetv2: A light-weight, power efficient, and general purpose convolutional neural network[J]. arXiv preprint arXiv:1811.11431, 2018.
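The depthwise separable convolution shared by Xception, MobileNet, and their successors factors a standard convolution into a per-channel spatial (depthwise) convolution followed by a 1x1 (pointwise) convolution that mixes channels, cutting the multiply count sharply. A minimal NumPy sketch (stride 1, no padding; the function names are illustrative, not from any library):

```python
import numpy as np

def depthwise_conv(x, dw_kernels):
    """Depthwise step: each input channel is convolved with its OWN
    k x k kernel; no cross-channel mixing happens here.
    x: (C, H, W), dw_kernels: (C, k, k) -> (C, H-k+1, W-k+1)."""
    C, H, W = x.shape
    k = dw_kernels.shape[1]
    out = np.zeros((C, H - k + 1, W - k + 1))
    for c in range(C):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[c, i, j] = np.sum(x[c, i:i+k, j:j+k] * dw_kernels[c])
    return out

def pointwise_conv(x, pw_weights):
    """Pointwise step: a 1x1 convolution that mixes channels.
    x: (C, H, W), pw_weights: (C_out, C) -> (C_out, H, W)."""
    return np.einsum('oc,chw->ohw', pw_weights, x)

# A standard 3x3 conv from 16 to 32 channels costs 32*16*3*3 = 4608
# multiplies per output position; the separable version costs only
# 16*3*3 + 32*16 = 656 — roughly a 7x reduction here.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8, 8))
y = pointwise_conv(depthwise_conv(x, rng.standard_normal((16, 3, 3))),
                   rng.standard_normal((32, 16)))
print(y.shape)  # (32, 6, 6)
```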