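Caffe: Deep Learning in Practice

Because of a work handover, I need to document how Caffe is used and how it is structured. Several classmates have asked me about the same things, so I decided to write a short tutorial in this post for everyone's reference.
This post briefly covers the following topics: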

What can Caffe do?
Why choose Caffe?
Environment
Overall structure
Protocol buffer
Basic training workflow
Training in Python
Debug

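What can Caffe do?

Define network structures
Train networks
Implemented in C++/CUDA
Command-line, Python, and Matlab interfaces
CPU and GPU modes
Ships with some reference models and pretrained weights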

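Why choose Caffe?

Well modularized
Simple: changing the network structure requires no code changes
Open source: the code base is maintained jointly by the community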

Environment:

$ lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 12.04.4 LTS
Release: 12.04
Codename: precise
$ cat /proc/version
Linux version 3.2.0-29-generic (buildd@allspice) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012
Vim + Taglist + Cscope

Overall structure:

Let CAFFE denote the Caffe root directory. Caffe's core code lives under $CAFFE/src/caffe and consists mainly of the following parts: net, blob, layer, solver.

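net.cpp:
net defines the network, which contains many layers. net.cpp drives the forward and backward passes of the whole network during training, i.e., it computes each layer's output in the forward pass and each layer's gradient in the backward pass.
layers:
The layers live in $CAFFE/src/caffe/layers. When a layer is used in a protocol buffer definition (message types are defined in .proto files; message values are defined in .prototxt or .binaryproto files), it carries these attributes: name, type (data/conv/pool…), connection structure (input blobs and output blobs), and layer-specific parameters (such as the kernel size of a conv layer). Defining a layer means defining its setup, forward, and backward steps.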
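To make the setup/forward/backward split concrete, here is a minimal pure-Python sketch of a ReLU-style layer. The class name ToyReLULayer and its method signatures are hypothetical and only mirror the three steps named above; Caffe's real layers are C++ classes that operate on blobs.

```python
class ToyReLULayer:
    """Toy layer showing the setup/forward/backward split (not Caffe's API)."""

    def setup(self, bottom):
        # Called once: record the input size so the output can match it.
        self.count = len(bottom)

    def forward(self, bottom):
        # top = max(bottom, 0), element by element.
        self.bottom = list(bottom)  # cache the input for the backward pass
        return [max(x, 0.0) for x in bottom]

    def backward(self, top_diff):
        # The gradient passes through where the input was positive, else it is 0.
        return [d if x > 0 else 0.0 for x, d in zip(self.bottom, top_diff)]


layer = ToyReLULayer()
layer.setup([-1.0, 2.0, 3.0])
top = layer.forward([-1.0, 2.0, 3.0])
bottom_diff = layer.backward([0.5, 0.5, 0.5])
```

The backward step needs the input seen in the forward step, which is why the sketch caches it; the same dependency is what makes the forward pass come first in training.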
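blob.cpp:
Data and derivatives are passed through the net as 4-dimensional blobs. A layer has several blobs, e.g.:
- For data, the blob size is Number * Channels * Height * Width, e.g., 256*3*224*224;
- For a conv layer, the weight blob size is number of outputs * number of inputs * Height * Width; e.g., the weight blob of AlexNet's first conv layer is 96 x 3 x 11 x 11;
- For an inner product layer, the weight blob size is 1 * 1 * number of outputs * number of inputs, and the bias blob size is 1 * 1 * 1 * number of outputs. (A conv layer, like an inner product layer, has both a weight and a bias, which is why a network definition contains two blobs_lr entries: the first is for the weights, the second for the bias. Similarly there are two weight_decay entries, one for the weights and one for the bias.)

  In a blob, cpu_data()/gpu_data() and mutable_cpu_data()/mutable_gpu_data() give read-only and writable access to the data memory, while cpu_diff()/gpu_diff() and mutable_cpu_diff()/mutable_gpu_diff() do the same for the derivatives.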
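The shapes above translate directly into element counts and memory use. The following is a small sketch, assuming float32 storage (4 bytes per value) and a row-major N, C, H, W layout; the helper names blob_count and blob_offset are hypothetical.

```python
def blob_count(shape):
    """Number of elements in a blob of shape (N, C, H, W)."""
    n, c, h, w = shape
    return n * c * h * w


def blob_offset(shape, n, c, h, w):
    """Flat index of element (n, c, h, w) in a row-major 4-D blob."""
    _, channels, height, width = shape
    return ((n * channels + c) * height + h) * width + w


data_shape = (256, 3, 224, 224)  # data blob from the text
conv1_shape = (96, 3, 11, 11)    # AlexNet conv1 weight blob from the text

BYTES_PER_FLOAT32 = 4            # assumption: single-precision storage
data_bytes = blob_count(data_shape) * BYTES_PER_FLOAT32
conv1_params = blob_count(conv1_shape)

print("data blob: %d values, %.1f MiB" % (blob_count(data_shape), data_bytes / 2.0**20))
print("conv1 weights: %d values" % conv1_params)
```

One batch of input data is far larger than the first layer's weights, which is worth keeping in mind when choosing a batch size for a GPU with limited memory.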
solver.cpp:
Uses the loss and its gradients to update the weights. Main functions:
Init(),
Solve(),
ComputeUpdateValue(),
Snapshot(), Restore(),  // snapshot (copy) and restore the network state
Test();

solver.cpp provides three solvers, i.e., three classes, to choose from: AdaGradSolver, SGDSolver, and NesterovSolver.
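The solvers differ in how a gradient is turned into a weight update. As a rough illustration, here are the textbook forms of SGD with momentum and of AdaGrad for a single scalar weight. This is a sketch, not Caffe's implementation: the function names and hyperparameter values are hypothetical, and the Nesterov variant is left out for brevity.

```python
import math


def sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update; returns (new_w, new_velocity)."""
    velocity = momentum * velocity + lr * grad
    return w - velocity, velocity


def adagrad_step(w, grad, hist, lr=0.1, eps=1e-8):
    """One AdaGrad update; hist accumulates the squared gradients."""
    hist = hist + grad * grad
    return w - lr * grad / (math.sqrt(hist) + eps), hist


# Minimize f(w) = w^2 (gradient 2w), starting from w = 1.0 with both rules.
w_sgd, v = 1.0, 0.0
w_ada, h = 1.0, 0.0
for _ in range(100):
    w_sgd, v = sgd_momentum_step(w_sgd, 2 * w_sgd, v)
    w_ada, h = adagrad_step(w_ada, 2 * w_ada, h)

print("SGD+momentum: %.4f   AdaGrad: %.4f" % (w_sgd, w_ada))
```

AdaGrad divides by the accumulated gradient magnitude, so its steps shrink over time, whereas momentum keeps a running velocity and can overshoot before settling.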

Regarding the loss: a net can have several losses at the same time, and regularization (L1/L2) can be added;
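A sketch of how several losses and a regularizer can be combined into one objective. It uses the textbook form (weighted sum of losses plus a decay coefficient times an L1 or L2 penalty); the function name, the loss weights, and all numeric values are hypothetical and not taken from Caffe's code.

```python
def total_objective(losses, loss_weights, weights, decay, reg="L2"):
    """Weighted sum of losses plus an L1 or L2 penalty on the weights."""
    data_term = sum(lw * l for lw, l in zip(loss_weights, losses))
    if reg == "L2":
        penalty = 0.5 * sum(w * w for w in weights)  # 0.5 * ||w||^2
    elif reg == "L1":
        penalty = sum(abs(w) for w in weights)       # ||w||_1
    else:
        raise ValueError("reg must be 'L1' or 'L2'")
    return data_term + decay * penalty


weights = [1.0, -2.0]
obj_l2 = total_objective([0.5, 0.25], [1.0, 2.0], weights, decay=0.1, reg="L2")
obj_l1 = total_objective([0.5, 0.25], [1.0, 2.0], weights, decay=0.1, reg="L1")
print(obj_l2, obj_l1)
```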

Protocol buffer:

As mentioned above, protocol buffer message types are defined in .proto files, and message values are defined in .prototxt or .binaryproto files;

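Caffe
All of Caffe's messages are defined in $CAFFE/src/caffe/proto/caffe.proto.
Experiment
Experiments mainly use two protocol buffers, one for the solver and one for the model. They define the solver parameters (learning rate and so on) and the model structure (the network structure), respectively.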
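As an illustration of the solver side, the sketch below renders a small solver definition in prototxt text format from Python. The field names are ones I believe exist in the SolverParameter message of caffe.proto, but they should be checked against the caffe.proto in your own checkout; the values, the file names, and the helpers to_prototxt and EnumValue are hypothetical.

```python
class EnumValue:
    """Marks a value that must be written unquoted, e.g. solver_mode: GPU."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


def to_prototxt(fields):
    """Render (name, value) pairs as 'name: value' lines in prototxt text format."""
    lines = []
    for name, value in fields:
        if isinstance(value, bool):
            text = "true" if value else "false"
        elif isinstance(value, str):
            text = '"%s"' % value  # strings are quoted
        else:
            text = str(value)      # numbers and enum values are written as-is
        lines.append("%s: %s" % (name, text))
    return "\n".join(lines) + "\n"


solver_fields = [
    ("net", "train_val.prototxt"),  # placeholder path to the model definition
    ("base_lr", 0.01),
    ("lr_policy", "step"),
    ("gamma", 0.1),
    ("stepsize", 100000),
    ("momentum", 0.9),
    ("weight_decay", 0.0005),
    ("max_iter", 450000),
    ("snapshot", 10000),
    ("snapshot_prefix", "snapshots/model"),
    ("solver_mode", EnumValue("GPU")),
    ("debug_info", False),
]
solver_text = to_prototxt(solver_fields)
print(solver_text)
```

Writing solver_text to a file such as solver.prototxt gives something that can be passed to the training command; the model definition referenced by net is a separate prototxt file.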

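Tips:
- To freeze a layer so that it is not trained: set its blobs_lr=0
- For images, avoid reading data through HDF5Layer where possible (it can only store float32 and float64, not uint8, so it wastes a lot of space)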
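The arithmetic behind the second tip, as a quick sketch. The dataset size (one million images) and the 3 x 224 x 224 image shape are assumptions chosen for illustration; uint8 takes 1 byte per value and float32 takes 4.

```python
def dataset_bytes(num_images, channels, height, width, bytes_per_value):
    """Raw (uncompressed) storage needed for an image dataset."""
    return num_images * channels * height * width * bytes_per_value


NUM_IMAGES = 1000000  # hypothetical dataset size
as_uint8 = dataset_bytes(NUM_IMAGES, 3, 224, 224, 1)
as_float32 = dataset_bytes(NUM_IMAGES, 3, 224, 224, 4)

print("uint8:   %.1f GB" % (as_uint8 / 1e9))
print("float32: %.1f GB" % (as_float32 / 1e9))
```

Storing pixel data as float32 takes four times the space of uint8, and float64 takes eight times.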

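Basic training workflow:

Data processing
Option 1: convert the data to a format Caffe accepts: lmdb, leveldb, hdf5 / .mat, list of images, etc. Option 2: write your own data-reading layer (e.g., https://github.com/tnarihi/tnarihi-caffe-helper/blob/master/python/caffe_helper/layers/data_layers.py)
Define the network structure
Configure the solver parameters
Train, e.g.: caffe train -solver solver.prototxt -gpu 0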
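A sketch of two small pieces of this workflow in Python: writing a "list of images" file and building the training command. The list format of one "path label" pair per line is, as far as I know, what Caffe's image-list tooling expects; the image paths, labels, and helper names are hypothetical. The block only builds strings and does not run Caffe.

```python
def format_image_list(samples):
    """Turn (image_path, integer_label) pairs into list-file text."""
    return "".join("%s %d\n" % (path, label) for path, label in samples)


def train_command(solver_path, gpu_id=None):
    """Argument list for the 'caffe train' call shown in the text."""
    cmd = ["caffe", "train", "-solver", solver_path]
    if gpu_id is not None:
        cmd += ["-gpu", str(gpu_id)]
    return cmd


list_text = format_image_list([("cat/001.jpg", 0), ("dog/001.jpg", 1)])
cmd = train_command("solver.prototxt", gpu_id=0)
print(list_text)
print(" ".join(cmd))

# To actually launch training, cmd could be passed to subprocess.check_call(cmd),
# assuming the caffe binary is on PATH.
```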

Training in Python:
Document & Examples: https://github.com/BVLC/caffe/pull/1733

Core code:

$CAFFE/python/caffe/_caffe.cpp
Defines the Blob, Layer, Net, and Solver classes
$CAFFE/python/caffe/pycaffe.py
Enhancements to the Net class

Debug:

In Makefile.config, set DEBUG := 1
In solver.prototxt, set debug_info: true
In Python/Matlab, inspect how the weights change after one round of forward & backward
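For the last item, here is a sketch of measuring how much each layer's weights move during one solver iteration. It assumes a pycaffe solver object that exposes solver.step(n) and solver.net.params (layer name mapped to a list of blobs whose .data is a NumPy array), which is how I understand the Python interface; this may differ between Caffe versions. The helper names are hypothetical, and the commented usage lines need a working pycaffe build.

```python
def snapshot_weights(net):
    """Copy the first parameter blob (the weights) of every layer into plain lists."""
    return {name: [float(x) for x in blobs[0].data.flat]
            for name, blobs in net.params.items()}


def max_weight_change(solver, steps=1):
    """Run `steps` solver iterations; return the largest absolute weight change per layer."""
    before = snapshot_weights(solver.net)
    solver.step(steps)
    after = snapshot_weights(solver.net)
    return {name: max(abs(a - b) for a, b in zip(after[name], before[name]))
            for name in before}


# Usage with pycaffe (not executed here; requires a compiled pycaffe):
#   import caffe
#   solver = caffe.SGDSolver("solver.prototxt")
#   for name, delta in max_weight_change(solver).items():
#       print(name, delta)
```

A layer that reports a change of exactly 0 is either frozen (blobs_lr=0) or receiving no gradient, and the second case usually deserves a closer look.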

Classic references:
[ DeCAF ] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. Decaf: A deep convolutional activation feature for generic visual recognition. ICML, 2014.
[ R-CNN ] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR, 2014.
[ Zeiler-Fergus Visualizing ] M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV, 2014.
[ LeNet ] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. IEEE, 1998.
[ AlexNet ] A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. NIPS, 2012.
[ OverFeat ] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. Overfeat: Integrated recognition, localization and detection using convolutional networks. ICLR, 2014.
[ Image-Style (Transfer learning) ] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, H. Winnemoeller. Recognizing Image Style. BMVC, 2014.
[ Karpathy14 ] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. CVPR, 2014.
[ Sutskever13 ] I. Sutskever. Training Recurrent Neural Networks. PhD thesis, University of Toronto, 2013.
[ Chopra05 ] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.
