Summary of Caffe parameter-tuning tips

If the images fail to load, download the .doc version of this article directly from Baidu Cloud:

Link: https://pan.baidu.com/s/1mhCxQsg  Password: igfo



In general:

0. Data: http://machinelearningmastery.com/improve-deep-learning-performance/


(Feature selection is a discipline in its own right.)

 

 

① The learning rate is important; a smaller value is safer, but training will take longer.


② Make the batch size as large as your machine allows (set it to a power of 2).

③ Inspect the accuracy charts and extract information from them:


④ Sec. 8: Ensemble (how to combine multiple network architectures; see Zhou Zhihua's book "Machine Learning" for details)


⑤ General principles of fine-tuning


⑥ https://research.fb.com/wp-content/uploads/2017/06/imagenet1kin1h5.pdf (distributed GPU training):

Linear Scaling Rule: When the minibatch size is multiplied by k, multiply the learning rate by k. (All other hyper-parameters (weight decay, momentum, etc.) are kept unchanged.)
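A quick numeric sketch of the rule (the batch sizes and base rate below are illustrative, not taken from the paper):

base_lr = 0.1              # learning rate tuned for a 256-image minibatch
base_batch = 256
new_batch = 1024           # e.g. the same job spread across 4x the GPUs
k = new_batch // base_batch
scaled_lr = base_lr * k    # Linear Scaling Rule: 0.1 * 4 = 0.4
# momentum, weight decay, etc. are left exactly as they were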

 

 

 

The details are as follows:

I. Tips for tuning Caffe structures and parameters

 

0. During data preparation:

① Cast the data to float32 and shuffle it (imports added so the snippet runs on its own):

import numpy as np
from sklearn.utils import shuffle  # shuffles X and y together, keeping pairs aligned

X = X.astype(np.float32)
y = y.astype(np.float32)
X, y = shuffle(X, y, random_state=42)  # shuffle train data

 

② Normalization and similar preprocessing:
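A minimal sketch of per-feature standardization (one common reading of "normalization"; the placeholder array stands in for the float32 training data from the snippet above):

import numpy as np

X = np.random.rand(100, 9216).astype(np.float32)  # placeholder data
mean = X.mean(axis=0)          # per-feature mean over the training set
std = X.std(axis=0) + 1e-8     # epsilon guards against zero-variance features
X = (X - mean) / std           # now zero mean, unit variance per feature

For images, simply scaling pixels to [0, 1] (X /= 255.0), or subtracting a mean image as Caffe's data layers do via transform_param's mean_file, are common alternatives.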

 

 

1. http://blog.csdn.net/u011762313/article/details/47399981

During Solver initialization, Caffe provides three Solver methods: Stochastic Gradient Descent (SGD), Adaptive Gradient (ADAGRAD), and Nesterov's Accelerated Gradient (NESTEROV).

 

A good default parameter configuration for SGD:

base_lr: 0.01     # initial learning rate: α = 0.01
lr_policy: "step" # learning-rate policy: every stepsize iterations, multiply α by gamma
gamma: 0.1        # learning-rate decay factor
stepsize: 100000  # drop the learning rate every 100K iterations
max_iter: 350000  # maximum number of training iterations
momentum: 0.9     # momentum: μ = 0.9
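Under the "step" policy, the effective rate is base_lr * gamma^floor(iter / stepsize); a tiny sketch of how it evolves with the settings above:

base_lr, gamma, stepsize = 0.01, 0.1, 100000
for it in (0, 100000, 200000, 300000):
    print(it, base_lr * gamma ** (it // stepsize))
# prints 0.01, 0.001, 0.0001, 1e-05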

 

The other two methods also give good results.

 

 

2. https://corpocrat.com/2015/02/24/facial-keypoints-extraction-using-deep-learning-with-caffe/

① We specify ReLU (to allow values > 0, plus faster convergence) and a Dropout layer to prevent overfitting.

 

(http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html):

In conclusion, all three ReLU variants consistently outperform the original ReLU on these three data sets, and PReLU and RReLU seem to be the better choices. Moreover, He et al. also reported similar conclusions in [4].

② Add the following to the fully connected layers (the xavier filler, by default, initializes the weight Blob x from a uniform distribution x ∼ U(−a, +a); in Caffe, a = sqrt(3 / fan_in)):

weight_filler {
  type: "xavier"
}
bias_filler {
  type: "constant"
  value: 0.1
}

 

layer {
  name: "relu22"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
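The Dropout layer mentioned in ① can be attached right after such a ReLU; a minimal pycaffe NetSpec sketch (layer names, shapes, and the 0.5 ratio are illustrative):

import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data = L.Input(shape=dict(dim=[1, 1, 96, 96]))
n.fc6 = L.InnerProduct(n.data, num_output=500,
                       weight_filler=dict(type='xavier'),
                       bias_filler=dict(type='constant', value=0.1))
n.relu6 = L.ReLU(n.fc6, in_place=True)
n.drop6 = L.Dropout(n.fc6, dropout_ratio=0.5, in_place=True)  # zero 50% of activations at train time
print(n.to_proto())  # emits the equivalent prototxt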

How each layer's parameters are initialized: http://blog.csdn.net/wenlin33/article/details/53378613

 

3. Inspect charts and other results:

① During training, it is best to visualize some feature maps.

② Save the training/testing logs and turn them into charts; judge how well the parameters work from the accuracy the charts show!
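Caffe ships log-parsing helpers in tools/extra/ (parse_log.py, plot_training_log.py.example); alternatively, a minimal hand-rolled sketch that pulls test accuracy out of a raw log and plots it (the file name is a placeholder; the regexes target Caffe's standard log lines):

import re
import matplotlib.pyplot as plt

iters, accs = [], []
cur_iter = 0
for line in open('caffe_train.log'):  # placeholder path to the captured log
    m = re.search(r'Iteration (\d+), Testing net', line)
    if m:
        cur_iter = int(m.group(1))
    m = re.search(r'Test net output #\d+: accuracy = ([\d.]+)', line)
    if m:
        iters.append(cur_iter)
        accs.append(float(m.group(1)))

plt.plot(iters, accs)
plt.xlabel('iteration')
plt.ylabel('test accuracy')
plt.show()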

 

 

4. Small tricks: http://danielnouri.org/notes/2014/12/17/using-convolutional-neural-nets-to-detect-facial-keypoints-tutorial/

 

① (An overfitting net can generally be made to perform better by using more training data.)

Ways to get more data: augmentation tricks such as image rotation and flipping (see the sketch after this list).


② Speed up network training: remember that in our previous model, we initialized learning rate and momentum with a static 0.01 and 0.9 respectively. Let's change that such that the learning rate decreases linearly with the number of epochs, while we let the momentum increase (see the schedule sketch after this list).

③ Load the weights of a pre-trained model to speed up the current training.

④ Put the BatchNorm layer immediately after fully connected layers (or convolutional layers), and before the activation (see the ordering sketch at the end of this list).
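For tip ①, a minimal augmentation sketch (pure numpy; assumes images is an array of shape (N, H, W)):

import numpy as np

def augment(images):
    # Return the originals plus horizontally flipped and 90-degree-rotated copies.
    flipped = images[:, :, ::-1]                  # horizontal flip
    rotated = np.rot90(images, k=1, axes=(1, 2))  # rotate each image by 90 degrees
    return np.concatenate([images, flipped, rotated], axis=0)

Note that for keypoint targets (as in the tutorial above), flipping the image also requires flipping and re-pairing the target coordinates.

For tip ②, the tutorial implements the schedule with a small AdjustVariable callback; the idea itself is just two linear ramps (the end-point values follow the tutorial's example):

epochs = 1000
lrs = np.linspace(0.03, 0.0001, epochs)    # learning rate decays linearly
momenta = np.linspace(0.9, 0.999, epochs)  # momentum rises toward 1

And for tip ④, a pycaffe sketch of the Conv → BatchNorm → Scale → ReLU ordering (in Caffe, BatchNorm is customarily paired with a Scale layer that learns the affine parameters; names and shapes are illustrative):

import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data = L.Input(shape=dict(dim=[1, 3, 224, 224]))
n.conv1 = L.Convolution(n.data, num_output=64, kernel_size=3, pad=1)
n.bn1 = L.BatchNorm(n.conv1, in_place=True)               # normalize the pre-activations
n.scale1 = L.Scale(n.bn1, bias_term=True, in_place=True)  # learnable scale and shift
n.relu1 = L.ReLU(n.scale1, in_place=True)                 # activation comes after BN
print(n.to_proto())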

 

 

 

II. Fine-tuning some deep networks

① http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html

② https://zhuanlan.zhihu.com/p/22624331

This approach is especially suitable when our data is relatively scarce. It lets us exploit the strong generalization ability of deep neural networks while avoiding the need to design a complex model and run a long, costly training. The most powerful model at present is ResNet; on many vision tasks, fine-tuning ResNet yields very good performance!

③ General principles of fine-tuning:
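A minimal pycaffe sketch of the workflow (file names are placeholders): the new net definition keeps most layer names from the pre-trained net, so their weights are copied over, while any renamed layer (e.g. a new final classifier) is re-initialized:

import caffe

caffe.set_mode_gpu()
solver = caffe.SGDSolver('solver.prototxt')    # solver pointing at the new net definition
solver.net.copy_from('pretrained.caffemodel')  # copy weights for layers whose names match
solver.solve()                                 # fine-tune, typically with a reduced base_lr

The command-line equivalent is caffe train -solver solver.prototxt -weights pretrained.caffemodel, which is what the finetune_flickr_style example in ① does.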

 

 

 

Good blogs:

http://machinelearningmastery.com/improve-deep-learning-performance/

http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html

https://github.com/hwdong/deep-learning/blob/master/deep%20learning%20papers.md 

 

 

 

