caffe常見優化器使用參數

caffe中solver不同優化器的一些使用方法(只記錄一些常用的)

下面是一些公用的參數

測試時需要前向傳播的次數,比如你有1000個數據,批處理大小爲10,那麼這個值就應該是100,這樣才能夠將所有的數據覆蓋
test_iter: 100
每多少次迭代進行一次測試.
test_interval: 500

weight_decay防止過擬合的參數,使用方式:
1 樣本越多,該值越小
2 模型參數越多,該值越大
weight_decay: 0.0005

rmsprop:
net: "examples/mnist/lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
#The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.0
weight_decay: 0.0005
#The learning rate policy
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet_rmsprop"
solver_mode: GPU
type: "RMSProp"
rms_decay: 0.98


Adam:
net: "examples/mnist/lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
#All parameters are from the cited paper above
base_lr: 0.001
momentum: 0.9
momentum2: 0.999
#since Adam dynamically changes the learning rate, we set the base learning
#rate to a fixed value
lr_policy: "fixed"
display: 100
#The maximum number of iterations
max_iter: 10000
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet"
type: "Adam"
solver_mode: GPU

multistep:
net: "examples/mnist/lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
#The base learning rate, momentum and the weight decay of the network.
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
#The learning rate policy
lr_policy: "multistep"
gamma: 0.9
stepvalue: 5000
stepvalue: 7000
stepvalue: 8000
stepvalue: 9000
stepvalue: 9500
# Display every 100 iterations
display: 100
#The maximum number of iterations
max_iter: 10000
#snapshot intermediate results
snapshot: 5000
snapshot_prefix: "examples/mnist/lenet_multistep"
#solver mode: CPU or GPU
solver_mode: GPU


發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章