I. From Solver to Net
The SGDSolver constructor mainly runs its parent class Solver's constructor, which then executes Solver::Init(). Inside Init(), two functions are worth noting: InitTrainNet() and InitTestNets(), which initialize the training network and the test networks respectively.
(1). InitTrainNet
First, ReadNetParamsFromTextFileOrDie(param_.net(), &net_param) reads the network definition from param_.net() (here examples/mnist/lenet_train_test.prototxt) into net_param.
Next, net_.reset(new Net<Dtype>(net_param)) constructs the network by calling the Net constructor, which in turn executes Net::Init(), where the network is actually built. The main code is as follows:
```cpp
template <typename Dtype>
void Net<Dtype>::Init(const NetParameter& in_param) {
  ...
  for (int layer_id = 0; layer_id < param.layer_size(); ++layer_id) {
    // Setup layer.
    const LayerParameter& layer_param = param.layer(layer_id);
    // The layer object is created here
    layers_.push_back(LayerRegistry<Dtype>::CreateLayer(layer_param));
    // Figure out this layer's input and output
    for (int bottom_id = 0; bottom_id < layer_param.bottom_size(); ++bottom_id) {
      const int blob_id = AppendBottom(param, layer_id, bottom_id,
                                       &available_blobs, &blob_name_to_idx);
      // If a blob needs backward, this layer should provide it.
      need_backward |= blob_need_backward_[blob_id];
    }
    int num_top = layer_param.top_size();
    for (int top_id = 0; top_id < num_top; ++top_id) {
      AppendTop(param, layer_id, top_id, &available_blobs, &blob_name_to_idx);
    }
    ...
    // The layer is configured here
    layers_[layer_id]->SetUp(bottom_vecs_[layer_id], top_vecs_[layer_id]);
    ...
    for (int param_id = 0; param_id < num_param_blobs; ++param_id) {
      AppendParam(param, layer_id, param_id);
    }
    ...
  }
  ...
}
```
Notes:
- LeNet-5 has 9 layers in caffe, i.e. param.layer_size() == 9; each iteration of the for loop above creates one layer
- Each layer is created through LayerRegistry<Dtype>::CreateLayer(), similar to how the Solver is created
- Net::AppendBottom(): for layer layer_id, takes the corresponding blobs out of Net::blobs_ and puts them into that layer's bottom_vecs_[layer_id]
- Net::AppendTop(): for layer layer_id, creates new blobs (holding no data yet) and adds them to Net::blobs_; a toy sketch of this bookkeeping follows the list
- AppendParam() binds each layer's trainable parameters to the network-level variable learnable_params_. In LeNet only the four layers conv1, conv2, ip1 and ip2 have parameters, each with a weight blob and a bias blob, so the size of learnable_params_ is 8.
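To make this bookkeeping concrete, here is a minimal, self-contained toy sketch (illustration only, not Caffe's actual code; the names blobs, blob_name_to_idx, bottom_vecs and top_vecs simply mirror the members discussed above):

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for Net's blob bookkeeping.
struct Blob { std::string name; };

int main() {
  std::vector<Blob> blobs;                       // plays the role of Net::blobs_
  std::map<std::string, int> blob_name_to_idx;   // blob name -> index in blobs
  std::vector<std::vector<int>> bottom_vecs(2);  // per-layer bottom blob ids
  std::vector<std::vector<int>> top_vecs(2);     // per-layer top blob ids

  // "AppendTop" for layer 0 (the data layer): create a new blob named "data".
  blobs.push_back({"data"});
  blob_name_to_idx["data"] = 0;
  top_vecs[0].push_back(0);

  // "AppendBottom" for layer 1 (conv1): look up the existing blob by name.
  int blob_id = blob_name_to_idx.at("data");
  bottom_vecs[1].push_back(blob_id);

  // "AppendTop" for layer 1: create the "conv1" output blob.
  blobs.push_back({"conv1"});
  blob_name_to_idx["conv1"] = 1;
  top_vecs[1].push_back(1);

  std::cout << "layer 1 bottom: " << blobs[bottom_vecs[1][0]].name
            << ", top: " << blobs[top_vecs[1][0]].name << "\n";
}
```

The key point is that a top blob created by one layer is found by name when a later layer lists it as a bottom, which is how the prototxt's bottom/top strings wire the layers together.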
(2). LayerRegistry::CreateLayer
The layer object is new-ed through the factory pattern; a sketch of the pattern follows.
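A minimal, self-contained sketch of such a registry-based factory (a simplification for illustration; Caffe's real LayerRegistry similarly maps the layer type string to a creator function, registered through the REGISTER_LAYER_CLASS / REGISTER_LAYER_CREATOR macros):

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct Layer {                       // toy base class
  virtual ~Layer() = default;
  virtual std::string type() const = 0;
};
struct ConvolutionLayer : Layer {
  std::string type() const override { return "Convolution"; }
};

// Registry: layer type string -> creator function.
using Creator = std::function<std::unique_ptr<Layer>()>;
std::map<std::string, Creator>& Registry() {
  static std::map<std::string, Creator> r;
  return r;
}

std::unique_ptr<Layer> CreateLayer(const std::string& type) {
  return Registry().at(type)();      // throws if the type was never registered
}

int main() {
  // In Caffe this registration is done once per layer type by macros.
  Registry()["Convolution"] = [] { return std::make_unique<ConvolutionLayer>(); };
  auto layer = CreateLayer("Convolution");
  std::cout << layer->type() << "\n";  // prints "Convolution"
}
```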
(3). Layer::SetUp
```cpp
void SetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  InitMutex();
  CheckBlobCounts(bottom, top);
  // Per-layer configuration
  LayerSetUp(bottom, top);
  // Adjusts the dimensions of the output data (i.e. the top blobs);
  // look here if you care about how blob shapes are derived
  Reshape(bottom, top);
  // Set the loss weights
  SetLossWeights(top);
}
```
Inside Reshape, the shape of the output blob is computed by compute_output_shape() (the concrete formula for convolution appears in Section IV).
II. The Training Network Structure
No. | Layer | Layer Type | Bottom Blob | Top Blob | Top Blob Shape
---|---|---|---|---|---
1 | mnist | Data | | data && label | 64 1 28 28 (50176) && 64 (64)
2 | conv1 | Convolution | data | conv1 | 64 20 24 24 (737280)
3 | pool1 | Pooling | conv1 | pool1 | 64 20 12 12 (184320)
4 | conv2 | Convolution | pool1 | conv2 | 64 50 8 8 (204800)
5 | pool2 | Pooling | conv2 | pool2 | 64 50 4 4 (51200)
6 | ip1 | InnerProduct | pool2 | ip1 | 64 500 (32000)
7 | relu1 | ReLU | ip1 | ip1 (in-place) | 64 500 (32000)
8 | ip2 | InnerProduct | ip1 | ip2 | 64 10 (640)
9 | loss | SoftmaxWithLoss | ip2 && label | loss | (1)

Note: the Top Blob Shape format is BatchSize, ChannelSize, Height, Width (Total Count).
The network structure is shown in the figure below:
III. Layer 1: Data Layer
(1). Prototxt definition
The first layer of the training network is defined in prototxt as:
```protobuf
layer {
name: "mnist"
type: "Data"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
scale: 0.00390625
}
data_param {
source: "examples/mnist/mnist_train_lmdb"
batch_size: 64
backend: LMDB
}
}
```
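Note that scale: 0.00390625 is exactly 1/256, so the raw pixel values in [0, 255] are mapped into [0, 1).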
(2). The function LayerRegistry::CreateLayer
The code in Section I creates the DataLayer object for the first time by calling LayerRegistry<Dtype>::CreateLayer(). The DataLayer() constructor runs its base-class constructors in order: Layer(), BaseDataLayer(), InternalThread(), BasePrefetchingDataLayer(), and finally DataLayer() itself.
Notably, after the base-class constructor BasePrefetchingDataLayer() has run, DataLayer() initializes the member DataReader reader_. The DataLayer object thus maintains a DataReader whose job is to submit read tasks to a thread dedicated to reading the database (examples/mnist/mnist_train_lmdb), creating that thread if it does not exist yet. At this point a total of 4*64 samples have been fetched into BlockingQueue<Datum*> DataReader::QueuePair::full_.
```cpp
template <typename Dtype>
DataLayer<Dtype>::DataLayer(const LayerParameter& param)
: BasePrefetchingDataLayer<Dtype>(param),
reader_(param) {
}
```
(3). The function Layer::SetUp
In program-execution order, the points worth attention are:
- In DataLayer::DataLayerSetUp, one sample is taken from the data read by the DataReader (introduced above) to infer the blob shapes
- In BasePrefetchingDataLayer::LayerSetUp, the following code calls prefetch_[i].data_.mutable_cpu_data(), which touches on the issue of copying data between GPU and CPU:
```cpp
// Before starting the prefetch thread, we make cpu_data and gpu_data
// calls so that the prefetch thread does not accidentally make simultaneous
// cudaMalloc calls when the main thread is running. In some GPUs this
// seems to cause failures if we do not so.
for (int i = 0; i < PREFETCH_COUNT; ++i) {
prefetch_[i].data_.mutable_cpu_data();
if (this->output_labels_) {
prefetch_[i].label_.mutable_cpu_data();
}
}
```
- BasePrefetchingDataLayer inherits from InternalThread. BasePrefetchingDataLayer<Dtype>::LayerSetUp starts a new thread via StartInternalThread(), which then executes BasePrefetchingDataLayer::InternalThreadEntry
- BasePrefetchingDataLayer::InternalThreadEntry: the key code is shown below. load_batch(batch) reads one batch_size worth of data from BlockingQueue<Datum*> DataReader::QueuePair::full_ (which holds the data read from the database) into BlockingQueue<Batch<Dtype>*> BasePrefetchingDataLayer::prefetch_full_. The thread suspends and waits whenever prefetch_free_ is empty (PREFETCH_COUNT = 3), and Batches consumed from prefetch_full_ are put back into prefetch_free_. When does this thread stop? (See the note after the code.)
```cpp
while (!must_stop()) {
Batch<Dtype>* batch = prefetch_free_.pop();
load_batch(batch);
#ifndef CPU_ONLY
if (Caffe::mode() == Caffe::GPU) {
batch->data_.data().get()->async_gpu_push(stream);
CUDA_CHECK(cudaStreamSynchronize(stream));
}
#endif
prefetch_full_.push(batch);
}
```
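To answer that question: the loop exits once must_stop() returns true, which happens when another thread calls StopInternalThread() (the InternalThread destructor does this), interrupting the underlying boost thread. Paraphrasing src/caffe/internal_thread.cpp (simplified; check the actual source for details):

```cpp
// must_stop() reports whether StopInternalThread() has requested
// interruption of the underlying boost::thread; the blocking pop()
// on the queues is also a boost interruption point.
bool InternalThread::must_stop() {
  return thread_ && thread_->interruption_requested();
}
```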
A summary of the threads:
- Two threads are involved here in total, both inheriting from InternalThread: the BasePrefetchingDataLayer (DataLayer) class and the Body class inside DataReader
- Body is the database-facing thread: it keeps reading data out of a database into its cache, the queue BlockingQueue<Datum*> in DataReader::QueuePair, which normally holds 4*64 units of data, the unit being a Datum
- BasePrefetchingDataLayer is the network-facing thread: it keeps reading data out of Body's cache. Its own cache is the queue BlockingQueue<Batch*>, which normally holds 3 units of data, the unit being a Batch:
```cpp
static const int PREFETCH_COUNT = 3;
Batch<Dtype> prefetch_[PREFETCH_COUNT];
BlockingQueue<Batch<Dtype>*> prefetch_free_;
BlockingQueue<Batch<Dtype>*> prefetch_full_;
template <typename Dtype>
BasePrefetchingDataLayer<Dtype>::BasePrefetchingDataLayer(
const LayerParameter& param)
: BaseDataLayer<Dtype>(param),
prefetch_free_(), prefetch_full_() {
for (int i = 0; i < PREFETCH_COUNT; ++i) {
prefetch_free_.push(&prefetch_[i]);
}
}
```
The elements in prefetch_full_ and prefetch_free_ both come from (point into) prefetch_. A toy sketch of this recycling pattern follows.
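A minimal, self-contained sketch of the free/full double-queue recycling pattern (toy code for illustration, not Caffe's BlockingQueue; a std::queue guarded by a mutex and condition variable stands in for it):

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

template <typename T>
class BlockingQueue {            // toy blocking queue
 public:
  void push(T t) {
    { std::lock_guard<std::mutex> lk(m_); q_.push(t); }
    cv_.notify_one();
  }
  T pop() {                      // blocks while the queue is empty
    std::unique_lock<std::mutex> lk(m_);
    cv_.wait(lk, [&] { return !q_.empty(); });
    T t = q_.front(); q_.pop(); return t;
  }
 private:
  std::queue<T> q_; std::mutex m_; std::condition_variable cv_;
};

int main() {
  int batches[3];                          // stands in for prefetch_[PREFETCH_COUNT]
  BlockingQueue<int*> free_q, full_q;      // prefetch_free_ / prefetch_full_
  for (int& b : batches) free_q.push(&b);  // all batches start out free

  std::thread producer([&] {               // stands in for InternalThreadEntry
    for (int i = 0; i < 10; ++i) {
      int* b = free_q.pop();               // waits if no batch is free (throttles I/O)
      *b = i;                              // "load_batch"
      full_q.push(b);
    }
  });
  for (int i = 0; i < 10; ++i) {           // consumer: the data layer's Forward()
    int* b = full_q.pop();
    std::cout << "consumed batch " << *b << "\n";
    free_q.push(b);                        // recycle the batch
  }
  producer.join();
}
```

Because only three Batch objects circulate between the two queues, the prefetch thread can never run more than PREFETCH_COUNT batches ahead of the consumer.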
IV. Layer 2: Convolution Layer
(1). Prototxt definition
```protobuf
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
```
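The two param blocks correspond to the layer's two parameter blobs counted in learnable_params_ in Section I: the weight blob (20*1*5*5 = 500 weights, learning-rate multiplier lr_mult: 1) and the bias blob (20 values, lr_mult: 2).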
Unlike DataLayer, whose constructor is invoked directly, what gets executed here is GetConvolutionLayer(), which then calls the ConvolutionLayer constructor.
(2). LayerSetUp
In Layer::SetUp, the LayerSetUp and Reshape functions of ConvolutionLayer's base class BaseConvolutionLayer are called. The main member variables of that class are as follows:
```cpp
/**
* @brief Abstract base class that factors out the BLAS code common to
* ConvolutionLayer and DeconvolutionLayer.
*/
template <typename Dtype>
class BaseConvolutionLayer : public Layer<Dtype> {
public:
explicit BaseConvolutionLayer(const LayerParameter& param)
: Layer<Dtype>(param) {}
virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top);
virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top);
...
/// @brief The spatial dimensions of a filter kernel.
Blob<int> kernel_shape_;
/// @brief The spatial dimensions of the stride.
Blob<int> stride_;
/// @brief The spatial dimensions of the padding.
Blob<int> pad_;
/// @brief The spatial dimensions of the dilation.
Blob<int> dilation_;
/// @brief The spatial dimensions of the convolution input.
Blob<int> conv_input_shape_;
/// @brief The spatial dimensions of the col_buffer.
vector<int> col_buffer_shape_;
/// @brief The spatial dimensions of the output.
vector<int> output_shape_;
const vector<int>* bottom_shape_;
...
};
```
Notes:
- In LayerSetUp, kernel_shape_, stride_, pad_ and dilation_ are initialized, and the layer's learnable parameters are initialized and stored in Layer::blobs_
- In Reshape, conv_input_shape_, bottom_shape_ and the other shape-related members are set up, and the output spatial shape is derived by compute_output_shape(), as sketched below
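The spatial output size follows the standard convolution formula used in compute_output_shape(); below is a self-contained check against the LeNet numbers from the table in Section II (the formula matches Caffe's; the main() is just for illustration):

```cpp
#include <iostream>

// Output spatial dimension of a convolution, as in
// ConvolutionLayer::compute_output_shape():
// out = (in + 2*pad - (dilation*(kernel-1) + 1)) / stride + 1
int conv_out(int in, int kernel, int pad = 0, int stride = 1, int dilation = 1) {
  return (in + 2 * pad - (dilation * (kernel - 1) + 1)) / stride + 1;
}

int main() {
  std::cout << conv_out(28, 5) << "\n";  // conv1: (28 - 5)/1 + 1 = 24
  std::cout << conv_out(12, 5) << "\n";  // conv2: (12 - 5)/1 + 1 = 8
}
```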
V. Layer 3: Pooling Layer
(1). Prototxt definition
```protobuf
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
```
(2). Layer::SetUp
The virtual functions LayerSetUp and Reshape initialize the following member variables:
```cpp
/**
* @brief Pools the input image by taking the max, average, etc. within regions.
*
* TODO(dox): thorough documentation for Forward, Backward, and proto params.
*/
template <typename Dtype>
class PoolingLayer : public Layer<Dtype> {
....
int kernel_h_, kernel_w_;
int stride_h_, stride_w_;
int pad_h_, pad_w_;
int channels_;
int height_, width_;
int pooled_height_, pooled_width_;
bool global_pooling_;
Blob<Dtype> rand_idx_;
Blob<int> max_idx_;
};
```
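The pooled size uses a ceiling division rather than convolution's floor; below is a self-contained check against pool1 and pool2 (formula as in PoolingLayer::Reshape; the main() is just for illustration):

```cpp
#include <cmath>
#include <iostream>

// Pooled spatial dimension, as in PoolingLayer::Reshape():
// out = ceil((in + 2*pad - kernel) / stride) + 1
int pool_out(int in, int kernel, int stride, int pad = 0) {
  return static_cast<int>(
      std::ceil(static_cast<float>(in + 2 * pad - kernel) / stride)) + 1;
}

int main() {
  std::cout << pool_out(24, 2, 2) << "\n";  // pool1: ceil((24-2)/2) + 1 = 12
  std::cout << pool_out(8, 2, 2) << "\n";   // pool2: ceil((8-2)/2) + 1 = 4
}
```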
VI. Layers 4 and 5
Essentially the same as Layers 2 and 3.
VII. Layer 6: InnerProduct Layer
(1). Prototxt definition
```protobuf
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool2"
top: "ip1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
}
}
}
```
(2). Layer::SetUp
```cpp
/**
* @brief Also known as a "fully-connected" layer, computes an inner product
* with a set of learned weights, and (optionally) adds biases.
*
* TODO(dox): thorough documentation for Forward, Backward, and proto params.
*/
template <typename Dtype>
class InnerProductLayer : public Layer<Dtype> {
...
int M_;
int K_;
int N_;
bool bias_term_;
Blob<Dtype> bias_multiplier_;
};
```
Notes:
- N_ is the output size, i.e. equal to num_output defined in the prototxt
- K_ is the input size: for this layer the bottom blob has shape (N, C, H, W), where N is batch_size; K_ = C*H*W and M_ = N. Only C, H and W take part in the inner product, as sketched below
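The forward pass then reduces to an (M_ x K_) by (K_ x N_) matrix product plus bias. A minimal, self-contained sketch using ip1's shapes (M_ = 64, K_ = 50*4*4 = 800, N_ = 500); Caffe itself does this with caffe_cpu_gemm (BLAS) rather than explicit loops:

```cpp
#include <iostream>
#include <vector>

int main() {
  const int M = 64, K = 50 * 4 * 4, N = 500;  // ip1: batch, input dim, num_output
  std::vector<float> bottom(M * K, 1.0f);     // flattened pool2 output (toy values)
  std::vector<float> weight(N * K, 0.01f);    // learned weights, stored N x K
  std::vector<float> bias(N, 0.0f);
  std::vector<float> top(M * N);

  // top = bottom * weight^T + bias (what caffe_cpu_gemm computes for this layer)
  for (int m = 0; m < M; ++m)
    for (int n = 0; n < N; ++n) {
      float acc = bias[n];
      for (int k = 0; k < K; ++k)
        acc += bottom[m * K + k] * weight[n * K + k];
      top[m * N + n] = acc;
    }
  std::cout << top[0] << "\n";  // 800 * 1 * 0.01 = 8 for these toy values
}
```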
VIII. SoftmaxWithLoss Layer
(1). Prototxt definition
```protobuf
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "ip2"
bottom: "label"
top: "loss"
}
```
(2). Layer::SetUp
Worth noting:
- The class SoftmaxWithLossLayer contains an instance of the class SoftmaxLayer: shared_ptr<Layer<Dtype> > softmax_layer_
- softmax_layer_ is assigned in LayerSetUp
- Inside this function, Layer::SetLossWeights is called to initialize this layer's top blob (loss)
- The member variable prob_ serves as the SoftmaxLayer's top blob
- bottom blob[0] serves as the SoftmaxLayer's bottom blob
- So after the SoftmaxLayer computation, a 64*10 result (each sample's probability for each class) is stored in prob_; a sketch of the computation follows the list
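A minimal, self-contained sketch of what the two layers jointly compute (toy code, not Caffe's implementation): a numerically stable row-wise softmax over the 10 ip2 scores per sample, then the averaged negative log-likelihood of the ground-truth labels:

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

int main() {
  const int batch = 64, classes = 10;
  std::vector<float> ip2(batch * classes, 0.0f);  // toy scores; all equal here
  std::vector<int> label(batch, 3);               // toy ground-truth labels
  std::vector<float> prob(batch * classes);       // plays the role of prob_

  float loss = 0.0f;
  for (int i = 0; i < batch; ++i) {
    // Softmax with the usual max subtraction for numerical stability.
    float mx = ip2[i * classes];
    for (int c = 1; c < classes; ++c) mx = std::max(mx, ip2[i * classes + c]);
    float sum = 0.0f;
    for (int c = 0; c < classes; ++c) {
      prob[i * classes + c] = std::exp(ip2[i * classes + c] - mx);
      sum += prob[i * classes + c];
    }
    for (int c = 0; c < classes; ++c) prob[i * classes + c] /= sum;
    // Negative log-likelihood of the true class.
    loss -= std::log(prob[i * classes + label[i]]);
  }
  loss /= batch;
  std::cout << "loss = " << loss << "\n";  // ~log(10) ~ 2.30 for uniform scores
}
```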
The relationship between the two classes is shown in the figure below: