These two classes are the cornerstones of the Caffe framework; as the names suggest, deep learning revolves around them. As before, let's look at the concrete implementation in the code.
1. Layer
The Layer class comes in five broad categories, each further subdivided by function, but all of them inherit from a single base class, Layer. The five categories are:
Data Layers
Common Layers
Activation / Neuron Layers
Loss Layers
Vision Layers
The main member variables and functions of the base Layer class are listed below; read them together with Caffe's English comments.
protected:
/** The protobuf that stores the layer parameters */
LayerParameter layer_param_;
/** The phase: TRAIN or TEST */
Phase phase_;
/** The vector that stores the learnable parameters as a set of blobs. */
vector<shared_ptr<Blob<Dtype> > > blobs_;  // blobs_[0] holds the weights, blobs_[1] the bias
/** Vector indicating whether to compute the diff of each param blob. */
vector<bool> param_propagate_down_;  // whether each param blob should be updated from the backward pass
/** The vector that indicates whether each top blob has a non-zero weight in
* the objective function. */
vector<Dtype> loss_;  // presumably set only for the final layers (softmax loss and the like): the loss weight associated with each top blob
/** Device context */
DeviceContext *device_context_;
/** @brief Using the CPU device, compute the layer output. */
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
/**
* @brief Using the GPU device, compute the layer output.
* Fall back to Forward_cpu() if unavailable.
*/
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
// LOG(WARNING) << "Using CPU code as backup.";
Forward_cpu(bottom, top);
}
/**
* @brief Using the CPU device, compute the gradients for any parameters and
* for the bottom blobs if propagate_down is true.
*/
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) = 0;
/**
* @brief Using the GPU device, compute the gradients for any parameters and
* for the bottom blobs if propagate_down is true.
* Fall back to Backward_cpu() if unavailable.
*/
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) {
// LOG(WARNING) << "Using CPU code as backup.";
Backward_cpu(top, propagate_down, bottom);
}
/**
* Called by the parent Layer's SetUp to check that the number of bottom
* and top Blobs provided as input match the expected numbers specified by
* the {ExactNum,Min,Max}{Bottom,Top}Blobs() functions.
*/
virtual void CheckBlobCounts(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
  // just checks that the sizes of bottom and top are correct; the body is omitted here
}
/**
* Called by SetUp to initialize the weights associated with any top blobs in
* the loss function. Store non-zero loss weights in the diff blob.
*/
inline void SetLossWeights(const vector<Blob<Dtype>*>& top) {
const int num_loss_weights = layer_param_.loss_weight_size();
if (num_loss_weights) {
CHECK_EQ(top.size(), num_loss_weights) << "loss_weight must be "
"unspecified or specified once per top blob.";
for (int top_id = 0; top_id < top.size(); ++top_id) {
const Dtype loss_weight = layer_param_.loss_weight(top_id);
if (loss_weight == Dtype(0)) {continue;}
this->set_loss(top_id, loss_weight);
const int count = top[top_id]->count();
Dtype* loss_multiplier = top[top_id]->mutable_cpu_diff();
caffe_set(count, loss_weight, loss_multiplier);
}
}
}
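For context, here is a condensed sketch of how the public Forward() wrapper ties these pieces together: it dispatches to Forward_cpu()/Forward_gpu() according to Caffe::mode(), and accumulates the loss via the per-element loss weights that SetLossWeights() wrote into the top blobs' diffs. This is a simplified paraphrase of the Caffe source, not a verbatim copy:
template <typename Dtype>
inline Dtype Layer<Dtype>::Forward(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  Dtype loss = 0;
  Reshape(bottom, top);
  switch (Caffe::mode()) {
  case Caffe::CPU:
    Forward_cpu(bottom, top);
    // Each top blob with a non-zero loss weight contributes
    // dot(data, diff): the diff still holds the per-element loss
    // weight written by SetLossWeights().
    for (int top_id = 0; top_id < top.size(); ++top_id) {
      if (!this->loss(top_id)) { continue; }
      const int count = top[top_id]->count();
      loss += caffe_cpu_dot(count, top[top_id]->cpu_data(),
                            top[top_id]->cpu_diff());
    }
    break;
  case Caffe::GPU:
    Forward_gpu(bottom, top);
    // The GPU path accumulates the same dot products with caffe_gpu_dot().
    break;
  default:
    LOG(FATAL) << "Unknown caffe mode.";
  }
  return loss;
}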
The concrete implementation of SetUp:
/**
* @brief Implements common layer setup functionality.
*
* @param bottom the preshaped input blobs
* @param top
* the allocated but unshaped output blobs, to be shaped by Reshape
*
* Checks that the number of bottom and top blobs is correct.
* Calls LayerSetUp to do special layer setup for individual layer types,
* followed by Reshape to set up sizes of top blobs and internal buffers.
* Sets up the loss weight multiplier blobs for any non-zero loss weights.
* This method may not be overridden.
*/
void SetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  CheckBlobCounts(bottom, top);  // check the blob counts
  LayerSetUp(bottom, top);  // call the hook implemented by the subclass
  // A note on LayerSetUp, taking BaseConvolutionLayer as an example: it mainly
  // does two things. (1) It reads pad size, kernel size, etc. from layer_param_.
  // (2) If blobs_ is uninitialized (size() == 0), it fills blobs_ with a Filler
  //     (whose parameters also come from layer_param_).
  Reshape(bottom, top);  // shape the allocated top blobs from the already-shaped bottom blobs
  SetLossWeights(top);  // set loss weights (only for top blobs with a non-zero weight)
}
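To make the division of labor concrete, here is a minimal pass-through layer showing which hooks SetUp() invokes and in what order. The class name and its behavior (copying bottom to top) are hypothetical, made up purely for illustration:
// A hypothetical pass-through layer: SetUp() calls our LayerSetUp(),
// then our Reshape(), in that order.
template <typename Dtype>
class MyPassThroughLayer : public Layer<Dtype> {
 public:
  explicit MyPassThroughLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    // One-time setup: read settings from layer_param_, initialize blobs_
    // if the layer had learnable parameters (this one has none).
  }
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);  // top mirrors bottom's shape
  }
 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
    caffe_copy(bottom[0]->count(), bottom[0]->cpu_data(),
               top[0]->mutable_cpu_data());
  }
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down,
      const vector<Blob<Dtype>*>& bottom) {
    if (propagate_down[0]) {
      caffe_copy(top[0]->count(), top[0]->cpu_diff(),
                 bottom[0]->mutable_cpu_diff());
    }
  }
};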
2. Net
Basic members; the comments explain them:
/// @brief The network name
string name_;
/// @brief The phase: TRAIN or TEST
Phase phase_;
/// @brief Individual layers in the net
vector<shared_ptr<Layer<Dtype> > > layers_;
vector<string> layer_names_;
map<string, int> layer_names_index_;
vector<bool> layer_need_backward_;
/// @brief the blobs storing intermediate results between the layer.
vector<shared_ptr<Blob<Dtype> > > blobs_;
vector<string> blob_names_;
map<string, int> blob_names_index_;
vector<bool> blob_need_backward_;
/// bottom_vecs stores the vectors containing the input for each layer.
/// They don't actually host the blobs (blobs_ does), so we simply store
/// pointers.
vector<vector<Blob<Dtype>*> > bottom_vecs_;
vector<vector<int> > bottom_id_vecs_;
vector<vector<bool> > bottom_need_backward_;
/// top_vecs stores the vectors containing the output for each layer
vector<vector<Blob<Dtype>*> > top_vecs_;
vector<vector<int> > top_id_vecs_;
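As a quick illustration of how these containers are consumed, here is a condensed sketch (simplified from the Caffe source) of the forward pass: each layer i reads its inputs from bottom_vecs_[i] and writes its outputs to top_vecs_[i]:
template <typename Dtype>
Dtype Net<Dtype>::ForwardFromTo(int start, int end) {
  Dtype loss = 0;
  for (int i = start; i <= end; ++i) {
    // Layer i consumes bottom_vecs_[i] and fills top_vecs_[i];
    // Layer::Forward() returns the layer's weighted loss contribution.
    loss += layers_[i]->Forward(bottom_vecs_[i], top_vecs_[i]);
  }
  return loss;
}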
Beyond these members, the most important functions are the forward and backward passes (the forward side is sketched above), which need no further explanation here; the key is understanding Init().
Here is an excerpt describing that function, with the code itself omitted (taken from http://blog.csdn.net/u014114990/article/details/47415051):
Init(const NetParameter& in_param)
Function: initialize the network
Input: NetParameter& in_param
Output: none
Steps:
<1> Call InsertSplits() to read the network from in_param into a new param
<2> Define name_, blob_name_to_idx, available_blobs, num_layers
<3> param.input_size() returns the number of input blobs;
    param.input(i) is the name of the i-th blob;
    param.layers_size() returns the number of layers in the network.
<4> For each input blob:
    - Allocate a block of memory the same size as the current blob, e.g.
      input_dim=[12 55 66 39 20 24 48 64] means the first blob has the four
      dimensions 12 55 66 39 and the second has 20 24 48 64; blob_pointer then
      points to this block
    - Push blob_pointer into blobs_
      vector<shared_ptr<Blob<Dtype>>> blobs_
    - Push blob_name into blob_names_
      vector<string> blob_names_
    - Push param.force_backward() into blob_need_backward_
      vector<bool> blob_need_backward_
    - Push i into net_input_blob_indices_ (a vector<int>)
    - Push blob_pointer.get() into net_input_blobs_
      Note the difference from blobs_: blobs_ is vector<shared_ptr<Blob<Dtype>>>,
      while net_input_blobs_ is vector<Blob<Dtype>*>; calling .get() on a
      shared_ptr yields a raw Blob*
    - Initialize map<string, int> blob_name_to_idx and set<string>
      available_blobs with the name of each input-layer blob
    - Accumulate the required memory:
      memory_used += blob_pointer->count()
<5> bottom_vecs_ stores the input blob pointers of each layer: vector<vector<Blob<Dtype>*>> bottom_vecs_
    bottom_id_vecs_ stores the input (bottom) ids of each layer: vector<vector<int>> bottom_id_vecs_
    top_vecs_ stores the output (top) blobs of each layer: vector<vector<Blob<Dtype>*>> top_vecs_
    top_id_vecs_ stores the output (top) ids of each layer: vector<vector<int>> top_id_vecs_
    All four are sized with the layer count, param.layers_size()
<6> For the i-th layer (one big for loop):
    - param.layers(i) returns the parameters of the current layer:
      layer_param = param.layers(i)
    - Convert the current layer's parameters to shared_ptr<Layer<Dtype>> and
      push the result into layers_
    - Push the current layer's name into layer_names_:
      vector<string> layer_names_
    - Decide whether the current layer needs backward:
      need_backward = param.force_backward()
    - Now build the current layer, in two steps: handle the bottom blobs,
      then the top blobs.
    For the j-th bottom blob:
    - layer_param.bottom_size() is the number of input blobs of the current layer
    - layer_param.bottom(j) is the name of the j-th input blob
    - Read the current blob's id; blob_name_to_idx was already initialized
      at the input layer: blob_name_to_idx[blob_name] = i
    - Log the current blob's name
    - Store the pointer to the j-th input blob:
      bottom_vecs_[i].push_back(blobs_[blob_id].get())
    - Store the id of the j-th input blob:
      bottom_id_vecs_[i].push_back(blob_id)
    - Update need_backward
    - Remove the j-th blob's name from available_blobs
    For the j-th top blob:
    - layer_param.top_size() is the number of output blobs of the current layer
    - layer_param.top(j) is the name of the j-th output blob
    - Check whether computation is in-place (bottom and top share a blob)
    - Log the current blob's name
    - Allocate a new blob and point blob_pointer at it
    - Store this pointer in blobs_
    - Store blob_name, force_backward, and idx in their respective containers
    - Insert the current blob's name into available_blobs
    - Push the current blob's pointer into top_vecs_[i]
    - Push the current blob's id into top_id_vecs_[i]
    - Log the information of the current layer's top blobs
    - Accumulate the required memory
    - Decide whether the current layer i needs backward
<7> Every blob whose name is still in available_blobs is an output blob of
    the whole network; store them in net_output_blobs_
<8> Build the name-to-index map for each blob: blob_names_index_
<9> Build the name-to-index map for each layer: layer_names_index_
<10> Call GetLearningRateAndWeightDecay()
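The two maps built in steps <8> and <9> back Net's by-name accessors (has_blob()/blob_by_name(), has_layer()/layer_by_name()). A small usage sketch; the model file "deploy.prototxt" and the name "conv1" are hypothetical, and the Net constructor signature varies across Caffe versions:
// Look up a blob and a layer by name; both accessors consult the
// blob_names_index_ / layer_names_index_ maps built during Init().
Net<float> net("deploy.prototxt", caffe::TEST);  // hypothetical model file
if (net.has_blob("conv1")) {
  const shared_ptr<Blob<float> > feat = net.blob_by_name("conv1");
  LOG(INFO) << "conv1 output count: " << feat->count();
}
if (net.has_layer("conv1")) {
  const shared_ptr<Layer<float> > conv = net.layer_by_name("conv1");
  LOG(INFO) << "conv1 weight count: " << conv->blobs()[0]->count();
}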
3. Solver
There is comparatively little to say here, so rather than starting a new post I'll write it in this one.
The solver:
(1) creates the training network and the test networks used for evaluation;
(2) iteratively optimizes by calling forward/backward and updating the parameters;
(3) periodically evaluates the test networks.
In each iteration it:
(1) calls the network's forward to compute the output and the loss;
(2) calls the network's backward to compute the gradients;
(3) incorporates the gradients into the parameter update according to the solver method;
(4) updates the solver state according to the learning rate, history, and method.
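To tie these steps together, here is a condensed sketch of the solver's training loop. It is a paraphrase, not verbatim: names and structure vary across Caffe versions (older ones use ComputeUpdateValue() plus net_->Update(), newer ones a single ApplyUpdate(), and ForwardBackward()'s signature has changed over time as well):
// One training iteration, mirroring steps (1)-(4) above.
while (iter_ < param_.max_iter()) {
  // (1)+(2): forward computes the output and loss,
  //          backward computes the gradients.
  Dtype loss = net_->ForwardBackward();
  // (3): compute the solver-specific update (e.g. SGD with momentum)
  //      into the parameter diffs, then apply it (data -= diff).
  ComputeUpdateValue();
  net_->Update();
  // (4): the learning-rate schedule and the update history are advanced
  //      inside ComputeUpdateValue() as part of the above.
  ++iter_;
}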