Understanding the Caffe Layer Library

Convolution layer  

# convolution
layer {
  name: "loss1/conv"
  type: "Convolution"
  bottom: "loss1/ave_pool"
  top: "loss1/conv"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 128
    kernel_size: 1
    stride: 1 # default: 1
    pad: 1
    weight_filler {
      # xavier type
      type: "xavier"

      # gaussian type
      #type: "gaussian"
      #std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
  }
}

Parameters: bottom (input), top (output), num_output (number of output channels, i.e. the number of filters), kernel_size (size of the convolution kernel), stride (step size), pad (padding). The output feature map size is out_h = (image_h + 2*pad_h - kernel_h)/stride_h + 1 and out_w = (image_w + 2*pad_w - kernel_w)/stride_w + 1.
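The size formula above can be checked with a minimal sketch (Caffe uses floor division here):

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a Caffe convolution (floor division)."""
    return (size + 2 * pad - kernel) // stride + 1

# A 1x1 kernel with pad 1 (as in loss1/conv above) grows each side by 2:
print(conv_out(4, kernel=1, stride=1, pad=1))    # → 6
# A same-padded 3x3 convolution preserves the size:
print(conv_out(224, kernel=3, stride=1, pad=1))  # → 224
```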

Deconvolution

layer {
  name: "score2"
  type: "Deconvolution"
  bottom: "score"
  top: "score2"
  param {
    lr_mult: 1
  }
  convolution_param {
    num_output: 21
    kernel_size: 4
    stride: 2
    weight_filler: { type: "bilinear" }
  }
}

The parameters are the same as for convolution; the output size is out_h = (in_h - 1) * stride_h + kernel_h and out_w = (in_w - 1) * stride_w + kernel_w.
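A quick sketch of the deconvolution size formula, applied to the score2 layer above (kernel 4, stride 2):

```python
def deconv_out(size, kernel, stride):
    """Spatial output size of a Caffe deconvolution (no padding)."""
    return (size - 1) * stride + kernel

# kernel_size 4, stride 2 slightly more than doubles the input side:
print(deconv_out(16, kernel=4, stride=2))  # → 34
```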

Dilation Convolution

layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 2
    dilation: 2 # with kernel_size 3, pad = dilation preserves the spatial size
  }
}

Compared with an ordinary convolution layer there is one extra parameter, dilation, which specifies the hole size and gives the layer its dilated (atrous) convolution behaviour.

Pooling

max pool

layer {
  name: "pool1_3x3_s2"
  type: "Pooling"
  bottom: "conv1_3_3x3"
  top: "pool1_3x3_s2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
    pad: 1
  }
}

The parameters are similar to those of the convolution layer.
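One Caffe-specific detail worth noting: pooling output sizes are rounded up (ceil), while convolution rounds down (floor), and when padding is used the last window is clipped so it starts inside the padded input. A minimal sketch of that calculation:

```python
import math

def pool_out(size, kernel, stride, pad=0):
    """Caffe pooling output size: ceil, plus the clip applied when pad > 0."""
    out = math.ceil((size + 2 * pad - kernel) / stride) + 1
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1  # last window must start strictly inside the padded image
    return out

print(pool_out(112, kernel=3, stride=2, pad=1))  # → 57
```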

ave pool

layer {
  name: "conv5_3_pool1"
  type: "Pooling"
  bottom: "conv5_3"
  top: "conv5_3_pool1"
  pooling_param {
    pool: AVE
    kernel_size: 60
    stride: 60
  }
}

Upsample

layer {
  name: "upsample4"
  type: "Upsample"
  bottom: "conv5_1_D"
  bottom: "pool4_mask"
  top: "pool4_D"
  upsample_param {
    scale: 2
    upsample_w: 60
    upsample_h: 45
  }
}

This layer adds the parameters upsample_w and upsample_h; when they are set, the output feature map has size upsample_w × upsample_h. The second bottom (pool4_mask) is the mask produced by the corresponding max-pooling layer.
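Conceptually this is SegNet-style max-unpooling: each pooled value is scattered back to the position recorded in the pooling mask, and everything else is zero. A NumPy sketch (unpool_with_mask is a hypothetical helper, not Caffe's API):

```python
import numpy as np

def unpool_with_mask(x, mask, out_h, out_w):
    """Scatter each pooled value to the flat index stored in the mask."""
    out = np.zeros(out_h * out_w, dtype=x.dtype)
    out[mask.ravel()] = x.ravel()
    return out.reshape(out_h, out_w)

# One 1x1 pooled value whose maximum came from flat index 3 of a 2x2 window:
x = np.array([[5.0]])
mask = np.array([[3]])
print(unpool_with_mask(x, mask, 2, 2))
```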

Eltwise

layer {
    bottom: "conv4_3"
    bottom: "res_conv4"
    top: "fusion_res_cov4"
    name: "fusion_res_cov4"
    type: "Eltwise"
    eltwise_param { operation: SUM } # PROD SUM MAX
} 

Combines two feature maps element-wise; the layer takes two bottoms and one top, and operation may be PROD, SUM, or MAX.
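The three operations correspond directly to element-wise NumPy operations; a minimal sketch (the bottoms must have identical shapes):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 0.5, 3.0])

print(a + b)             # SUM
print(a * b)             # PROD
print(np.maximum(a, b))  # MAX
```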

Concat

layer {
  name: "inception_4a/output"
  type: "Concat"
  bottom: "inception_4a/1x1"
  bottom: "inception_4a/3x3"
  bottom: "inception_4a/5x5"
  bottom: "inception_4a/pool_proj"
  top: "inception_4a/output"
}

When widening a model it is common to merge several branches into the input of subsequent layers; the layer takes multiple bottoms (at least two) and one top.
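Concat stacks its bottoms along one axis (channels by default), so the output channel count is the sum of the inputs'. A sketch with the four Inception branches above, using illustrative channel counts:

```python
import numpy as np

# Four branch outputs sharing N, H, W but with different channel counts:
branches = [np.zeros((1, c, 7, 7)) for c in (64, 128, 32, 32)]
out = np.concatenate(branches, axis=1)  # Caffe's Concat defaults to axis 1
print(out.shape)  # → (1, 256, 7, 7)
```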

InnerProduct

layer {
  name: "imagenet_fc"
  type: "InnerProduct"
  bottom: "fc7"
  top: "imagenet_fc"
  param {
    lr_mult: 1
    decay_mult: 250
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: ${NUM_LABELS}
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0.7
    }
  }
}

The inner_product layer is the fully connected layer: as the diagram below illustrates, every output is connected to all of the inputs.
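Mathematically the layer is just a matrix multiply plus bias; a sketch with illustrative dimensions (4096 inputs as in fc7, 1000 outputs standing in for ${NUM_LABELS}):

```python
import numpy as np

x = np.random.randn(8, 4096)     # batch of flattened fc7 features
W = np.random.randn(1000, 4096)  # num_output x input_dim weight blob
b = np.zeros(1000)               # bias blob

y = x @ W.T + b                  # every output connects to every input
print(y.shape)  # → (8, 1000)
```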


Dropout

layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}

The parameter dropout_ratio is the probability that a unit is dropped (zeroed) during training, not the probability that it is kept.
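Caffe uses inverted dropout: at training time the surviving units are rescaled by 1/(1 - ratio) so that the layer is an identity at test time. A minimal sketch:

```python
import numpy as np

def dropout_train(x, ratio=0.5, seed=0):
    """Zero each unit with probability `ratio`; rescale the survivors."""
    rng = np.random.default_rng(seed)
    keep = rng.random(x.shape) >= ratio
    return x * keep / (1.0 - ratio)

y = dropout_train(np.ones(8), ratio=0.5)
print(y)  # each entry is 0.0 (dropped) or 2.0 (kept and rescaled)
```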

Batch Normalization

# BatchNorm2  
layer {
  name: "BatchNorm2" 
  type: "BatchNorm"
  include { phase: TRAIN }
  bottom: "Concat1"
  top: "BatchNorm2"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  batch_norm_param {
    use_global_stats: false
  }
}
# BatchNorm
layer {
  name: "bn3"
  type: "BatchNorm"
  bottom: "conv3"
  top: "bn3"
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
# BN
layer {
  name: "spp3_bn"
  type: "BN"
  bottom: "conv_spp_3_ave_pool"
  top: "spp3_bn"
  param {
    lr_mult: 1
    decay_mult: 0
  }
  param {
    lr_mult: 1
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  bn_param {
    slope_filler {
      type: "constant"
      value: 1
    }
    bias_filler {
      type: "constant"
      value: 0
    }
    frozen: true
    momentum: 0.95
  }
}
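A note on the examples above: Caffe's built-in BatchNorm layer only normalizes; its three blobs hold the running mean, running variance, and a moving-average factor, which is why their lr_mult is 0. The learnable affine part (gamma/beta) normally lives in a separate Scale layer (or, in forks that provide a BN layer, in slope_filler/bias_filler). A sketch of the two stages:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch axis (Caffe BatchNorm, train mode)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

def scale(x_hat, gamma, beta):
    """The learnable affine transform usually provided by a Scale layer."""
    return gamma * x_hat + beta

x = np.arange(12.0).reshape(4, 3)
x_hat = batch_norm(x)
print(scale(x_hat, gamma=2.0, beta=1.0))
```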

LRN

layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}

Caffe has an LRN layer, short for Local Response Normalization.

Its parameters are:

norm_region: whether to normalize across adjacent channels or over a spatial region within a channel; the default is ACROSS_CHANNELS, i.e. across-channel normalization.

local_size: has two interpretations: (1) for across-channel normalization, the number of channels to sum over; (2) for within-channel normalization, the side length of the region being normalized. The default is 5.

alpha: scaling factor, default 1.

beta: exponent, default 0.75.

The LRN layer performs a kind of "lateral inhibition" by normalizing over local input regions.

In across-channel mode, the local region spans adjacent channels but has no spatial extent (its shape is local_size x 1 x 1).

In within-channel mode, the local region lies inside the current channel and extends spatially (its shape is 1 x local_size x local_size).

Each input value is divided by (1 + (alpha/n) * Σ_i x_i²)^beta, where n is the local size (local_size) and alpha and beta are as defined above. The sum runs over the local region centered on the current position (zero-padded where necessary).
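The across-channel formula can be written as a short sketch (x has shape C x H x W; k is the additive constant, 1 by default):

```python
import numpy as np

def lrn_across_channels(x, local_size=5, alpha=1.0, beta=0.75, k=1.0):
    """Across-channel LRN: divide each value by (k + alpha/n * sum x_i^2)^beta."""
    C = x.shape[0]
    half = local_size // 2
    out = np.empty_like(x)
    for c in range(C):
        lo, hi = max(0, c - half), min(C, c + half + 1)
        sq_sum = (x[lo:hi] ** 2).sum(axis=0)  # sum over the channel window
        out[c] = x[c] / (k + alpha / local_size * sq_sum) ** beta
    return out

# With local_size 1 and x = 1 everywhere, each value becomes 1 / 2^0.75:
print(lrn_across_channels(np.ones((1, 1, 1)), local_size=1))
```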

Slice

layer {
  name: "slice"
  type: "Slice"
  bottom: "input"
  top: "output1"
  top: "output2"
  top: "output3"
  top: "output4"
  slice_param {
    axis: 1
    slice_point: 1
    slice_point: 3
    slice_point: 4
  }
}
Here, assuming input has dimensions N*5*H*W, the tops have dimensions N*1*H*W, N*2*H*W, N*1*H*W, and N*1*H*W respectively.
Note that when slice_point is given, the number of slice_point entries must equal the number of tops minus one.
axis is the dimension along which to split.
slice_point splits axis at the given positions.

When slice_point is not set, axis is split evenly.
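The slice_point semantics match NumPy's split-at-indices behaviour; a sketch of the example above:

```python
import numpy as np

x = np.zeros((2, 5, 4, 4))              # N*5*H*W input
parts = np.split(x, [1, 3, 4], axis=1)  # slice_point: 1, 3, 4 on axis 1

print([p.shape[1] for p in parts])  # → [1, 2, 1, 1]
```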

Label interpolation

Threshold

layer {
  name: "threshold"
  type: "Threshold"
  bottom: "soft_prob_s1"
  top: "threshold"
  threshold_param {  
    threshold: 1e-36
  }
}
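Caffe's Threshold layer is a step function: it outputs 1 where the input exceeds the threshold and 0 elsewhere (it does not pass the input through). A minimal sketch:

```python
import numpy as np

def threshold(x, t=1e-36):
    """Caffe Threshold layer: 1 where x > t, else 0."""
    return (x > t).astype(x.dtype)

print(threshold(np.array([-1.0, 0.0, 0.5])))  # → [0. 0. 1.]
```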

SigmoidGateLayer

layer {
  name: "gate"
  type: "SigmoidGate"
  bottom: "soft_prob_s1"
  top: "gate"
  gate_param {  
    threshold: 0.5
  }
}

ReLU

layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}

PReLU

layer {
  name: "relu6"
  bottom: "fc6"
  top: "relu6"
  type: "PReLU"
  prelu_param { 
    filler {
      type: "constant" 
      value: 0.3 
      } 
    channel_shared: false 
  }
}
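PReLU behaves like ReLU for positive inputs but multiplies negative inputs by a learned slope a (initialized to 0.3 by the filler above); channel_shared: false means one slope per channel. A sketch of the forward pass:

```python
import numpy as np

def prelu(x, a):
    """PReLU: identity for x > 0, slope `a` for x < 0."""
    return np.maximum(0, x) + a * np.minimum(0, x)

print(prelu(np.array([-2.0, 3.0]), a=0.3))  # negative input scaled by 0.3
```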

interpolation

layer {
  bottom: "input"
  top: "output"
  name: "interp_layer"
  type: "Interp"
  interp_param {
    shrink_factor: 4
    zoom_factor: 3
    pad_beg: 0
    pad_end: 0
  }
}
// Read the input height and width
height_in_ = bottom[0]->height();
width_in_ = bottom[0]->width();

// Size adjusted by the padding parameters; pad_beg and pad_end may only be 0 or negative.
height_in_eff_ = height_in_ + pad_beg_ + pad_end_;
width_in_eff_ = width_in_ + pad_beg_ + pad_end_;

// Interp shrinks first, then zooms
height_out_ = (height_in_eff_ - 1) / shrink_factor + 1;
width_out_ = (width_in_eff_ - 1) / shrink_factor + 1;
height_out_ = height_out_ + (height_out_ - 1) * (zoom_factor - 1);
width_out_ = width_out_ + (width_out_ - 1) * (zoom_factor - 1);
The overall idea is to shrink first, then zoom back up.
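The size computation above translates directly into a small sketch, using the parameters from the prototxt (shrink_factor 4, zoom_factor 3):

```python
def interp_out(size, shrink_factor=4, zoom_factor=3, pad_beg=0, pad_end=0):
    """Interp output size: shrink by integer division, then zoom."""
    eff = size + pad_beg + pad_end
    out = (eff - 1) // shrink_factor + 1        # shrink first ...
    return out + (out - 1) * (zoom_factor - 1)  # ... then zoom

print(interp_out(60))  # → 43
```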

Reference:

https://blog.csdn.net/grief_of_the_nazgul/article/details/62043799









