Using relu and crelu

I didn't really understand crelu before, casually swapped the relu in my network for crelu, and then spent half a day chasing bugs.
(Writing my own bugs and then debugging my own bugs: stuck in an infinite loop.)

First, take a look at this piece of code:

import tensorflow as tf
import collections
slim = tf.contrib.slim

weights_initializer = tf.contrib.layers.xavier_initializer(uniform=True)
biases_initializer = tf.random_uniform_initializer(-0.01, 0.01)
activation_fn = tf.nn.relu
is_bn_training = True
reuse = False

with slim.arg_scope([slim.conv2d], padding='VALID', stride=[2, 1], weights_initializer=weights_initializer,
                                biases_initializer=biases_initializer, activation_fn=None, reuse=reuse):
    with slim.arg_scope([slim.batch_norm], decay=0.9997,
                            center=True, scale=True, epsilon=1e-5, activation_fn=activation_fn,
                            is_training=is_bn_training, reuse=reuse):

        # inputs: (batch_size, height, width, channels)
        inputs = tf.placeholder(tf.float32, (None, None, 1, 1))
        trn_net = tf.pad(inputs, [[0, 0], [32, 32], [0, 0], [0, 0]])
        # slim.conv2d(input, number of output feature maps, kernel size); stride=[2, 1] comes from the arg_scope
        trn_net = slim.conv2d(trn_net, 16, [64, 1], scope='conv1')
        trn_net = slim.batch_norm(trn_net, scope='bnorm1')
        trn_net = slim.max_pool2d(trn_net, [8, 1], scope='pool1', stride=[8, 1])

        trn_net = tf.pad(trn_net, [[0, 0], [16, 16], [0, 0], [0, 0]])
        trn_net = slim.conv2d(trn_net, 32, [32, 1], scope='conv2')
        trn_net = slim.batch_norm(trn_net, scope='bnorm2')
        for var in tf.global_variables():
            print(var)

Variables printed when the activation function is relu:

<tf.Variable 'conv1/weights:0' shape=(64, 1, 1, 16) dtype=float32_ref>
<tf.Variable 'conv1/biases:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/beta:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/gamma:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/moving_mean:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/moving_variance:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'conv2/weights:0' shape=(32, 1, 16, 32) dtype=float32_ref>  # shape: (kernel_height, kernel_width, num_input_feature_maps, num_output_feature_maps)
<tf.Variable 'conv2/biases:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/beta:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/gamma:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/moving_mean:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/moving_variance:0' shape=(32,) dtype=float32_ref>

Variables printed when the activation function is changed to crelu:

<tf.Variable 'conv1/weights:0' shape=(64, 1, 1, 16) dtype=float32_ref>
<tf.Variable 'conv1/biases:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/beta:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/gamma:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/moving_mean:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'bnorm1/moving_variance:0' shape=(16,) dtype=float32_ref>
<tf.Variable 'conv2/weights:0' shape=(32, 1, 32, 32) dtype=float32_ref>  # shape: (kernel_height, kernel_width, num_input_feature_maps, num_output_feature_maps)
<tf.Variable 'conv2/biases:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/beta:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/gamma:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/moving_mean:0' shape=(32,) dtype=float32_ref>
<tf.Variable 'bnorm2/moving_variance:0' shape=(32,) dtype=float32_ref>

The two configurations have different parameter shapes: with relu, conv2/weights has shape (32, 1, 16, 32), i.e. 32 × 1 × 16 × 32 = 16,384 parameters, while with crelu it has shape (32, 1, 32, 32), i.e. 32,768 parameters. Be especially careful about this when fine-tuning a network from a pretrained checkpoint, otherwise restoring the weights will keep failing.
When using CReLU, deliberately halve the number of filters (see the sketch below); otherwise the number of feature maps fed to the following layer doubles and the network's parameter count grows.
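
A minimal sketch of that advice, assuming the same TF 1.x / tf.contrib.slim setup as in the code above (use_crelu and num_filters are illustrative names, not from the original code): halve the filter count of the convolution that feeds the CReLU, so that the next layer sees the same number of channels as in the ReLU version.

import tensorflow as tf

slim = tf.contrib.slim

use_crelu = True  # hypothetical switch between the two activations
activation_fn = tf.nn.crelu if use_crelu else tf.nn.relu
# halve the filters when using CReLU so the block still outputs 16 feature maps
num_filters = 8 if use_crelu else 16

inputs = tf.placeholder(tf.float32, (None, None, 1, 1))
net = slim.conv2d(inputs, num_filters, [64, 1], stride=[2, 1], padding='VALID',
                  activation_fn=None, scope='conv1')
net = slim.batch_norm(net, activation_fn=activation_fn, scope='bnorm1')
print(net.get_shape())  # last dimension is 16 whether use_crelu is True or False

Note that the BN parameters themselves shrink from shape (16,) to (8,) in the CReLU version; what stays constant is the number of feature maps handed to the next convolution.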

Now let's look at how CReLU [1] actually works.
The paper experiments with AlexNet on the CIFAR dataset and inspects the distribution of the learned filters in each layer: in the lower layers the filters tend to appear in pairs with roughly opposite phases, so applying CReLU in the lower layers of the network yields a clear improvement.

The input to CReLU here is the 16 feature maps coming out of the first BN layer, and its output is features = concat([features, -features]) passed through ReLU, so the number of feature maps is doubled. The 16 feature maps after the first BN therefore become 32, which is why the weights of the second convolution have shape (32, 1, 32, 32).
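
A minimal sketch (again assuming TF 1.x) that checks this behaviour directly: tf.nn.crelu should equal the concatenation of relu(x) and relu(-x) along the channel axis, and its output should have twice as many channels as its input.

import numpy as np
import tensorflow as tf

# 16 input feature maps, as after the first BN layer above
x = tf.placeholder(tf.float32, (None, 8, 1, 16))
crelu_out = tf.nn.crelu(x)                                      # channels double: 16 -> 32
manual_out = tf.concat([tf.nn.relu(x), tf.nn.relu(-x)], axis=-1)

with tf.Session() as sess:
    feed = {x: np.random.randn(2, 8, 1, 16).astype(np.float32)}
    a, b = sess.run([crelu_out, manual_out], feed_dict=feed)
    print(a.shape)            # (2, 8, 1, 32)
    print(np.allclose(a, b))  # True: crelu(x) == concat([relu(x), relu(-x)], axis=-1)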

[1] Shang W, Sohn K, Almeida D, et al. Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units. International Conference on Machine Learning (ICML), 2016: 2217-2225.
