Pooling

Deep learning: Part 47 (A Simple Look at Stochastic Pooling)

 

  In a CNN, convolution is followed by a pooling step. At ICLR 2013, Zeiler proposed an alternative to the two most common schemes, mean-pooling and max-pooling, called stochastic pooling; his paper also gives a probability-weighted pooling method whose results are slightly worse.

  The stochastic pooling method is very simple: pick an element of the pooling region at random, with probability proportional to its value, so larger elements are more likely to be chosen. Unlike max-pooling, it does not always take the single largest element.

  Suppose a pooling region of a feature map contains the following 3×3 block of values:

  0.0  1.1  2.5
  0.9  2.0  1.0
  0.0  1.5  1.0

  Its elements sum to sum = 0 + 1.1 + 2.5 + 0.9 + 2.0 + 1.0 + 0 + 1.5 + 1.0 = 10.

  Dividing every element by sum gives the probability matrix:

  0.00  0.11  0.25
  0.09  0.20  0.10
  0.00  0.15  0.10

  Each entry is the probability of picking the value at the corresponding position. All we need is to select one position at random according to these probabilities: treat them as a multinomial distribution over the 9 positions and draw a single sample from it (Theano provides a multinomial() function for exactly this). Alternatively, sample it yourself from a uniform(0, 1) draw: split the unit interval into 9 sub-intervals whose lengths equal the 9 probabilities (the larger the probability, the longer its interval, one interval per position), generate a random number, and see which interval it falls into. Both approaches are sketched in the code just after this example.

  For example, if the sampled one-hot matrix is:

  0  0  0
  0  0  0
  0  1  0

  then the pooled value is 1.5.
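  To make the sampling step concrete, here is a minimal NumPy sketch of both approaches (NumPy stands in for Theano's multinomial() here, and the variable names are mine; region holds the example values above):

import numpy as np

rng = np.random.default_rng(0)
region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])
probs = region / region.sum()  # the probability matrix above

# Method 1: a single multinomial draw gives a one-hot 3x3 mask
mask = rng.multinomial(1, probs.ravel()).reshape(3, 3)
pooled = (region * mask).sum()  # e.g. 1.5 when the bottom-middle cell is drawn

# Method 2: split [0, 1] into 9 intervals whose lengths are the
# probabilities, then see where a uniform draw lands
u = rng.uniform()
idx = np.searchsorted(np.cumsum(probs.ravel()), u)
pooled_alt = region.ravel()[idx]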

  At test time, inference with stochastic pooling is also simple: take the probability-weighted average over the region. For the example above, the computation is:

     0*0 + 1.1*0.11 + 2.5*0.25 + 0.9*0.09 + 2.0*0.2 + 1.0*0.1 + 0*0 + 1.5*0.15 + 1.0*0.1 = 1.652, so the test-time pooled result for this small region is 1.652.
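  A quick NumPy check of this arithmetic (since each probability is the value divided by the region sum, the weighted average also equals (region**2).sum() / region.sum()):

import numpy as np

region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])
probs = region / region.sum()
print((region * probs).sum())  # 1.652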

  For backpropagation, the forward pass must record which position was selected; the incoming gradient flows only to that position, and every other position gets a gradient of 0. This is very similar to the backward pass of max-pooling; a sketch follows.
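  A minimal sketch of this rule for a single region, assuming the sampled index is recorded during the forward pass (hypothetical helper code, not taken from the paper or from pylearn2):

import numpy as np

rng = np.random.default_rng(0)
region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])
probs = region / region.sum()

# forward: sample a position and remember it for the backward pass
idx = rng.choice(region.size, p=probs.ravel())
out = region.ravel()[idx]

# backward: the upstream gradient flows only to the sampled position
grad_out = 1.0  # example upstream gradient
grad_region = np.zeros(region.size)
grad_region[idx] = grad_out
grad_region = grad_region.reshape(region.shape)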

  Advantages of stochastic pooling:

  the method is simple;

  it generalizes better;

  it can be used in convolutional layers (the paper compares it with Dropout and DropConnect, arguing that those two are not well suited to convolutional layers; personally I feel the comparison is a stretch, since they act on different parts of the network);

  As for why stochastic pooling works well, the authors say the method is itself a form of model averaging (each forward pass samples one of many possible pooling configurations, and the test-time weighted average approximates averaging over all of them), though I did not entirely follow the argument.

  Reference code for the forward (training) pass and the inference pass of stochastic pooling follows (the backward pass is not included, so the code does not save the positions selected during pooling).

  Source code: pylearn2/stochastic_pool.py

"""
An implementation of stochastic max-pooling, based on

Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
Matthew D. Zeiler, Rob Fergus, ICLR 2013
"""

__authors__ = "Mehdi Mirza"
__copyright__ = "Copyright 2010-2012, Universite de Montreal"
__credits__ = ["Mehdi Mirza", "Ian Goodfellow"]
__license__ = "3-clause BSD"
__maintainer__ = "Mehdi Mirza"
__email__ = "mirzamom@iro"

import numpy
import theano
from theano import tensor
from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams
from theano.gof.op import get_debug_values

def stochastic_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):
    """
    Stochastic max pooling for training as defined in:

    Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
    Matthew D. Zeiler, Rob Fergus

    bc01: minibatch in format (batch size, channels, rows, cols),
        IMPORTANT: All values should be positive
    pool_shape: shape of the pool region (rows, cols)
    pool_stride: strides between pooling regions (row stride, col stride)
    image_shape: avoid doing some of the arithmetic in theano
    rng: theano random stream
    """
    r, c = image_shape
    pr, pc = pool_shape
    rs, cs = pool_stride

    batch = bc01.shape[0] # number of examples in the minibatch
    channel = bc01.shape[1] # number of channels

    if rng is None:
        rng = RandomStreams(2022)

    # Compute index in pooled space of last needed pool
    # (needed = each input pixel must appear in at least one pool)
    def last_pool(im_shp, p_shp, p_strd):
        rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))
        assert p_strd * rval + p_shp >= im_shp
        assert p_strd * (rval - 1) + p_shp < im_shp
        return rval # number of pool strides needed, i.e. index of the last pool

    # Compute starting row of the last pool
    last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0] # starting row of the last pool
    # Compute number of rows needed in image for all indexes to work out
    required_r = last_pool_r + pr # image height required so that every pool fits

    last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]
    required_c = last_pool_c + pc

    # final result shape
    res_r = int(numpy.floor(last_pool_r/rs)) + 1 # shape of the pooled output
    res_c = int(numpy.floor(last_pool_c/cs)) + 1

    for bc01v in get_debug_values(bc01):
        assert not numpy.any(numpy.isinf(bc01v))
        assert bc01v.shape[2] == image_shape[0]
        assert bc01v.shape[3] == image_shape[1]

    # padding: if the pools do not tile the image exactly, zero-pad it
    padded = tensor.alloc(0.0, batch, channel, required_r, required_c)
    name = bc01.name
    if name is None:
        name = 'anon_bc01'
    bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)
    bc01.name = 'zero_padded_' + name

    # unraveling
    window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)
    window.name = 'unravelled_windows_' + name

    for row_within_pool in xrange(pool_shape[0]):
        row_stop = last_pool_r + row_within_pool + 1
        for col_within_pool in xrange(pool_shape[1]):
            col_stop = last_pool_c + col_within_pool + 1
            win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]
            window  =  tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell) # window holds every pooling region as a (pr, pc) block

    # find the norm
    norm = window.sum(axis = [4, 5]) # per-region sum, used as the denominator
    norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm) # where the sum is 0, use 1 to avoid division by zero
    norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x') # divide by the sum to get per-position probabilities (norm is reused to hold them)
    # get prob
    prob = rng.multinomial(pvals = norm.reshape((batch * channel * res_r * res_c, pr * pc)), dtype='float32') # multinomial() draws one sample per row of pvals; each row of the result is a one-hot 0/1 vector
    # select
    res = (window * prob.reshape((batch, channel, res_r, res_c,  pr, pc))).max(axis=5).max(axis=4) # elementwise product with the one-hot mask; the two max() calls then extract the selected value from each region
    res.name = 'pooled_' + name

    return tensor.cast(res, theano.config.floatX)

def weighted_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):
    """
    This implements test time probability weighted pooling defined in:

    Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
    Matthew D. Zeiler, Rob Fergus

    bc01: minibatch in format (batch size, channels, rows, cols),
        IMPORTANT: All values should be positive
    pool_shape: shape of the pool region (rows, cols)
    pool_stride: strides between pooling regions (row stride, col stride)
    image_shape: avoid doing some of the arithmetic in theano
    """
    r, c = image_shape
    pr, pc = pool_shape
    rs, cs = pool_stride

    batch = bc01.shape[0]
    channel = bc01.shape[1]
    if rng is None:
        rng = RandomStreams(2022)

    # Compute index in pooled space of last needed pool
    # (needed = each input pixel must appear in at least one pool)
    def last_pool(im_shp, p_shp, p_strd):
        rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))
        assert p_strd * rval + p_shp >= im_shp
        assert p_strd * (rval - 1) + p_shp < im_shp
        return rval
    # Compute starting row of the last pool
    last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0]
    # Compute number of rows needed in image for all indexes to work out
    required_r = last_pool_r + pr

    last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]
    required_c = last_pool_c + pc

    # final result shape
    res_r = int(numpy.floor(last_pool_r/rs)) + 1
    res_c = int(numpy.floor(last_pool_c/cs)) + 1

    for bc01v in get_debug_values(bc01):
        assert not numpy.any(numpy.isinf(bc01v))
        assert bc01v.shape[2] == image_shape[0]
        assert bc01v.shape[3] == image_shape[1]

    # padding
    padded = tensor.alloc(0.0, batch, channel, required_r, required_c)
    name = bc01.name
    if name is None:
        name = 'anon_bc01'
    bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)
    bc01.name = 'zero_padded_' + name

    # unraveling
    window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)
    window.name = 'unravelled_windows_' + name

    for row_within_pool in xrange(pool_shape[0]):
        row_stop = last_pool_r + row_within_pool + 1
        for col_within_pool in xrange(pool_shape[1]):
            col_stop = last_pool_c + col_within_pool + 1
            win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]
            window  =  tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell)

    # find the norm
    norm = window.sum(axis = [4, 5])
    norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm)
    norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x')
    # average
    res = (window * norm).sum(axis=[4,5]) # identical to the forward-pass code up to here; the only change is taking the probability-weighted sum
    res.name = 'pooled_' + name

    return res.reshape((batch, channel, res_r, res_c))
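  For completeness, here is a minimal usage sketch of the two functions above. It assumes a Python 2 environment with Theano installed (the listing uses xrange and Theano's sandbox random streams); the 9×9 input and the non-overlapping 3×3 pools are arbitrary choices for illustration:

import numpy
import theano
from theano import tensor

x = tensor.tensor4('x')  # (batch, channels, rows, cols); values must be non-negative
train_pool = stochastic_max_pool_bc01(x, pool_shape=(3, 3), pool_stride=(3, 3), image_shape=(9, 9))
test_pool = weighted_max_pool_bc01(x, pool_shape=(3, 3), pool_stride=(3, 3), image_shape=(9, 9))

f_train = theano.function([x], train_pool)
f_test = theano.function([x], test_pool)

inp = numpy.random.rand(2, 4, 9, 9).astype(theano.config.floatX)
print(f_train(inp).shape)  # (2, 4, 3, 3) -- one random draw per pooling region
print(f_test(inp).shape)   # (2, 4, 3, 3) -- probability-weighted averages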

  References:

  Matthew D. Zeiler, Rob Fergus. Stochastic Pooling for Regularization of Deep Convolutional Neural Networks. ICLR 2013.

  pylearn2/stochastic_pool.py

Author: tornadomeet  Source: http://www.cnblogs.com/tornadomeet  Reposting and sharing are welcome, but please credit the source. (Sina Weibo: tornadomeet, happy to chat!)