Running a sklearn-theano example: benchmark (various convolution parameter settings)

The benchmark example measures the time each layer's transform takes, along with each layer's output. Running it and analyzing the output is a good way to understand the structure of each layer of the network.

How to run

python plot_overfeat_benchmark.py 

The output is shown below. (The whole network has only 12 layers, 0-11.)
Per-layer running times are reported; layers 7-11 take the most time. (Note that the reported times increase with depth, so each figure appears to cover the transform from the input up to and including that layer.)

Analysis

Looking at the program's output: the test uses 5 images, so the first dimension of output.shape is always 5.

The asirra image set has two classes: cat and dog. Image sizes vary, roughly around 500×300.
The asirra images are already resized at load time: X[count] = np.array(im.resize((231, 231))). So each input to the CNN is 231×231×3 = 160083 values.
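The resize step from the loader can be sketched as follows (a minimal stand-alone sketch using Pillow; the blank image is a stand-in for a loaded cat/dog photo):

```python
import numpy as np
from PIL import Image

# Stand-in for a loaded asirra photo of arbitrary size (~500x300).
im = Image.new("RGB", (500, 300))

# The loader resizes every image to 231x231 before stacking into X.
arr = np.array(im.resize((231, 231)))

assert arr.shape == (231, 231, 3)   # height x width x RGB
assert arr.size == 160083           # 231 * 231 * 3
```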

Input shape: (5, 231, 231, 3)

Shape of layer 0 output        (layer 0 is normalization)
(5, 160083)                    5 samples, each image 231*231*3 = 160083
('Time for layer 0', 0.004625082015991211)    takes almost no time

Shape of layer 1 output        (convolution with stride (4, 4), filter shape (96, 3, 11, 11))
(5, 301056)                    56 = int((231 - 11) / 4) + 1, 301056 = 96*56*56
('Time for layer 1', 0.5500462055206299)

Shape of layer 2 output        (maxpool layer, MaxPool((2, 2)))
(5, 75264)                     75264 = 301056 / 4 = 96*28*28
('Time for layer 2', 0.5582330226898193)

Shape of layer 3 output        (convolution with stride (1, 1), filter shape (256, 96, 5, 5))
(5, 147456)                    24 = (28 - 5) + 1, 147456 = 256*24*24
('Time for layer 3', 2.358441114425659)

Shape of layer 4 output        (maxpool layer, MaxPool((2, 2)))
(5, 36864)                     36864 = 147456 / 4 = 256*12*12
('Time for layer 4', 2.3493311405181885)

Shape of layer 5 output        (filter (512, 256, 3, 3), border_mode='full', then the outermost ring is cropped off, back to 12*12)
(5, 73728)                     73728 = 512*12*12
('Time for layer 5', 6.5379478931427)

Shape of layer 6 output        (filter (1024, 512, 3, 3), same 'full' + crop as layer 5, stays 12*12)
(5, 147456)                    147456 = 1024*12*12
('Time for layer 6', 23.23018503189087)

Shape of layer 7 output        (filter (1024, 1024, 3, 3), same 'full' + crop, stays 12*12)
(5, 147456)                    147456 = 1024*12*12
('Time for layer 7', 56.522364139556885)

Shape of layer 8 output        (maxpool layer, MaxPool((2, 2)))
(5, 36864)                     36864 = 1024*6*6; this is the smallest spatial size in the network
('Time for layer 8', 56.03724789619446)

Shape of layer 9 output        (filter (3072, 1024, 6, 6), so the 6*6 maps collapse to 1*1)
(5, 3072)                      3072 = 3072*1*1
('Time for layer 9', 57.496111154556274)

Shape of layer 10 output       (filter (4096, 3072, 1, 1))
(5, 4096)
('Time for layer 10', 58.83445715904236)

Shape of layer 11 output       (filter (1000, 4096, 1, 1), the 1000 class scores)
(5, 1000)
('Time for layer 11', 59.27335500717163)

Shape of layer 12 output       (layers 12-14 repeat the shapes and times of layers 0-2)
(5, 160083)
('Time for layer 12', 0.0028297901153564453)

Shape of layer 13 output
(5, 301056)
('Time for layer 13', 0.5609118938446045)

Shape of layer 14 output
(5, 75264)
('Time for layer 14', 0.5506980419158936)
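The per-layer size arithmetic annotated above can be checked in a few lines (a sketch using only the filter shapes and pooling sizes listed at the end of the post):

```python
# Spatial side length through the small OverFeat network.
# conv_out: 'valid' convolution with a stride; MaxPool((2, 2)) halves the side.
def conv_out(size, k, stride=1):
    return (size - k) // stride + 1

s = 231
s = conv_out(s, 11, 4)        # layer 1: filter 11x11, stride 4 -> 56
assert s == 56
s //= 2                       # layer 2: maxpool -> 28
s = conv_out(s, 5)            # layer 3: filter 5x5 -> 24
assert s == 24
s //= 2                       # layer 4: maxpool -> 12
for _ in range(3):            # layers 5-7: 3x3 conv, border_mode='full'
    s = (s + 3 - 1) - 2       # +2 from 'full', -2 from cropping one ring -> 12
assert s == 12
s //= 2                       # layer 8: maxpool -> 6
s = conv_out(s, 6)            # layer 9: filter 6x6 -> 1
assert s == 1

# Flattened per-sample sizes, matching the output above.
assert 96 * 56 * 56 == 301056
assert 96 * 28 * 28 == 75264
assert 256 * 24 * 24 == 147456
assert 512 * 12 * 12 == 73728
assert 1024 * 6 * 6 == 36864
```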

Notes:

  1. subsample is what other libraries call stride.
    [figure: convolution with stride = 2]

  2. What is a crop border?
    [figure: crop border]

In sklearn-theano, however, cropping means slicing a piece out of the convolution result: for example, the snippet below removes the outermost ring of the feature map.

c = [(1, -1), (1, -1)]  # cropping
self.expression_ = T.nnet.conv2d(self.input_,
                                 self.convolution_filter_,
                                 border_mode=self.border_mode,
                                 subsample=self.subsample_)[:, :, c[0][0]:c[0][1],
                                                                  c[1][0]:c[1][1]]
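The slicing with c = [(1, -1), (1, -1)] simply removes the outermost ring of the feature map, which can be seen with plain NumPy (a toy array standing in for a conv output of shape (batch, channels, h, w)):

```python
import numpy as np

# Toy stand-in for a convolution output: 5 samples, 96 channels, 14x14 maps.
x = np.arange(5 * 96 * 14 * 14).reshape(5, 96, 14, 14)

c = [(1, -1), (1, -1)]  # cropping, as in the snippet above
y = x[:, :, c[0][0]:c[0][1], c[1][0]:c[1][1]]

assert y.shape == (5, 96, 12, 12)   # one ring removed on every side
```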
  3. border_mode options for nnet.conv2d (layer 5 uses 'full'):
    'valid' – only apply filter to complete patches of the image. Generates
    output of shape: image_shape - filter_shape + 1
    'full' – zero-pads image to multiple of filter shape to generate output
    of shape: image_shape + filter_shape - 1
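The two shape rules can be checked with scipy.signal.convolve2d, whose mode argument follows the same conventions (a sketch, not sklearn-theano's actual code path):

```python
import numpy as np
from scipy.signal import convolve2d

img = np.ones((12, 12))
filt = np.ones((3, 3))

valid = convolve2d(img, filt, mode="valid")   # 12 - 3 + 1 = 10
full = convolve2d(img, filt, mode="full")     # 12 + 3 - 1 = 14

assert valid.shape == (10, 10)
assert full.shape == (14, 14)

# 'full' followed by cropping one ring gives back a 12x12 map,
# exactly the trick layers 5-7 use to keep the spatial size fixed.
assert full[1:-1, 1:-1].shape == (12, 12)
```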

(Appendix) The shape of each layer:

0 Standardize(118.380948, 61.896913),
1 Convolution(ws[0], bs[0], subsample=(4, 4),
              activation='relu'),
2 MaxPool((2, 2)),

3 Convolution(ws[1], bs[1], activation='relu'),
4 MaxPool((2, 2)),

5 Convolution(ws[2], bs[2],
              activation='relu',
              cropping=[(1, -1), (1, -1)],
              border_mode='full'),

6 Convolution(ws[3], bs[3],
              activation='relu',
              cropping=[(1, -1), (1, -1)],
              border_mode='full'),

7 Convolution(ws[4], bs[4],
              activation='relu',
              cropping=[(1, -1), (1, -1)],
              border_mode='full'),
8 MaxPool((2, 2)),

9 Convolution(ws[5], bs[5],
              activation='relu'),

10 Convolution(ws[6], bs[6],
               activation='relu'),

11 Convolution(ws[7], bs[7],
               activation='identity')]

(Appendix) The shape of each filter:

SMALL_NETWORK_FILTER_SHAPES = np.array([(96, 3, 11, 11),
                                        (256, 96, 5, 5),
                                        (512, 256, 3, 3),
                                        (1024, 512, 3, 3),
                                        (1024, 1024, 3, 3),
                                        (3072, 1024, 6, 6),
                                        (4096, 3072, 1, 1),
                                        (1000, 4096, 1, 1)])
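A quick sanity check on this list: each filter's input-channel count (second entry) must match the previous filter's output-channel count (first entry), since each convolution consumes the channels the previous one produced:

```python
import numpy as np

SMALL_NETWORK_FILTER_SHAPES = np.array([(96, 3, 11, 11),
                                        (256, 96, 5, 5),
                                        (512, 256, 3, 3),
                                        (1024, 512, 3, 3),
                                        (1024, 1024, 3, 3),
                                        (3072, 1024, 6, 6),
                                        (4096, 3072, 1, 1),
                                        (1000, 4096, 1, 1)])

# Each conv layer's input channels == previous conv layer's output channels.
for prev, cur in zip(SMALL_NETWORK_FILTER_SHAPES, SMALL_NETWORK_FILTER_SHAPES[1:]):
    assert cur[1] == prev[0]

assert SMALL_NETWORK_FILTER_SHAPES[0][1] == 3      # first filter reads RGB
assert SMALL_NETWORK_FILTER_SHAPES[-1][0] == 1000  # last filter emits 1000 scores
```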
