TensorFlow Notes (2): A Simple Guide to Using tf.nn.conv2d and tf.nn.max_pool

Method definition: tf.nn.conv2d

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format="NHWC", dilations=[1,1,1,1], name=None)

Parameters:

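  • input: the data to convolve; must be a `Tensor`.
  • filter: the convolution kernel, also a `Tensor`, with shape = [filter_height, filter_width, in_channels, out_channels], where filter_height is the kernel height, filter_width is the kernel width, in_channels is the number of channels of the input data, and out_channels is the number of kernels.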
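  • strides: the convolution strides (a 1-D tensor of length 4), usually [1, stride_h, stride_w, 1], with the first and last entries set to 1.

    The typical use sets the first (the batch) and last (the depth) stride to 1, i.e. strides = [1, x, y, 1].

    A batch stride of 1 means stepping one sample at a time, and a depth stride of 1 means stepping one channel at a time, i.e. strides[0] = strides[3] = 1;

    strides[1] and strides[2] are the step sizes you choose along height and width (usually equal).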

  • padding: A `string` from: `"SAME", "VALID"`. "SAME" pads the borders with zeros; "VALID" applies no padding.
  • data_format: An optional `string` from: `"NHWC", "NCHW"`. Defaults to `"NHWC"`. Specifies the data format of the input and output data. With the default format "NHWC", the data is stored in the order of: [batch, height, width, channels].
  • name: A name for the operation (optional).
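The parameter layout above can be sketched as a small shape check. This is only an illustration in plain Python, not part of TensorFlow: the helper name `conv2d_args` and the example shapes are assumptions made here. It checks that the filter's in_channels matches the NHWC input and builds the 4-element strides list.

```python
def conv2d_args(input_shape, filter_shape, stride_h, stride_w):
    """Hypothetical helper: validate NHWC input / filter shapes and build strides."""
    batch, in_h, in_w, in_c = input_shape          # [batch, height, width, channels]
    f_h, f_w, f_in_c, out_c = filter_shape         # [f_height, f_width, in_channels, out_channels]
    if f_in_c != in_c:
        raise ValueError("filter in_channels %d != input channels %d" % (f_in_c, in_c))
    # Batch and channel strides stay at 1; only height and width move.
    return [1, stride_h, stride_w, 1]


# Example: a batch of 8 RGB images of 28x28, sixteen 5x5 kernels, stride 2.
strides = conv2d_args([8, 28, 28, 3], [5, 5, 3, 16], 2, 2)
print(strides)
```

The returned list is what would be passed as the strides argument when the data format is NHWC.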

Implementation details

input shape: [batch, in_height, in_width, in_channels]

filter shape: [filter_height, filter_width, in_channels, out_channels]

Computation steps:

1. Flatten the filter into a 2-D matrix with shape [filter_height*filter_width*in_channels, out_channels].

2. Extract patches from the input tensor to form a virtual tensor with shape [batch, out_height, out_width, filter_height*filter_width*in_channels].

3. Right-multiply each patch by the filter matrix from step 1, i.e. [batch, out_height, out_width, filter_height*filter_width*in_channels] x [filter_height*filter_width*in_channels, out_channels], so the output shape is [batch, out_height, out_width, out_channels].
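The three steps can be sketched in plain Python with nested lists. This is a minimal illustration of the flatten, extract and multiply idea for the unpadded (VALID) case only, not how TensorFlow implements it internally; the function name `conv2d_valid_im2col` and the tiny example data are assumptions made here.

```python
def conv2d_valid_im2col(inp, filt, stride=1):
    """inp: nested list [batch][h][w][c]; filt: nested list [f_h][f_w][c][out_c]."""
    batch, in_h, in_w = len(inp), len(inp[0]), len(inp[0][0])
    in_c = len(inp[0][0][0])
    f_h, f_w, out_c = len(filt), len(filt[0]), len(filt[0][0][0])
    # Without padding this equals ceil((in - f + 1) / stride).
    out_h = (in_h - f_h) // stride + 1
    out_w = (in_w - f_w) // stride + 1

    # Step 1: flatten the filter to a matrix of shape [f_h * f_w * in_c, out_c].
    filt_2d = [filt[i][j][c]
               for i in range(f_h) for j in range(f_w) for c in range(in_c)]

    out = []
    for b in range(batch):
        rows = []
        for y in range(out_h):
            cols = []
            for x in range(out_w):
                # Step 2: flatten one patch to length f_h * f_w * in_c,
                # in the same (row, column, channel) order as the filter.
                patch = [inp[b][y * stride + i][x * stride + j][c]
                         for i in range(f_h) for j in range(f_w) for c in range(in_c)]
                # Step 3: (1 x K) patch times (K x out_c) filter matrix.
                cols.append([sum(p * row[o] for p, row in zip(patch, filt_2d))
                             for o in range(out_c)])
            rows.append(cols)
        out.append(rows)
    return out


# 1 x 3 x 3 x 1 input holding 1..9, and a 2 x 2 x 1 x 1 filter of ones.
inp = [[[[float(r * 3 + c + 1)] for c in range(3)] for r in range(3)]]
filt = [[[[1.0]] for _ in range(2)] for _ in range(2)]
print(conv2d_valid_im2col(inp, filt))  # [[[[12.0], [16.0]], [[24.0], [28.0]]]]
```

Each output value is the sum of one 2x2 window, for example 1 + 2 + 4 + 5 = 12 for the top-left position.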

[Note: strides[0] = strides[3] = 1 is required.] In the vast majority of cases the horizontal and vertical strides are equal, i.e. strides = [1, stride, stride, 1].

Computing the output shape:

In Caffe:

out_height = floor((in_height + 2*pad - filter_height) / stride) + 1; floor rounds down

out_width = floor((in_width + 2*pad - filter_width) / stride) + 1
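As a quick check of the Caffe-style formula, here is a small sketch in plain Python. The function name `caffe_conv_output_size` is made up for this note; it simply applies the floor form given above, with floor applied to the whole division before adding 1.

```python
def caffe_conv_output_size(in_size, kernel, stride=1, pad=0):
    """Output length along one spatial axis: floor((in + 2*pad - kernel) / stride) + 1."""
    # Integer floor division implements the floor for non-negative operands.
    return (in_size + 2 * pad - kernel) // stride + 1


# A 12 x 15 input with a 2 x 3 kernel, stride 1, no padding.
print(caffe_conv_output_size(12, 2), caffe_conv_output_size(15, 3))  # 11 13
# A 227-wide input with an 11-wide kernel and stride 4.
print(caffe_conv_output_size(227, 11, stride=4))  # 55
```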

 

In TensorFlow:

"SAME" padding:

out_height = ceil(in_height / strides[1]); ceil rounds up

out_width = ceil(in_width / strides[2])

"VALID" padding:

out_height = ceil((in_height - filter_height + 1) / strides[1])

out_width = ceil((in_width - filter_width + 1) / strides[2])
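The two TensorFlow formulas can be tried out with a short plain-Python sketch. The function names are made up for this note. The second helper also estimates how many zeros "SAME" adds on each side, based on my understanding of TensorFlow's documented padding rule (any odd extra element goes to the bottom or right); treat that part as an assumption rather than a guarantee.

```python
import math


def tf_conv_output_size(in_size, kernel, stride, padding):
    """Output length along one spatial axis for 'SAME' or 'VALID' padding."""
    if padding == "SAME":
        return int(math.ceil(in_size / float(stride)))
    if padding == "VALID":
        return int(math.ceil((in_size - kernel + 1) / float(stride)))
    raise ValueError("padding must be 'SAME' or 'VALID'")


def same_pad_amounts(in_size, kernel, stride):
    """Assumed 'SAME' split: (zeros before, zeros after) along one axis."""
    out_size = tf_conv_output_size(in_size, kernel, stride, "SAME")
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    before = total // 2
    return before, total - before


# The 12 x 15 input and 2 x 3 filter used in the verification code, stride 1.
print(tf_conv_output_size(12, 2, 1, "VALID"), tf_conv_output_size(15, 3, 1, "VALID"))  # 11 13
print(tf_conv_output_size(12, 2, 1, "SAME"), tf_conv_output_size(15, 3, 1, "SAME"))    # 12 15
print(same_pad_amounts(12, 2, 1), same_pad_amounts(15, 3, 1))  # (0, 1) (1, 1)
```

With stride 1, "SAME" keeps the spatial size unchanged, which is where the name comes from; with larger strides the output shrinks to ceil(in / stride).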

Verification code

# -*- coding:utf-8 -*-

from __future__ import division
import tensorflow as tf
import numpy as np
import math
import pandas as pd

input_arr = np.zeros((12, 15), dtype=np.float32)
number = 0
for row_idx in range(input_arr.shape[0]):
    for col_idx in range(input_arr.shape[1]):
        input_arr[row_idx][col_idx] = number
        number +=1

number = 6
w_arr = np.zeros((2, 3), dtype=np.float32)
for row_idx in range(w_arr.shape[0]):
    for col_idx in range(w_arr.shape[1]):
        w_arr[row_idx][col_idx] = number
        number += 1

stride = [1, 1, 1, 1]

# Verify against the definition of convolution (strictly speaking cross-correlation, not convolution), for the VALID case
res_shape_h = int(math.ceil((input_arr.shape[0] - w_arr.shape[0] + 1) / stride[1]))
res_shape_w = int(math.ceil((input_arr.shape[1] - w_arr.shape[1] + 1) / stride[2]))
validation_res = np.zeros(shape=(res_shape_h, res_shape_w), dtype=np.float32)

for row_idx in range(validation_res.shape[0]):
    for col_idx in range(validation_res.shape[1]):
        patch = input_arr[row_idx : row_idx+w_arr.shape[0], col_idx : col_idx+w_arr.shape[1]]
        # * multiplies element-wise; summing the products gives the dot product of patch and kernel
        res = np.sum(patch * w_arr)
        validation_res[row_idx][col_idx] = res

print('result of convolution from its definition: validation_res')
print(validation_res)
pd.DataFrame(validation_res).to_csv('validation_res.csv', index = False, header=False)

# The same computation with TensorFlow (TensorFlow 1.x graph/session API)
input_arr = np.reshape(input_arr, [1, input_arr.shape[0], input_arr.shape[1], 1])
w_arr = np.reshape(w_arr, [w_arr.shape[0], w_arr.shape[1], 1, 1])

# Input Tensor, shape: [1, 12, 15, 1]
net_in = tf.constant(value=input_arr, dtype=tf.float32)

# filter, shape: [2, 3, 1, 1]
W = tf.constant(value=w_arr, dtype=tf.float32)

# TensorFlow convolution results
# VALID convolution result, shape: [1, 11, 13, 1]
result_conv_valid = tf.nn.conv2d(net_in, W, stride, 'VALID')
# SAME convolution result, shape: [1, 12, 15, 1]
result_conv_same = tf.nn.conv2d(net_in, W, stride, 'SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    valid_conv_res, same_conv_res = sess.run([result_conv_valid, result_conv_same])

print(valid_conv_res.shape)
valid_conv_res = np.reshape(valid_conv_res, [valid_conv_res.shape[1], valid_conv_res.shape[2]])
same_conv_res = np.reshape(same_conv_res, [same_conv_res.shape[1], same_conv_res.shape[2]])
print('TensorFlow conv res: valid_conv_res')
print(valid_conv_res)
pd.DataFrame(valid_conv_res).to_csv('conv_res.csv', index=False, header=False)
pd.DataFrame(same_conv_res).to_csv('same_res.csv', index=False, header=False)

Method definition: tf.nn.max_pool

tf.nn.max_pool(value, ksize, strides, padding, name=None)

Parameters:

There are four parameters (besides the optional name), much like those of convolution:

 

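First parameter, value: the input to pool. A pooling layer usually follows a convolution layer, so the input is typically a feature map, again with shape [batch, height, width, channels].

Second parameter, ksize: the size of the pooling window, a 4-element vector, usually [1, height, width, 1]; the batch and channels entries are set to 1 because we do not want to pool across those dimensions.

Third parameter, strides: as with convolution, the step of the window along each dimension, usually [1, stride, stride, 1].

Fourth parameter, padding: as with convolution, either 'VALID' or 'SAME'.

Returns a Tensor of the same type as the input, with shape still of the form [batch, height, width, channels].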
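To make the window and stride parameters concrete, here is a minimal plain-Python sketch of max pooling on a single 2-D feature map without padding (the 'VALID' case). The function name `max_pool_valid` is made up for this note; it covers only the height and width entries of ksize and strides, since the batch and channels entries are 1.

```python
def max_pool_valid(feature_map, k_h, k_w, stride_h, stride_w):
    """Max-pool one 2-D feature map (list of rows) with no padding."""
    in_h, in_w = len(feature_map), len(feature_map[0])
    out_h = (in_h - k_h) // stride_h + 1
    out_w = (in_w - k_w) // stride_w + 1
    # Each output cell is the maximum over one k_h x k_w window.
    return [[max(feature_map[y * stride_h + i][x * stride_w + j]
                 for i in range(k_h) for j in range(k_w))
             for x in range(out_w)]
            for y in range(out_h)]


# A 4 x 4 feature map holding 0..15, pooled with a 2 x 2 window and stride 2,
# which corresponds to ksize=[1, 2, 2, 1] and strides=[1, 2, 2, 1].
fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
print(max_pool_valid(fmap, 2, 2, 2, 2))  # [[5, 7], [13, 15]]
```

The 4 x 4 map shrinks to 2 x 2, while in the full 4-D case the batch and channels dimensions would stay unchanged.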

 
