tensorflow--CNN

Table of Contents

  • 1 Convolution Functions
  • 2 Pooling Functions
  • 3 Classification Functions
  • 4 A Simple Example
  • 5 Saving and Restoring Models

API documentation: https://www.tensorflow.org/api_docs/

1 Convolution Functions

1 tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)

Convolves the 4-D tensor input with the 4-D kernel filter, mixing the input channels together. filter has shape [filter_height, filter_width, in_channels, out_channels], where in_channels must equal the channel count of input; at each position the in_channels are combined into a single value, and this is repeated out_channels times, so the result has out_channels channels.

import numpy as np
import tensorflow as tf

input_data = tf.Variable(np.random.rand(10,9,9,3), dtype=np.float32)
filter_data = tf.Variable(np.random.rand(2,2,3,2), dtype=np.float32)
y = tf.nn.conv2d(input_data, filter_data, strides =[1,1,1,1], padding = "SAME")
print(y.shape)
# (10, 9, 9, 2)

2 tf.nn.depthwise_conv2d(input, filter, strides, padding, name=None)

Unlike conv2d, the channels are not mixed together. The input tensor has shape [batch, in_height, in_width, in_channels] and the filter has shape [filter_height, filter_width, in_channels, channel_multiplier]. The function applies channel_multiplier separate kernels to each input channel independently and stacks the results, so the total number of output channels is channel_multiplier * in_channels.

input_data = tf.Variable(np.random.rand(10,9,9,3), dtype=np.float32)
filter_data = tf.Variable(np.random.rand(2,2,3,2), dtype=np.float32)
y = tf.nn.depthwise_conv2d(input_data, filter_data, strides =[1,1,1,1], padding = "SAME")
print(y.shape)
# (10, 9, 9, 6)

3 tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter, strides, padding, name=None)

Convolution with separable kernels. First depthwise_filter is applied, with the same effect as depthwise_conv2d, and then the 1x1 kernel pointwise_filter is applied. pointwise_filter has shape [1, 1, channel_multiplier * in_channels, out_channels].

input_data = tf.Variable(np.random.rand(10,9,9,3), dtype=np.float32)
depthwise_filter = tf.Variable(np.random.rand(2,2,3,2), dtype=np.float32)
pointwise_filter = tf.Variable(np.random.rand(1,1,6,12), dtype=np.float32)
y = tf.nn.separable_conv2d(input_data, depthwise_filter, pointwise_filter, strides =[1,1,1,1], padding = "SAME")
print(y.shape)
# (10, 9, 9, 12)

4 More convolution functions

  • tf.nn.atrous_conv2d: atrous (dilated) convolution
  • tf.nn.conv2d_transpose: "deconvolution", which is really the transpose of convolution (see the sketch after this list)
  • tf.nn.conv1d: the 1-D version
  • tf.nn.conv3d: the 3-D version
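
A minimal sketch of tf.nn.conv2d_transpose, assuming tf and np are imported as above; the shapes are illustrative. Note that its filter layout is [height, width, output_channels, in_channels], the reverse of conv2d, and that output_shape must be given explicitly:

input_data = tf.Variable(np.random.rand(10, 9, 9, 2), dtype=np.float32)
# filter layout: [height, width, output_channels, in_channels]
filter_data = tf.Variable(np.random.rand(2, 2, 3, 2), dtype=np.float32)
y = tf.nn.conv2d_transpose(input_data, filter_data,
                           output_shape=[10, 18, 18, 3],
                           strides=[1, 2, 2, 1], padding="SAME")
print(y.shape)
# (10, 18, 18, 3)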

2 Pooling Functions

1 tf.nn.avg_pool(value, ksize, strides, padding, name=None)

ksize gives the window size in each dimension; avg_pool replaces each window with the average of the values inside it, and strides gives the step in each dimension.
Output size in each dimension:

  • SAME mode:
    The edges are zero-padded, so with a stride of 1 the output keeps the original size. The formula is ceil(value.shape / stride).
  • VALID mode:
    No zero padding at the edges; the formula is ceil((value.shape - ksize + 1) / stride).

input_data = tf.Variable(np.random.rand(10,5,5,3), dtype=np.float32)
filter_data = tf.Variable(np.random.rand(2,2,3,5), dtype=np.float32)
y = tf.nn.conv2d(input_data, filter_data, strides =[1,1,1,1], padding = "SAME")
print(y.shape)
# (10, 5, 5, 5)
output = tf.nn.avg_pool(value=y, ksize=[1,2,2,1],strides=[1,2,2,1],padding="VALID")
print(output.shape)
# (10, 2, 2, 5)
output = tf.nn.avg_pool(value=y, ksize=[1,2,2,1],strides=[1,2,2,1],padding="SAME")
print(output.shape)
# (10, 3, 3, 5)

2 tf.nn.max_pool(value, ksize, strides, padding, name=None)

Same as avg_pool, except that each window is represented by its maximum value instead of its average, as in the sketch below.
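
A minimal sketch mirroring the avg_pool example above, assuming tf and np are imported as before:

input_data = tf.Variable(np.random.rand(10,5,5,3), dtype=np.float32)
output = tf.nn.max_pool(value=input_data, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")
print(output.shape)
# (10, 3, 3, 3)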

3 Classification Functions

1 tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)

Inputs: logits has shape [batch_size, num_classes] and targets has the same shape. The last layer of the network should not apply a sigmoid itself; the function applies it internally.
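
A minimal sketch, assuming the TF 1.x keyword form of the call and the imports from above; the values are illustrative:

# raw scores (no sigmoid applied) and independent binary targets
logits = tf.constant([[1.0, -1.0, 3.0]])
targets = tf.constant([[1.0, 0.0, 1.0]])
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=targets, logits=logits)
# the loss has the same shape as logits: one value per (sample, class) pair
print(loss.shape)
# (1, 3)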

2 tf.nn.softmax(logits, name=None)

softmax = exp(logits) / reduce_sum(exp(logits), dim)
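
A short illustration of the formula, assuming tf is imported as above; each row of the result sums to 1:

logits = tf.constant([[1.0, 2.0, 3.0]])
probs = tf.nn.softmax(logits)
with tf.Session() as sess:
    print(sess.run(probs))
# approximately [[0.09 0.245 0.665]]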

3 tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

Inputs: logits and labels are both [batch_size, num_classes]; the returned loss holds the cross entropy of each sample in the batch.
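
A minimal sketch with one-hot labels, assuming tf is imported as above; the values are illustrative:

logits = tf.constant([[2.0, 1.0, 0.1], [0.1, 2.0, 1.0]])
labels = tf.constant([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels)
# one cross-entropy value per sample in the batch
print(loss.shape)
# (2,)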

4 A Simple Example

On the MNIST dataset the CNN is compared with a simple regression model (softmax regression). The regression example:

# MNIST can be downloaded from: http://yann.lecun.com/exdb/mnist/
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# Path where the data is stored
dir_s = "/Users/xiayongtao/Downloads/tensorflow_note/CNN/mnist"
mnist = input_data.read_data_sets(dir_s, one_hot = True)

# Build the regression model
# Define the nodes in the model
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.InteractiveSession()
# With tf.InteractiveSession() we can create the session first
# and define operations afterwards.
# With tf.Session() all operations must be defined before the session is created.
tf.global_variables_initializer().run()

# train
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict = {x:batch_xs, y_:batch_ys})

# Evaluation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
# argmax: returns the index of the largest value along the given axis
# axis 0: column-wise, the largest value at each position across rows
# axis 1: row-wise, the largest value within each row
# the data here is (batch_size, 10), so axis 1 is used
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict = {x:mnist.test.images, y_:mnist.test.labels}))

# Accuracy: 0.9202

The convolution (CNN) example:

# MNIST can be downloaded from: http://yann.lecun.com/exdb/mnist/
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import tensorflow as tf

# Path where the data is stored
dir_s = "/Users/xiayongtao/Downloads/tensorflow_note/CNN/mnist"
mnist = input_data.read_data_sets(dir_s, one_hot = True)

train_X, train_Y, test_X, test_Y = mnist.train.images, mnist.train.labels, mnist.test.images, mnist.test.labels

print(train_X.shape)
# (55000, 784)
# For convolution, reshape to (batch_size, height, width, channels); MNIST images are grayscale so channels is 1 (RGB images would be 3)
train_X = train_X.reshape(-1, 28, 28, 1)
test_X = test_X.reshape(-1, 28, 28, 1)

# Weight initialization function:
def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

# Define the kernels:
# A 3x3 kernel; the 1 matches the input channels, and it is repeated 32 times, so the output has 32 channels
w1 = init_weights([3, 3, 1, 32]) 
w2 = init_weights([3, 3, 32, 64])
w3 = init_weights([3, 3, 64, 128])
# Define the fully connected layer:
# The flattened size must be computed: starting from 28, SAME padding gives ceil(value / stride) per pooling;
# with stride 2 applied 3 times this is 28 -> 14 -> 7 -> 4, which is then fed into a fully connected layer
w4 = init_weights([128*4*4, 625])
# Connect to the classification layer
w5 = init_weights([625, 10])

# Convolution layer helper
def conv_layer(in_data, filter_data, pooling_size, pooling_stride, p_keep_conv):
    conv_in = tf.nn.conv2d(in_data, filter_data, strides=[1,1,1,1], padding="SAME")
    activate_conv = tf.nn.relu(conv_in)
    # deep networks usually use relu; sigmoid tends to suffer from vanishing gradients
    pooling = tf.nn.max_pool(activate_conv, ksize=pooling_size, strides=pooling_stride, padding="SAME")
    out = tf.nn.dropout(pooling, p_keep_conv)
    return out

# Fully connected layer helper
def fc_layer(in_data, w, p_keep_fc):
    l = tf.matmul(in_data, w)
    l1 = tf.nn.relu(l)
    l2 = tf.nn.dropout(l1, p_keep_fc)
    return l2

X = tf.placeholder(tf.float32, [None, 28, 28, 1])
Y = tf.placeholder(tf.float32, [None, 10])

p_keep_conv = tf.placeholder("float")
p_keep_fc = tf.placeholder("float")
pooling_size = [1, 2, 2, 1]
pooling_stride = [1, 2, 2, 1]

# Define the model:
l1 = conv_layer(X, w1, pooling_size, pooling_stride, p_keep_conv)
l2 = conv_layer(l1, w2, pooling_size, pooling_stride, p_keep_conv)
l3 = conv_layer(l2, w3, pooling_size, pooling_stride, p_keep_conv)
l3 = tf.reshape(l3, [-1, 2048])
l4 = fc_layer(l3, w4, p_keep_fc)

output = tf.matmul(l4, w5)

# Define the loss function
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=output,labels=Y))
# Optimization method
train_op = tf.train.RMSPropOptimizer(0.001, 0.9).minimize(cost)
predict_op = tf.argmax(output, 1)

# Training
batch_size = 128
test_size = 256

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for i in range(10):
        training_batch = zip(range(0, len(train_X), batch_size),
                             range(batch_size, len(train_X)+1, batch_size))
        for start, end in training_batch:
            sess.run(train_op, feed_dict={X: train_X[start:end], Y: train_Y[start:end],
                                         p_keep_conv:0.8, p_keep_fc:0.5})
        test_indices = np.arange(len(test_X))
        np.random.shuffle(test_indices)
        test_indices = test_indices[0:test_size]

        print(i, np.mean(np.argmax(test_Y[test_indices], axis=1) ==
                         sess.run(predict_op, feed_dict={X: test_X[test_indices],
                                                         p_keep_conv: 1.0,
                                                         p_keep_fc: 1.0})))

# Accuracy reaches about 99% after 10 epochs

5 Saving and Restoring Models

Saving: create a tf.train.Saver() and call saver.save()
Restoring: saver.restore()
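
A minimal sketch of saving and restoring, assuming tf is imported as above; the variable and checkpoint path are illustrative:

# a toy variable so the graph has something to save
w = tf.Variable(tf.zeros([784, 10]), name="w")
saver = tf.train.Saver()

# saving: inside a session, after (or during) training
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "./mnist_cnn.ckpt")

# restoring: rebuild the same graph first, then restore the variable values
with tf.Session() as sess:
    saver.restore(sess, "./mnist_cnn.ckpt")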
