TensorFlow Study Notes (6): Convolutional Neural Networks (LeNet-5)

I. Basic Concepts

https://blog.csdn.net/Raoodududu/article/details/82287099

II. TensorFlow Usage of Common Layer Structures

1. Convolutional layer

'''Create the filter's weight and bias variables with tf.get_variable.'''
# Weight variable: the first two dimensions are the filter size, the third is the depth of
# the current layer, and the fourth is the number of filters (output depth).
filter_weight = tf.get_variable('weights', [5, 5, 3, 16], initializer=tf.truncated_normal_initializer(stddev=0.1))
# Bias variable: one bias per filter (output depth)
biases = tf.get_variable('biases', [16], initializer=tf.constant_initializer(0.1))

'''Forward pass of the convolutional layer'''
# Argument 1: the input node tensor [batch, _, _, _]; argument 2: the weights;
# argument 3: the stride in each dimension [1, _, _, 1], i.e. the convolution strides only over height and width;
# argument 4: 'SAME' means all-zero padding, 'VALID' means no padding.
conv = tf.nn.conv2d(input, filter_weight, strides=[1, 1, 1, 1], padding='SAME')
bias = tf.nn.bias_add(conv, biases)
activated_conv = tf.nn.relu(bias)

2. Pooling layer

'''Forward pass of a max-pooling layer'''
# Argument 1: the node tensor of the current layer; argument 2: the filter size [1, _, _, 1],
# which cannot span examples or the depth of the node tensor;
# arguments 3 and 4: same as above.
pool = tf.nn.max_pool(activated_conv, ksize=[1, 5, 5, 1], strides=[1, 2, 2, 1], padding='SAME')

3. The difference between them

    A convolutional filter spans the entire depth of its input, so the output depth equals the number of filters; a pooling filter only acts on one depth slice at a time, so a pooling layer's output depth equals its input depth, as the sketch below shows.
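
A minimal sketch illustrating this point (the 32×32×3 input and the 16 filters are arbitrary values chosen only for this example):

import tensorflow as tf

x = tf.placeholder(tf.float32, [1, 32, 32, 3])                      # input depth 3
w = tf.get_variable('w', [5, 5, 3, 16], initializer=tf.truncated_normal_initializer(stddev=0.1))
conv = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
pool = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
print(conv.shape)   # (1, 32, 32, 16) -- output depth = number of filters
print(pool.shape)   # (1, 16, 16, 3)  -- output depth = input depth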

III. Classic Convolutional Network Models

1. The LeNet-5 model

 

Advantage: it greatly improved recognition accuracy (on handwritten-digit tasks such as MNIST).

Disadvantage: it cannot cope well with larger image datasets such as ImageNet.

 

C1 (convolutional layer): input 32×32×1; filter size 5×5, depth 6; no zero padding, stride 1.

1. Feature map size: each feature map is 28×28, which prevents connections from the input from falling outside the boundary (32 - 5 + 1 = 28).

For the details of computing the feature-map side length, see: https://blog.csdn.net/Raoodududu/article/details/82287099

2. Number of parameters: C1 has 156 trainable parameters (each filter has 5×5 = 25 weight parameters and one bias; with 6 filters in total, (5×5×1 + 1)×6 = 156).

3. Number of connections / FLOPs: (28×28)×(5×5+1)×6 = 122,304. The left factor is the number of neurons in each C1 feature map; the right factor is the number of input neurons the filter slides over, plus the bias. Their product is the number of connections. (Each connection corresponds to one computation: by wa + b, every weight takes part in one operation, and the single bias b is counted as well.) These numbers are checked in the short snippet below.
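
A quick check of the C1 arithmetic in plain Python (a sketch of the formulas above, not part of the original code):

in_size, filter_size, in_depth, out_depth = 32, 5, 1, 6
feature_map = in_size - filter_size + 1                             # 32 - 5 + 1 = 28 (no padding)
params = (filter_size * filter_size * in_depth + 1) * out_depth     # (5*5*1 + 1) * 6 = 156
connections = feature_map * feature_map * (filter_size * filter_size + 1) * out_depth
print(feature_map, params, connections)                             # 28 156 122304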

----------------------------------------

S2 (subsampling layer): input 28×28×6; filter size 2×2; stride 2.

1. Feature map size: 14×14.

2. Number of parameters: S2 has 12 trainable parameters (6×(1+1) = 12). Each S2 filter adds up the 4 inputs in its 2×2 neighborhood, multiplies the sum by one trainable weight w, and adds one trainable bias b, so each filter corresponds to two parameters. (For a subsampling layer, the number of trainable parameters per feature map depends on the pooling scheme; with the scheme used here it is 2, while some pooling schemes need no parameters at all.)

3. Number of connections / FLOPs: 14×14×(2×2+1)×6 = 5,880. The left factor is the number of neurons in each S2 feature map; the right factor is the number of C1 neurons the filter slides over, plus the bias. Their product is the number of connections.

----------------------------------------

C3 (convolutional layer): input 14×14×6; filter size 5×5, depth 16; no zero padding, stride 1.

1. Feature map size: 10×10 (14 - 5 + 1 = 10).

2. Number of parameters: (5×5×6 + 1)×16 = 2,416 if every C3 map connects to all six S2 maps, as in the TensorFlow implementation below. (In the original paper C3 uses a partial connection table and has only 1,516 parameters; see the snippet after this list.)

3. Number of connections / FLOPs: 10×10×(5×5×6 + 1)×16 = 241,600 under full connectivity (151,600 with the paper's connection table). The left factor is the size of a C3 feature map; the right factor is the number of S2 neurons the filter slides over, plus the bias.
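
For reference, the C3 parameter count under the paper's partial connection table can be checked like this (a sketch; the formulas above assume full connectivity):

# LeNet-5 C3 connection table: 6 maps see 3 S2 maps, 9 maps see 4, and 1 map sees all 6
sparse = 6 * (3 * 5 * 5 + 1) + 9 * (4 * 5 * 5 + 1) + 1 * (6 * 5 * 5 + 1)
full = (5 * 5 * 6 + 1) * 16
print(sparse, full)        # 1516 2416
print(sparse * 10 * 10)    # 151600 connections with the sparse table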

------------------------------------------

S4 (subsampling layer): input 10×10×16; filter size 2×2; all-zero padding; stride 2.

1. Feature map size: 5×5.

2. Number of parameters: S4 has 32 trainable parameters (one weight w and one bias b per feature map: 16×(1+1) = 32).

3. Number of connections / FLOPs: 5×5×(2×2+1)×16 = 2,000. The left factor is the number of neurons in each S4 feature map; the right factor is the number of C3 neurons the filter slides over, plus the bias. Their product is the number of connections.

--------------------------------------------

C5 (convolutional layer, or the first fully connected layer): input 5×5×16; filter size 5×5, depth 120; stride 1.

1. Feature map size: 1×1 (5 - 5 + 1 = 1), which makes C5 fully connected to S4. C5 is still labeled a convolutional layer rather than a fully connected layer because, if the input to LeNet-5 were made larger while everything else stayed the same, the feature maps would become larger than 1×1.

2. Number of parameters: (16×5×5 + 1)×120 = 48,120. There are 120×16 filter kernels, hence 120×16×5×5 weights; the 16 kernels in each group share one bias, giving 120 biases.

3. Number of connections / FLOPs: 1×1×48,120 = 48,120. The left factor is the C5 feature-map size (by now a single neuron, 1×1); the right factor is the number of neurons the filter slides over. Their product is the number of connections, which here also equals the number of FLOPs.

--------------------------------------------

F6 (fully connected layer): 120 input nodes, 84 output nodes.

1. Feature map size: 84 units (this particular number comes from the design of the output layer), fully connected to C5.

2. Number of parameters: 84×(120×(1×1) + 1) = 10,164 trainable parameters. As in a classic neural network, F6 computes the dot product of the input vector (length 120) and a weight vector, adds a bias (+1), and passes the result through a sigmoid to produce the state of unit i; see the sketch after this list.

3. Number of connections / FLOPs: 1×1×10,164 = 10,164. The left factor is the F6 feature-map size; the right factor is the number of C5 neurons the weights slide over.
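
In code, the F6 computation described above amounts to the following (a sketch with random values, only to illustrate the shapes and the parameter count):

import numpy as np
a5 = np.random.randn(120)                            # C5 output, 120 values
W6 = np.random.randn(84, 120)                        # 84 x 120 weight matrix
b6 = np.random.randn(84)                             # 84 biases
state = 1.0 / (1.0 + np.exp(-(W6.dot(a5) + b6)))     # sigmoid(W*a + b), shape (84,)
print(state.shape, W6.size + b6.size)                # (84,) 10164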

--------------------------------------------

Output layer: 84 input nodes, 10 output nodes.

Number of parameters: (84 + 1)×10 = 850 when implemented as a fully connected layer (the original paper uses RBF output units instead). All layer counts are summed up in the snippet below.
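
Putting the per-layer formulas together (a sketch that assumes full S2-to-C3 connectivity, as in the TensorFlow implementation below):

# Trainable parameters per layer under the full-connectivity assumption
c1 = (5 * 5 * 1 + 1) * 6          # 156
s2 = 6 * (1 + 1)                  # 12
c3 = (5 * 5 * 6 + 1) * 16         # 2416 (1516 with the paper's connection table)
s4 = 16 * (1 + 1)                 # 32
c5 = (5 * 5 * 16 + 1) * 120       # 48120
f6 = (120 + 1) * 84               # 10164
out = (84 + 1) * 10               # 850
print(c1 + s2 + c3 + s4 + c5 + f6 + out)   # 61750 trainable parameters in total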

--------------------------------------------

2. LeNet-5 code

(1) inference.py

import tensorflow as tf

# 1. Configuration parameters
INPUT_NODE = 784
OUTPUT_NODE = 10

IMAGE_SIZE = 28
NUM_CHANNELS = 1
NUM_LABELS = 10
# Size and depth of the first convolutional layer
CONV1_DEEP = 32
CONV1_SIZE = 5
# Size and depth of the second convolutional layer
CONV2_DEEP = 64
CONV2_SIZE = 5
# Number of nodes in the fully connected layer
FC_SIZE = 512


# 2. Define the forward-pass process
# A new parameter, train, is added to distinguish the training process from the test process.
# Dropout is used in this program; dropout further improves the model's robustness and prevents overfitting.
# Dropout is applied only during training.
def inference(input_tensor, train, regularizer):
    # The first convolutional layer takes the raw 28*28*1 MNIST image as input; with all-zero (SAME) padding the output is 28*28*32
    with tf.variable_scope('layer1-conv1'):
        conv1_weights = tf.get_variable("weight", [CONV1_SIZE, CONV1_SIZE, NUM_CHANNELS, CONV1_DEEP], initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv1_biases = tf.get_variable("bias", [CONV1_DEEP], initializer=tf.constant_initializer(0.0))
        conv1 = tf.nn.conv2d(input_tensor, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))

    # Forward pass of the second layer, a pooling layer. Its input is 28*28*32 and its output is 14*14*32
    with tf.name_scope('layer2-pool1'):
        pool1 = tf.nn.max_pool(relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    # Third layer: convolutional layer
    with tf.variable_scope('layer3-conv2'):
        conv2_weights = tf.get_variable("weight", [CONV2_SIZE, CONV2_SIZE, CONV1_DEEP, CONV2_DEEP], initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable("bias", [CONV2_DEEP], initializer=tf.constant_initializer(0.0))
        conv2 = tf.nn.conv2d(pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_biases))

    # Fourth layer: pooling layer
    with tf.name_scope('layer4-pool2'):
        pool2 = tf.nn.max_pool(relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

        # Reshape the output of the fourth pooling layer into the input format of the fifth, fully connected, layer.
        pool_shape = pool2.get_shape().as_list()                # flatten the matrix into a vector
        nodes = pool_shape[1] * pool_shape[2] * pool_shape[3]   # vector length = height x width x depth
        reshaped = tf.reshape(pool2, [pool_shape[0], nodes])    # turn the fourth-layer output into a batch of vectors; pool_shape[0] is the number of examples in a batch.

    # Fifth layer: declare the fully connected layer's variables and implement its forward pass
    with tf.variable_scope('layer5-fc1'):
        fc1_weights = tf.get_variable("weight", [nodes, FC_SIZE], initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer != None:                     # only the fully connected layers' weights need regularization
            tf.add_to_collection('losses', regularizer(fc1_weights))
        fc1_biases = tf.get_variable("bias", [FC_SIZE], initializer=tf.constant_initializer(0.1))
        fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_weights) + fc1_biases)
        if train: fc1 = tf.nn.dropout(fc1, 0.5)    # dropout to avoid overfitting: randomly set some node outputs to 0 during training

    # Sixth layer: declare the fully connected output layer's variables and implement its forward pass
    with tf.variable_scope('layer6-fc2'):
        fc2_weights = tf.get_variable("weight", [FC_SIZE, NUM_LABELS], initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer != None:
            tf.add_to_collection('losses', regularizer(fc2_weights))
        fc2_biases = tf.get_variable("bias", [NUM_LABELS], initializer=tf.constant_initializer(0.1))
        logit = tf.matmul(fc1, fc2_weights) + fc2_biases
        return logit
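
A minimal sketch of how this inference graph could be exercised for a single forward pass (the fixed batch size of 1 and the all-zero dummy image are illustrative assumptions; the reshape above requires a concrete batch dimension):

import numpy as np
import tensorflow as tf
import LeNet_5inference as inference

x = tf.placeholder(tf.float32, [1, inference.IMAGE_SIZE, inference.IMAGE_SIZE, inference.NUM_CHANNELS], name='x-input')
logits = inference.inference(x, False, None)             # no dropout, no regularization

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    dummy = np.zeros((1, 28, 28, 1), dtype=np.float32)   # one blank image, just to check shapes
    print(sess.run(logits, feed_dict={x: dummy}).shape)  # (1, 10)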

(2) train.py


'''mnist_train.'''
import os
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import LeNet_5inference as inference
import numpy as np

BATCH_SIZE = 100
LEARNING_RATE_BASE = 0.05
LEARNING_RATE_DECAY = 0.99
REGULARIZATION_RATE = 0.0001
TRAINING_STEPS = 30000
MOVING_AVERAGE_DECAY = 0.99
# Path and file name for saving the model
MODEL_SAVE_PATH = "D:/python/pycharm/venv/LeNet"
MODEL_NAME = "model.ckpt"

def train(mnist):
    x = tf.placeholder(tf.float32, [BATCH_SIZE, inference.IMAGE_SIZE, inference.IMAGE_SIZE, inference.NUM_CHANNELS], name='x-input')
    y_ = tf.placeholder(tf.float32, [None, inference.OUTPUT_NODE], name='y-input')
    regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
    y = inference.inference(x, True, regularizer)                                     # forward pass (training mode)
    global_step = tf.Variable(0, trainable=False)
    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.argmax(y_, 1), logits=y)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
    learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step, mnist.train.num_examples / BATCH_SIZE, LEARNING_RATE_DECAY)
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
    with tf.control_dependencies([train_step, variables_averages_op]):
        train_op = tf.no_op(name='train')
    # Initialize the TensorFlow Saver (persistence class)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        tf.global_variables_initializer().run()
        for i in range(5000):
            xs, ys = mnist.train.next_batch(BATCH_SIZE)
            reshaped_xs = np.reshape(xs, (BATCH_SIZE, inference.IMAGE_SIZE, inference.IMAGE_SIZE, inference.NUM_CHANNELS))
            _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: reshaped_xs, y_: ys})
            if i % 1000 == 0:
                 print("After %d training step,loss on training""batch is %g" % (step, loss_value))  #輸出當前損失函數的大小
                 #保存當前模型
                 saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)

def main(argv=None):
    mnist = input_data.read_data_sets("D:/python/pycharm/venv/tmp/data", one_hot=True)
    train(mnist)

if __name__ == '__main__':
    tf.app.run()

(3) eval.py

import time
import math
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import LeNet_5inference as LeNet5_infernece
import LeNet_5train as LeNet5_train

def evaluate(mnist):
    with tf.Graph().as_default() as g:
        # Define the input placeholder as a 4-D tensor (here the whole test set is evaluated as a single batch)
        x = tf.placeholder(tf.float32, [
            mnist.test.num_examples,
            #LeNet5_train.BATCH_SIZE,
            LeNet5_infernece.IMAGE_SIZE,
            LeNet5_infernece.IMAGE_SIZE,
            LeNet5_infernece.NUM_CHANNELS],
                           name='x-input')
        y_ = tf.placeholder(tf.float32, [None, LeNet5_infernece.OUTPUT_NODE], name='y-input')
        validate_feed = {x: mnist.test.images, y_: mnist.test.labels}
        global_step = tf.Variable(0, trainable=False)

        regularizer = tf.contrib.layers.l2_regularizer(LeNet5_train.REGULARIZATION_RATE)
        y = LeNet5_infernece.inference(x, False, regularizer)
        correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

        variable_averages = tf.train.ExponentialMovingAverage(LeNet5_train.MOVING_AVERAGE_DECAY)
        variables_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variables_to_restore)

        # Evaluate the whole test set as one batch, so only one iteration is needed
        # (use math.ceil(mnist.test.num_examples / LeNet5_train.BATCH_SIZE) to evaluate batch by batch)
        n = 1
        for i in range(n):
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(LeNet5_train.MODEL_SAVE_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    saver.restore(sess, ckpt.model_checkpoint_path)
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
                    xs, ys = mnist.test.next_batch(mnist.test.num_examples)
                    #xs, ys = mnist.test.next_batch(LeNet5_train.BATCH_SIZE)
                    reshaped_xs = np.reshape(xs, (
                        mnist.test.num_examples,
                        #LeNet5_train.BATCH_SIZE,
                        LeNet5_infernece.IMAGE_SIZE,
                        LeNet5_infernece.IMAGE_SIZE,
                        LeNet5_infernece.NUM_CHANNELS))
                    accuracy_score = sess.run(accuracy, feed_dict={x:reshaped_xs, y_:ys})
                    print("After %s training step(s), test accuracy = %g" % (global_step, accuracy_score))
                else:
                    print('No checkpoint file found')
                    return

# Main program
def main(argv=None):
    mnist = input_data.read_data_sets("D:/python/pycharm/venv/tmp/data", one_hot=True)
    evaluate(mnist)

if __name__ == '__main__':
    main()

3. Visualization

(1) Image display and feature maps

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt  # plt is used to display images
import matplotlib.image as mpimg  # mpimg is used to read images
import LeNet_5inference
import LeNet_5train
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("D:/python/pycharm/venv/tmp/data", one_hot=True)

img1 = mnist.train.images[1]            # 1. Read the image at index 1: shape (784,)
label1 = mnist.train.labels[1]          # 2. Read its label: [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
print(label1)
print("img_data shape = ", img1.shape)
img1.shape = [28, 28]                   # 3. Reshape the image into a 28x28 matrix
print(img1.shape)

'''Display the image'''
plt.imshow(img1)                # heat map
plt.show()
plt.imshow(img1, cmap='gray')   # grayscale
plt.show()

'''Display several images in one figure'''
plt.subplot(4,8,1)              # position 1 in a 4x8 grid
plt.imshow(img1, cmap='gray')
plt.axis('off')                 # turn the axes off
plt.subplot(4,8,9)              # position 9 in a 4x8 grid
plt.imshow(img1, cmap='gray')
plt.axis('off')
plt.show()

'''Feed one image through the first convolutional layer of LeNet_5inference.
(This assumes a live session in which the placeholder x and the relu1 tensor from
inference.py are accessible, e.g. by returning them from inference().)'''
# First reshape the image into the 4-D input shape the network expects: (1, 28, 28, 1)
X_img = mnist.train.images[2].reshape([-1, 28, 28, 1])
result = sess.run(relu1, feed_dict={x: X_img})       # shape [1, 28, 28, 32]

for k in range(32):
    show_img = result[0, :, :, k]       # the k-th feature map of the single image
    plt.subplot(4, 8, k + 1)            # 4x8 grid, one cell per feature map
    plt.imshow(show_img, cmap='gray')
    plt.axis('off')
plt.show()

(2) Invocation (inside the training session)

    # Loss curve (assumes loss_value has been collected into a list over the training steps)
    fig1, ax1 = plt.subplots(figsize=(10, 7))
    plt.plot(loss_value)
    ax1.set_xlabel('step')
    ax1.set_ylabel('loss_value')
    plt.title('Cross-Entropy Loss')
    plt.grid()
    plt.show()
    '''Accuracy curve
    fig7, ax7 = plt.subplots(figsize=(10, 7))
    plt.plot(Accuracy)
    ax7.set_xlabel('Epochs')
    ax7.set_ylabel('Accuracy Rate')
    plt.title('Train Accuracy Rate')
    plt.grid()
    plt.show()
    '''
    # ---------------------------- Per-layer feature-map visualization ----------------------------
    # (The shape comments below follow the classic LeNet-5 depths of 6 and 16 feature maps; with the
    #  inference.py above the channel dimensions are 32 and 64 and only the first few maps are plotted.
    #  relu1, pool1, relu2 and pool2 are assumed to be exposed by the inference module.)
    # input image
    fig2, ax2 = plt.subplots(figsize=(2, 2))
    ax2.imshow(np.reshape(mnist.train.images[11], (28, 28)))
    plt.show()

    # Feature maps output by the first convolutional layer
    input_image = mnist.train.images[11:12]
    conv1_6 = sess.run(inference.relu1, feed_dict={x: input_image})  # [1, 28, 28 ,6]
    conv1_transpose = sess.run(tf.transpose(conv1_6, [3, 0, 1, 2]))
    fig3, ax3 = plt.subplots(nrows=1, ncols=6, figsize=(6, 1))
    for i in range(6):
        ax3[i].imshow(conv1_transpose[i][0])  # slice out the i-th feature map of the single image

    plt.title('Conv1 6x28x28')
    plt.show()

    # Feature maps after the first pooling layer
    pool1_6 = sess.run(inference.pool1, feed_dict={x: input_image})  # [1, 14, 14, 6]
    pool1_transpose = sess.run(tf.transpose(pool1_6, [3, 0, 1, 2]))
    fig4, ax4 = plt.subplots(nrows=1, ncols=6, figsize=(6, 1))
    for i in range(6):
        ax4[i].imshow(pool1_transpose[i][0])

    plt.title('Pool1 6x14x14')
    plt.show()

    # Feature maps output by the second convolutional layer
    conv2_16 = sess.run(inference.relu2, feed_dict={x: input_image})  # [1, 14, 14, 16]
    conv2_transpose = sess.run(tf.transpose(conv2_16, [3, 0, 1, 2]))
    fig5, ax5 = plt.subplots(nrows=1, ncols=16, figsize=(16, 1))
    for i in range(16):
        ax5[i].imshow(conv2_transpose[i][0])
    plt.title('Conv2 16x14x14')
    plt.show()

    # Feature maps after the second pooling layer
    pool2_16 = sess.run(inference.pool2, feed_dict={x: input_image})  # [1, 7, 7, 16]
    pool2_transpose = sess.run(tf.transpose(pool2_16, [3, 0, 1, 2]))
    fig6, ax6 = plt.subplots(nrows=1, ncols=16, figsize=(16, 1))
    plt.title('Pool2 16x7x7')
    for i in range(16):
        ax6[i].imshow(pool2_transpose[i][0])

    plt.show()

 
