I. Core concepts
https://blog.csdn.net/Raoodududu/article/details/82287099
II. Common structures in TensorFlow
1. Convolutional layer
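'''Create the filter weight and bias variables with tf.get_variable.'''
# Weight variable: the first two dimensions are the filter size, the third is the depth of the current layer, and the fourth is the depth of the filter.
filter_weight = tf.get_variable('weights', [5, 5, 3, 16], initializer=tf.truncated_normal_initializer(stddev=0.1))
# Bias variable: one bias per unit of filter depth.
biases = tf.get_variable('biases', [16], initializer=tf.constant_initializer(0.1))
'''Forward propagation of the convolutional layer'''
# Argument 1: the input node matrix [batch, _, _, _]; argument 2: the weights.
# Argument 3: the stride along each dimension, [1, _, _, 1]; only the height and width strides can be set.
# Argument 4: 'SAME' pads with zeros; 'VALID' adds no padding.
conv = tf.nn.conv2d(input, filter_weight, strides=[1, 1, 1, 1], padding='SAME')
bias = tf.nn.bias_add(conv, biases)
actived_conv = tf.nn.relu(bias)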
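The effect of the padding argument on the output size can be sketched without TensorFlow. The helper below is a hypothetical illustration (`conv_output_size` is not a TensorFlow function); it applies the output-size rules that TensorFlow documents for 'SAME' and 'VALID' padding to a single spatial dimension.

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension (hypothetical helper)."""
    if padding == 'SAME':
        # Zero padding: the output depends only on the input size and stride.
        return math.ceil(in_size / stride)
    if padding == 'VALID':
        # No padding: the filter must fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# A 5x5 filter with stride 1 on a 32x32 input:
same_size = conv_output_size(32, 5, 1, 'SAME')
valid_size = conv_output_size(32, 5, 1, 'VALID')
print(same_size, valid_size)  # 32 28
```

With stride 1, 'SAME' keeps the spatial size unchanged, while 'VALID' shrinks it by the filter size minus one.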
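2. Pooling layer
'''Forward propagation of the max-pooling layer'''
# Argument 1: the node matrix of the current layer; argument 2: the filter size [1, _, _, 1]. The first and last entries must be 1 because pooling cannot span examples or depth.
# Arguments 3 and 4: same as for the convolutional layer.
pool = tf.nn.max_pool(actived_conv, ksize=[1, 5, 5, 1], strides=[1, 2, 2, 1], padding='SAME')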
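3. Difference
A convolutional filter spans the full depth of its input, whereas a pooling filter works on one depth slice at a time, so the output depth of a pooling layer equals its input depth.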
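To make the depth-preserving behavior of pooling concrete, here is a minimal pure-Python sketch. It assumes 2x2 windows, stride 2, no padding, and even height and width; `max_pool_2x2` and the sample values are made up for illustration.

```python
def max_pool_2x2(channel):
    """2x2 max pooling with stride 2 on one depth slice (a list of rows)."""
    pooled = []
    for r in range(0, len(channel) - 1, 2):
        row = []
        for c in range(0, len(channel[0]) - 1, 2):
            window = [channel[r][c], channel[r][c + 1],
                      channel[r + 1][c], channel[r + 1][c + 1]]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Two depth slices of size 4x4 (made-up values).
feature_maps = [
    [[1, 2, 0, 1], [3, 4, 1, 0], [0, 1, 5, 6], [2, 0, 7, 8]],
    [[9, 0, 1, 1], [0, 0, 1, 2], [3, 3, 0, 0], [3, 4, 0, 1]],
]
# Pooling is applied to each slice independently, so the depth stays 2.
pooled_maps = [max_pool_2x2(ch) for ch in feature_maps]
print(pooled_maps)  # [[[4, 1], [2, 8]], [[9, 2], [4, 1]]]
```

Each slice is reduced from 4x4 to 2x2, and no value from one slice ever influences the other.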
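III. Classic convolutional network models
1. The LeNet-5 model
Advantage: a large improvement in accuracy.
Disadvantage: it does not cope well with larger image datasets such as ImageNet.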
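C1 layer (convolutional): input 32×32×1; filter size 5×5, depth 6; no zero padding, stride 1.
1. Feature map size: 28*28 (32-5+1=28), which keeps the filter's connections from falling outside the input boundary.
For how the feature map side length is computed, see: https://blog.csdn.net/Raoodududu/article/details/82287099
2. Number of parameters: C1 has 156 trainable parameters. Each filter has 5*5=25 weights and one bias, and there are 6 filters, so the total is (5*5*1+1)*6=156.
3. Number of connections/FLOPs: (28*28)*(5*5+1)*6=122,304. The left factor is the number of neurons in each C1 feature map, and the right factor is the number of input neurons the filter covers (plus the bias); their product is the number of connections. (Each connection corresponds to one computation. From wa+b, each parameter takes part in one computation, so the single bias b is counted as well.)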
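The three C1 figures can be reproduced with a few lines of arithmetic. This is a sketch under the counting convention used above (one connection per weight or bias per output neuron); `conv_layer_stats` is a hypothetical helper name.

```python
def conv_layer_stats(in_size, in_depth, filter_size, out_depth, stride=1):
    """Feature-map side, trainable parameters and connections of an
    unpadded convolutional layer (hypothetical helper)."""
    out_size = (in_size - filter_size) // stride + 1
    params_per_filter = filter_size * filter_size * in_depth + 1  # weights + bias
    params = params_per_filter * out_depth
    connections = out_size * out_size * params_per_filter * out_depth
    return out_size, params, connections

c1 = conv_layer_stats(in_size=32, in_depth=1, filter_size=5, out_depth=6)
print(c1)  # (28, 156, 122304)
```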
----------------------------------------
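S2 layer (subsampling): input 28×28×6; filter size 2×2; stride 2.
1. Feature map size: 14*14.
2. Number of parameters: S2 has 12 trainable parameters (6*(1+1)=12). The 4 inputs in each 2×2 neighborhood that the filter passes over are summed, multiplied by one trainable coefficient w, and added to one trainable bias b, so each filter has two parameters. (For a subsampling layer, the number of trainable parameters per feature map depends on the sampling method. With the method described here it is 2; some sampling methods need no parameters.)
3. Number of connections/FLOPs: 5,880 (14*14*(2*2+1)*6=5880). The left factor is the number of neurons in each S2 feature map, and the right factor is the number of C1 neurons the filter covers (plus the bias); their product is the number of connections.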
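The S2 computation and its counts can be sketched as follows. The function names and the sample window values are assumptions made for illustration; the arithmetic follows the description above (sum of the 2×2 neighborhood, times w, plus b).

```python
def subsample_unit(window_values, w, b):
    """One subsampling unit: sum the 2x2 neighborhood, scale by w, add b."""
    return w * sum(window_values) + b

def subsample_layer_stats(in_size, depth, window=2, stride=2):
    """Feature-map side, trainable parameters and connections."""
    out_size = (in_size - window) // stride + 1
    params = depth * (1 + 1)  # one coefficient w and one bias b per feature map
    connections = out_size * out_size * (window * window + 1) * depth
    return out_size, params, connections

unit = subsample_unit([1, 2, 3, 4], w=0.5, b=1.0)
s2 = subsample_layer_stats(in_size=28, depth=6)
print(unit)  # 6.0
print(s2)    # (14, 12, 5880)
```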
----------------------------------------
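C3 layer (convolutional): input 14×14×6; filter size 5×5, depth 16; no zero padding, stride 1.
1. Feature map size: 10*10 (14-5+1=10).
2. Number of parameters: (5×5×6+1)×16=2,416.
3. Number of connections/FLOPs: 10*10*(5×5×6+1)×16=241,600. The left factor is the C3 feature map size, and the right factor is the number of S2 neurons the filters cover (plus the bias).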
------------------------------------------
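S4 layer (subsampling): input 10*10*16; filter size 2*2; zero padding, stride 2.
1. Feature map size: 5*5.
2. Number of parameters: S4 has 32 trainable parameters (one coefficient w and one bias b per feature map, 16*(1+1)=32).
3. Number of connections/FLOPs: 5*5*(2*2+1)*16=2,000. The left factor is the number of neurons in each S4 feature map, and the right factor is the number of C3 neurons the filter covers (plus the bias); their product is the number of connections.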
--------------------------------------------
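C5 layer (convolutional, or the first fully connected layer): input 5*5*16; filter size 5*5, depth 120, stride 1.
1. Feature map size: 1*1 (5-5+1=1), which makes the connection between S4 and C5 a full connection. C5 is still labeled a convolutional layer rather than a fully connected layer because, if the LeNet-5 input were made larger with everything else unchanged, the feature map would be larger than 1*1.
2. Number of parameters: (16*5*5+1)*120=48,120. There are 120*16 kernels of size 5*5, so there are 120*16*5*5 weights w; the 16 kernels in one group share a single bias b, so there are 120 biases.
3. Number of connections/FLOPs: 1*1*48120=48,120. The left factor is the C5 feature map size (by now a single neuron, 1*1), and the right factor is the number of neurons the filters cover; their product is the number of connections, which here is also the number of FLOPs.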
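The argument for calling C5 a convolutional layer can be checked by tracing the feature-map side length through the layers described above. This is a sketch; `lenet5_c5_size` is a hypothetical helper, and the 36×36 input is an assumed example of a "larger input", not something taken from the original network.

```python
def lenet5_c5_size(input_size):
    """Side length of the C5 feature map for a square input, following
    the layer sizes above (5x5 unpadded convolutions, 2x2 subsampling)."""
    size = input_size
    size = size - 5 + 1   # C1
    size = size // 2      # S2
    size = size - 5 + 1   # C3
    size = size // 2      # S4
    size = size - 5 + 1   # C5
    return size

print(lenet5_c5_size(32))  # 1
print(lenet5_c5_size(36))  # 2
```

For the standard 32×32 input C5 collapses to 1×1 and behaves like a fully connected layer; for the larger input it yields a 2×2 map, which only a convolutional layer can produce.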
--------------------------------------------
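F6 layer (fully connected): 120 input nodes, 84 output nodes.
1. Feature map size: 84 units (this number follows from the design of the output layer), fully connected to C5.
2. Number of parameters: 84*(120*(1*1)+1)=10,164 trainable parameters. As in a classic neural network, each F6 unit computes the dot product of the input vector (120 values) and its weight vector, adds a bias (+1), and passes the result to a sigmoid function to produce the state of unit i.
3. Number of connections/FLOPs: 1*1*10164=10,164. The left factor is the F6 feature map size, and the right factor is the number of C5 neurons covered (plus the bias), summed over the 84 units.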
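The per-unit computation in F6 (dot product, bias, sigmoid) and its parameter count can be sketched in plain Python. The function names and the all-zero weights are assumptions chosen so the result is easy to check by hand.

```python
import math

def dense_unit(inputs, weights, bias):
    """State of one fully connected unit: sigmoid(w . x + b)."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-activation))

def dense_param_count(n_inputs, n_units):
    """Each unit has one weight per input plus one bias."""
    return n_units * (n_inputs + 1)

f6_params = dense_param_count(120, 84)
print(f6_params)  # 10164

# With all-zero weights and bias the activation is 0, so the sigmoid gives 0.5.
state = dense_unit([1.0] * 120, [0.0] * 120, 0.0)
print(state)  # 0.5
```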
--------------------------------------------
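Output layer: 84 input nodes, 10 output nodes.
Number of parameters: (84+1)×10=850, counting one weight per input plus one bias for each of the 10 output nodes.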
--------------------------------------------
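2. LeNet-5 code
(1)inference.py
import tensorflow as tf
# 1. Configuration parameters
INPUT_NODE = 784
OUTPUT_NODE = 10
IMAGE_SIZE = 28
NUM_CHANNELS = 1
NUM_LABELS = 10
# Size and depth of the first convolutional layer
CONV1_DEEP = 32
CONV1_SIZE = 5
# Size and depth of the second convolutional layer
CONV2_DEEP = 64
CONV2_SIZE = 5
# Number of nodes in the fully connected layer
FC_SIZE = 512
# 2. Define the forward pass
# The extra argument `train` distinguishes training from testing.
# This program uses dropout, which can further improve the model's reliability and reduce overfitting.
# Dropout is applied only during training.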
def inference(input_tensor, train, regularizer):
    # Layer 1 (convolution): the input is the raw 28*28*1 MNIST image; with zero padding the output is 28*28*32.
    with tf.variable_scope('layer1-conv1'):
        conv1_weights = tf.get_variable("weight", [CONV1_SIZE, CONV1_SIZE, NUM_CHANNELS, CONV1_DEEP], initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv1_biases = tf.get_variable("bias", [CONV1_DEEP], initializer=tf.constant_initializer(0.0))
        conv1 = tf.nn.conv2d(input_tensor, conv1_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_biases))
    # Layer 2 (pooling): the input is 28*28*32 and the output is 14*14*32.
    with tf.name_scope('layer2-pool1'):
        pool1 = tf.nn.max_pool(relu1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    # Layer 3 (convolution): the input is 14*14*32 and the output is 14*14*64.
    with tf.variable_scope('layer3-conv2'):
        conv2_weights = tf.get_variable("weight", [CONV2_SIZE, CONV2_SIZE, CONV1_DEEP, CONV2_DEEP], initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable("bias", [CONV2_DEEP], initializer=tf.constant_initializer(0.0))
        conv2 = tf.nn.conv2d(pool1, conv2_weights, strides=[1, 1, 1, 1], padding='SAME')
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_biases))
    # Layer 4 (pooling): the input is 14*14*64 and the output is 7*7*64.
    with tf.name_scope('layer4-pool2'):
        pool2 = tf.nn.max_pool(relu2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    # Convert the output of the layer-4 pooling layer into the input format of the layer-5 fully connected layer.
    pool_shape = pool2.get_shape().as_list()  # static shape [batch, height, width, depth]
    nodes = pool_shape[1] * pool_shape[2] * pool_shape[3]  # length of the flattened vector: height × width × depth
    reshaped = tf.reshape(pool2, [pool_shape[0], nodes])  # flatten each example into a vector; pool_shape[0] is the number of examples in a batch
    # Layer 5 (fully connected): declare the variables and implement the forward pass.
    with tf.variable_scope('layer5-fc1'):
        fc1_weights = tf.get_variable("weight", [nodes, FC_SIZE], initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:  # only the fully connected weights are regularized
            tf.add_to_collection('losses', regularizer(fc1_weights))
        fc1_biases = tf.get_variable("bias", [FC_SIZE], initializer=tf.constant_initializer(0.1))
        fc1 = tf.nn.relu(tf.matmul(reshaped, fc1_weights) + fc1_biases)
        if train:
            fc1 = tf.nn.dropout(fc1, 0.5)  # dropout reduces overfitting by randomly setting some node outputs to 0 during training
    # Layer 6 (fully connected): declare the variables and implement the forward pass.
    with tf.variable_scope('layer6-fc2'):
        # The output dimension must be NUM_LABELS so that it matches the bias and the 10 classes.
        fc2_weights = tf.get_variable("weight", [FC_SIZE, NUM_LABELS], initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses', regularizer(fc2_weights))
        fc2_biases = tf.get_variable("bias", [NUM_LABELS], initializer=tf.constant_initializer(0.1))
        logit = tf.matmul(fc1, fc2_weights) + fc2_biases
    return logit
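The value of `nodes` in the flattening step can be worked out by hand. The sketch below traces the spatial size through the four layers using the 'SAME' padding rule (output = ceil(input / stride)); `same_out` is a hypothetical helper, and the constants are copied from the configuration above.

```python
import math

IMAGE_SIZE = 28
CONV2_DEEP = 64
FC_SIZE = 512

def same_out(size, stride):
    """Spatial output size of a 'SAME'-padded convolution or pooling layer."""
    return math.ceil(size / stride)

size = same_out(IMAGE_SIZE, 1)   # layer 1, conv, stride 1: 28
size = same_out(size, 2)         # layer 2, pool, stride 2: 14
size = same_out(size, 1)         # layer 3, conv, stride 1: 14
size = same_out(size, 2)         # layer 4, pool, stride 2: 7
nodes = size * size * CONV2_DEEP  # length of the flattened vector

fc1_shape = (nodes, FC_SIZE)
fc1_weight_count = nodes * FC_SIZE
print(size, nodes, fc1_shape, fc1_weight_count)  # 7 3136 (3136, 512) 1605632
```

Most of the model's weights therefore sit in the first fully connected layer, which is one reason regularization is applied there.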
(2)train.py
'''mnist_train.'''
import os
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import LeNet_5inference as inference
import numpy as np
BATCH_SIZE = 100
LEARNING_RATE_BASE = 0.05
LEARNING_RATE_DECAY = 0.99
REGULARIZATION_RATE = 0.0001
TRAINING_STEPS = 30000
MOVING_AVERAGE_DECAY = 0.99
# Path and file name used to save the model
MODEL_SAVE_PATH = "D:/python/pycharm/venv/LeNet"
MODEL_NAME = "model.ckpt"
def train(mnist):
    x = tf.placeholder(tf.float32, [BATCH_SIZE, inference.IMAGE_SIZE, inference.IMAGE_SIZE, inference.NUM_CHANNELS], name='x-input')
    y_ = tf.placeholder(tf.float32, [None, inference.OUTPUT_NODE], name='y-input')
    regularizer = tf.contrib.layers.l2_regularizer(REGULARIZATION_RATE)
    y = inference.inference(x, True, regularizer)  # forward pass (training mode)
    global_step = tf.Variable(0, trainable=False)
    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.argmax(y_, 1), logits=y)
    cross_entropy_mean = tf.reduce_mean(cross_entropy)
    loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
    learning_rate = tf.train.exponential_decay(LEARNING_RATE_BASE, global_step, mnist.train.num_examples / BATCH_SIZE, LEARNING_RATE_DECAY)
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
    with tf.control_dependencies([train_step, variables_averages_op]):
        train_op = tf.no_op(name='train')
    # Initialize the TensorFlow persistence (checkpoint) class
    saver = tf.train.Saver()
    with tf.Session() as sess:
        tf.global_variables_initializer().run()  # replaces the deprecated tf.initialize_all_variables()
        for i in range(5000):
            xs, ys = mnist.train.next_batch(BATCH_SIZE)
            reshaped_xs = np.reshape(xs, (BATCH_SIZE, inference.IMAGE_SIZE, inference.IMAGE_SIZE, inference.NUM_CHANNELS))
            _, loss_value, step = sess.run([train_op, loss, global_step], feed_dict={x: reshaped_xs, y_: ys})
            if i % 1000 == 0:
                print("After %d training step(s), loss on training batch is %g" % (step, loss_value))  # print the current loss
                # Save the current model
                saver.save(sess, os.path.join(MODEL_SAVE_PATH, MODEL_NAME), global_step=global_step)
def main(argv=None):
    mnist = input_data.read_data_sets("D:/python/pycharm/venv/tmp/data", one_hot=True)
    train(mnist)
if __name__ == '__main__':
    tf.app.run()
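The learning-rate schedule built by `tf.train.exponential_decay` in train.py can be reproduced in plain Python. This is a sketch of the non-staircase formula, base * decay ** (global_step / decay_steps); it assumes `mnist.train.num_examples` is 55000, which is the usual training split for this loader but is not stated in the code above.

```python
LEARNING_RATE_BASE = 0.05
LEARNING_RATE_DECAY = 0.99
BATCH_SIZE = 100
NUM_TRAIN_EXAMPLES = 55000  # assumed value of mnist.train.num_examples

def decayed_learning_rate(global_step):
    """Learning rate after `global_step` updates (continuous decay)."""
    decay_steps = NUM_TRAIN_EXAMPLES / BATCH_SIZE  # steps per pass over the data
    return LEARNING_RATE_BASE * LEARNING_RATE_DECAY ** (global_step / decay_steps)

for step in (0, 550, 5500):
    print(step, decayed_learning_rate(step))
```

Under these assumptions the rate is multiplied by 0.99 once per pass over the training set, so after 5000 steps it has dropped by less than 10%.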
(3)eval
import time
import math
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
import LeNet_5inference as LeNet5_infernece
import LeNet_5train as LeNet5_train
def evaluate(mnist):
    with tf.Graph().as_default() as g:
        # Define the input placeholder as a 4-D tensor
        x = tf.placeholder(tf.float32, [
            mnist.test.num_examples,
            #LeNet5_train.BATCH_SIZE,
            LeNet5_infernece.IMAGE_SIZE,
            LeNet5_infernece.IMAGE_SIZE,
            LeNet5_infernece.NUM_CHANNELS],
            name='x-input')
        y_ = tf.placeholder(tf.float32, [None, LeNet5_infernece.OUTPUT_NODE], name='y-input')
        validate_feed = {x: mnist.test.images, y_: mnist.test.labels}
        global_step = tf.Variable(0, trainable=False)
        regularizer = tf.contrib.layers.l2_regularizer(LeNet5_train.REGULARIZATION_RATE)
        y = LeNet5_infernece.inference(x, False, regularizer)
        correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        variable_averages = tf.train.ExponentialMovingAverage(LeNet5_train.MOVING_AVERAGE_DECAY)
        variables_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variables_to_restore)
        #n = math.ceil(mnist.test.num_examples / LeNet5_train.BATCH_SIZE)
        n = math.ceil(mnist.test.num_examples / mnist.test.num_examples)
        for i in range(n):
            with tf.Session() as sess:
                ckpt = tf.train.get_checkpoint_state(LeNet5_train.MODEL_SAVE_PATH)
                if ckpt and ckpt.model_checkpoint_path:
                    saver.restore(sess, ckpt.model_checkpoint_path)
                    global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
                    xs, ys = mnist.test.next_batch(mnist.test.num_examples)
                    #xs, ys = mnist.test.next_batch(LeNet5_train.BATCH_SIZE)
                    reshaped_xs = np.reshape(xs, (
                        mnist.test.num_examples,
                        #LeNet5_train.BATCH_SIZE,
                        LeNet5_infernece.IMAGE_SIZE,
                        LeNet5_infernece.IMAGE_SIZE,
                        LeNet5_infernece.NUM_CHANNELS))
                    accuracy_score = sess.run(accuracy, feed_dict={x:reshaped_xs, y_:ys})
                    print("After %s training step(s), test accuracy = %g" % (global_step, accuracy_score))
                else:
                    print('No checkpoint file found')
                    return
# Main program
def main(argv=None):
    mnist = input_data.read_data_sets("D:/python/pycharm/venv/tmp/data", one_hot=True)
    evaluate(mnist)
if __name__ == '__main__':
    main()
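The evaluation script restores the moving-average ("shadow") values of the variables rather than their raw values. The update rule behind those shadow values can be sketched in plain Python. It follows the rule documented for `tf.train.ExponentialMovingAverage` when a step counter is supplied, where the effective decay is min(decay, (1 + num_updates) / (10 + num_updates)). The function name and the sample values are made up for illustration.

```python
MOVING_AVERAGE_DECAY = 0.99

def ema_update(shadow, value, num_updates):
    """One moving-average update of a shadow value."""
    decay = min(MOVING_AVERAGE_DECAY, (1 + num_updates) / (10 + num_updates))
    return decay * shadow + (1 - decay) * value

shadow = 0.0            # the shadow starts at the variable's initial value
values = [5.0, 10.0]    # made-up values of one variable after steps 0 and 1
history = []
for step, value in enumerate(values):
    shadow = ema_update(shadow, value, step)
    history.append(shadow)
    print(step, shadow)
```

Early in training the effective decay is small, so the shadow value tracks the variable closely; as the step count grows it approaches 0.99 and the shadow value changes more slowly.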
3. Visualization
(1)def
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt  # plt is used to display images
import matplotlib.image as mpimg  # mpimg is used to read images
import LeNet_5inference
import LeNet_5train
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("D:/python/pycharm/venv/tmp/data", one_hot=True)
img1 = mnist.train.images[1]  # 1. Read the second image: shape (784,)
label1 = mnist.train.labels[1]  # 2. Read the second label: [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
print(label1)
print("img_data shape = ", img1.shape)
img1.shape = [28, 28]  # 3. Convert the image to a 28×28 matrix
print(img1.shape)
'''Display the image'''
plt.imshow(img1)  # heat map
plt.show()
plt.imshow(img1, cmap='gray')  # grayscale
plt.show()
'''Show several images in one figure'''
plt.subplot(4,8,1)  # position 1 in a 4-row, 8-column grid
plt.imshow(img1, cmap='gray')
plt.axis('off')  # turn off the axes
plt.subplot(4,8,9)  # position 9 in a 4-row, 8-column grid
plt.imshow(img1, cmap='gray')
plt.axis('off')
plt.show()
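'''Test with LeNet_5train'''
# First convert the image to the correct shape (None, 784)
X_img = mnist.train.images[2].reshape([-1, 784])
y_img = mnist.train.labels[2].reshape([-1, 10])  # the label only needs to have a matching shape
# Note: as written, this line needs relu1 to be exposed by the inference graph, X_ and y_ to be
# its input placeholders, and a default session to be active; none of these are defined in this snippet.
result = LeNet_5inference.relu1.eval(feed_dict={X_: X_img, y_: y_img})
for i in range(32):
    show_img = result[:, :, :, i]
    show_img.shape = [28, 28]
    plt.subplot(4, 8, i + 1)
    plt.imshow(show_img, cmap='gray')
    plt.axis('off')
plt.show()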
(2)Usage
# Cost function curve
fig1, ax1 = plt.subplots(figsize=(10, 7))
plt.plot(loss_value)
ax1.set_xlabel('step')
ax1.set_ylabel('loss_value')
plt.title('Cross Loss')
plt.grid()
plt.show()
'''Accuracy curve
fig7, ax7 = plt.subplots(figsize=(10, 7))
plt.plot(Accuracy)
ax7.set_xlabel('Epochs')
ax7.set_ylabel('Accuracy Rate')
plt.title('Train Accuracy Rate')
plt.grid()
plt.show()
'''
# ---------------------------------- Feature visualization for each layer -------------------------------
# input image
fig2, ax2 = plt.subplots(figsize=(2, 2))
ax2.imshow(np.reshape(mnist.train.images[11], (28, 28)))
plt.show()
# Feature maps output by the first convolutional layer
input_image = mnist.train.images[11:12]
conv1_6 = sess.run(inference.relu1, feed_dict={x: input_image})  # [1, 28, 28 ,6]
conv1_transpose = sess.run(tf.transpose(conv1_6, [3, 0, 1, 2]))
fig3, ax3 = plt.subplots(nrows=1, ncols=6, figsize=(6, 1))
for i in range(6):
    ax3[i].imshow(conv1_transpose[i][0])  # slice of the tensor: [row, column]
plt.title('Conv1 6x28x28')
plt.show()
# Feature maps after the first pooling layer
pool1_6 = sess.run(inference.pool1, feed_dict={x: input_image})  # [1, 14, 14, 6]
pool1_transpose = sess.run(tf.transpose(pool1_6, [3, 0, 1, 2]))
fig4, ax4 = plt.subplots(nrows=1, ncols=6, figsize=(6, 1))
for i in range(6):
    ax4[i].imshow(pool1_transpose[i][0])
plt.title('Pool1 6x14x14')
plt.show()
# Feature maps output by the second convolutional layer
conv2_16 = sess.run(inference.relu2, feed_dict={x: input_image})  # [1, 14, 14, 16]
conv2_transpose = sess.run(tf.transpose(conv2_16, [3, 0, 1, 2]))
fig5, ax5 = plt.subplots(nrows=1, ncols=16, figsize=(16, 1))
for i in range(16):
    ax5[i].imshow(conv2_transpose[i][0])
plt.title('Conv2 16x14x14')
plt.show()
# Feature maps after the second pooling layer
pool2_16 = sess.run(inference.pool2, feed_dict={x: input_image})  # [1, 7, 7, 16]
pool2_transpose = sess.run(tf.transpose(pool2_16, [3, 0, 1, 2]))
fig6, ax6 = plt.subplots(nrows=1, ncols=16, figsize=(16, 1))
plt.title('Pool2 16x7x7')
for i in range(16):
    ax6[i].imshow(pool2_transpose[i][0])
plt.show()
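The `tf.transpose(..., [3, 0, 1, 2])` calls above move the channel axis to the front so that each feature map can be indexed and drawn on its own. The same rearrangement can be sketched on nested Python lists; `split_channels` and the tiny sample activation are made up for illustration and stand in for a real [1, height, width, channels] output.

```python
def split_channels(batch):
    """Rearrange a nested list shaped [1, height, width, channels]
    into a list of `channels` maps, each shaped [height][width]."""
    image = batch[0]
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [[[image[r][c][k] for c in range(width)] for r in range(height)]
            for k in range(channels)]

# A made-up 1x2x2x3 activation: pixel (r, c) holds [10*r + c, 100, -1].
activation = [[[[10 * r + c, 100, -1] for c in range(2)] for r in range(2)]]
maps = split_channels(activation)
print(len(maps))  # 3
print(maps[0])    # [[0, 1], [10, 11]]
```

Each entry of `maps` plays the role of `conv1_transpose[i][0]` in the plotting loops: one 2-D array per channel, ready to pass to `imshow`.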