6 Recognizing Handwritten Digits with a CNN in TensorFlow

Original

yunyunyx

2018-08-23 10:43

————————————————————————————————————

Preface: this post follows the 莫煩Python (Morvan Python) tutorials (highly recommended!!!)

————————————————————————————————————

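This experiment implements handwritten digit recognition with TensorFlow.
The dataset used here is the well-known MNIST dataset.
MNIST contains 70,000 images of handwritten digits; the loader used below splits them into 55,000 training, 5,000 validation, and 10,000 test images. Each image is a 28x28 pixel matrix, so each example fed to the input layer has 28x28 = 784 features. There are ten digits, 0, 1, 2, ..., 9, so the output layer is a 1x10 vector of ten non-negative numbers that sum to 1, giving the predicted probability of each digit. The digit with the largest probability is taken as the final prediction.
The true digit is represented as a 1x10 vector with 1 at the position corresponding to that digit and 0 everywhere else (one-hot encoding).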
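
As a small illustration of the label format and the prediction rule described above, here is a minimal pure-Python sketch. The helper names `one_hot` and `predicted_digit` and the example output vector are made up for this illustration; they are not part of the tutorial code or of TensorFlow.

```python
def one_hot(digit, num_classes=10):
    """Return a list with 1.0 at index `digit` and 0.0 everywhere else."""
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


def predicted_digit(probabilities):
    """Return the index of the largest probability (the arg max)."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


label = one_hot(3)
# Made-up network output for illustration only; the ten values sum to 1.
output = [0.01, 0.02, 0.05, 0.80, 0.02, 0.02, 0.03, 0.02, 0.02, 0.01]

print(label)                    # one-hot label for the digit 3
print(predicted_digit(output))  # index of the most probable digit
```

The prediction counts as correct when `predicted_digit(output)` equals the index of the 1 in the label, which is the same comparison the accuracy function in the listing makes with `tf.argmax`.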

The code is pasted directly below; the explanations and notes are all in the comments!!

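# Convolutional neural network (CNN)
# Note: this script targets the TensorFlow 1.x API (tf.placeholder, tf.Session).

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load the data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)  # downloads MNIST if it is not already present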

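# Define a function that computes the accuracy
def t_accuracy(t_xs, t_ys):
    # Uses the globals prediction, xs, ys and sess, which are defined further down.
    y_pre = sess.run(prediction, feed_dict={xs: t_xs})
    correct_pre = tf.equal(tf.argmax(y_pre, 1), tf.argmax(t_ys, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pre, tf.float32))
    # Note: every call adds these ops to the graph again; acceptable for a handful of evaluations.
    result = sess.run(accuracy, feed_dict={xs: t_xs, ys: t_ys})
    return result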

# Define the weights
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

# Define the biases
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# Define the convolution layer
def conv2d(x, W):
    # strides = [1, stride_height, stride_width, 1]
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')  # stride of 1 in both spatial directions

# Define pooling
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')  # strides[0] = strides[3] = 1

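# Define the network's inputs and outputs
xs = tf.placeholder(tf.float32, [None, 784])  # None leaves the size unspecified (here, the number of examples); each example has 28x28 = 784 input features
ys = tf.placeholder(tf.float32, [None, 10])  # None is again the number of examples; 10 because there are 10 digits, so the output has 10 values
#keep_prob = tf.placeholder(tf.float32)  # dropout
x_image = tf.reshape(xs, [-1, 28, 28, 1])  # -1: let the number of images be inferred; 28, 28 are the pixel dimensions; 1 is the number of channels, because the images are grayscale. Output shape: [n_samples, 28, 28, 1]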

# Define the first convolution layer
W_conv1 = weight_variable([5, 5, 1, 32])  # 5x5 patch, in_size (input channels) 1, out_size (output channels) 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)  # output shape 28x28x32
h_pool1 = max_pool_2x2(h_conv1)  # output shape 14x14x32

# Define the second convolution layer
W_conv2 = weight_variable([5, 5, 32, 64])  # 5x5 patch, in_size (input channels) 32, out_size (output channels) 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)  # output shape 14x14x64
h_pool2 = max_pool_2x2(h_conv2)  # output shape 7x7x64

# Define the first fully connected layer
W_fc1 = weight_variable([7*7*64, 1024])
b_fc1 = bias_variable([1024])
# Flatten h_pool2
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1_drop = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)  # dropout is disabled here, so this is the plain ReLU output
#h_fc1_drop = tf.nn.dropout(h_fc1,keep_prob)

# Define the second fully connected layer
W_fc2 = weight_variable([1024, 10])  # 10 outputs, one per digit
b_fc2 = bias_variable([10])  # 10 outputs, one per digit
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)  # classify

# Compute the loss
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))  # cross-entropy loss; it pairs with softmax and works well

# Training
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Start training
sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(2000):
    batch_xs, batch_ys = mnist.train.next_batch(100)  # take a mini-batch of 100 examples, because the full dataset is too large
    sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})
    if i % 200 == 0:
        print(t_accuracy(mnist.test.images, mnist.test.labels))  # print the accuracy every 200 steps. Note: this must be measured on the test data

Extracting MNIST_data\train-images-idx3-ubyte.gz
Extracting MNIST_data\train-labels-idx1-ubyte.gz
Extracting MNIST_data\t10k-images-idx3-ubyte.gz
Extracting MNIST_data\t10k-labels-idx1-ubyte.gz
0.0993
0.9236
0.956
0.9626
0.97
0.9742
0.9778
0.9725
0.9796
0.9826
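
The shape comments in the listing (28x28x32, 14x14x32, 14x14x64, 7x7x64) can be checked by hand. With padding='SAME', the spatial output size of a convolution or pooling op is the input size divided by the stride, rounded up. The sketch below traces the two conv/pool stages under that rule; the helper names `same_padding_size` and `trace_shapes` are invented for this illustration and are not TensorFlow functions.

```python
import math


def same_padding_size(size, stride):
    """Spatial output size of a conv or pool op with padding='SAME'."""
    return math.ceil(size / stride)


def trace_shapes(size=28, channel_plan=(32, 64)):
    """Follow a square image through conv (stride 1) + 2x2 max pool (stride 2) stages."""
    shapes = []
    for out_channels in channel_plan:
        size = same_padding_size(size, 1)   # 5x5 convolution, stride 1
        shapes.append(("conv", size, size, out_channels))
        size = same_padding_size(size, 2)   # 2x2 max pooling, stride 2
        shapes.append(("pool", size, size, out_channels))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)

# Number of values per image after flattening the last pooling output.
_, height, width, channels = shapes[-1]
flat_size = height * width * channels
print(flat_size)
```

The final value is the 7*7*64 that appears in the shape of `W_fc1` and in the reshape that produces `h_pool2_flat`.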

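Since this ran on a desktop with a powerful graphics card, the results came out within a few seconds, which was a pleasure to watch!! After 2000 training steps (the last printed value is from step 1800), the test accuracy is already 98.26%. Not bad, right?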
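
One caveat about the loss line in the listing: it takes the logarithm of the softmax output directly, which yields infinity or NaN if any predicted probability becomes exactly 0. The pure-Python sketch below shows the same per-example cross-entropy with two common guards: subtracting the largest logit before exponentiating, and clipping the probability away from 0. The function names and the `eps` value are assumptions made for this illustration. In TensorFlow 1.x, `tf.nn.softmax_cross_entropy_with_logits` is one way to get a stable version, since it works on the pre-softmax logits.

```python
import math


def softmax(logits):
    """Softmax with the maximum subtracted first to avoid overflow in exp."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot_label, probabilities, eps=1e-10):
    """-sum(y * log(p)) for one example, with p clipped below at eps (illustrative value)."""
    return -sum(y * math.log(max(p, eps))
                for y, p in zip(one_hot_label, probabilities))


# Made-up logits for a 3-class example; the true class is index 0.
probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy([1.0, 0.0, 0.0], probs)
print(probs)
print(loss)

# A probability of exactly 0 for the true class still gives a finite loss.
print(cross_entropy([0.0, 1.0], [1.0, 0.0]))
```

Averaging this per-example value over a batch corresponds to the `tf.reduce_mean` in the listing.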
