When we start learning to program, the first thing we usually do is write a program that prints "Hello World". Just as programming has Hello World as its entry point, machine learning has MNIST.
Main steps:
- Get the data
- Build the model
- Define tensors and variables: X, W, b
- Define the loss function and optimizer: cross-entropy, gradient descent
- Train the model: loop, batches
- Evaluate: accuracy
I. Get the MNIST dataset
- The data comes from http://yann.lecun.com/exdb/mnist/
- The data is split into three parts: train, validation, and test
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
print(mnist)
## Output:
Datasets(train=<tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet object at 0x7f02cbc0cd30>, validation=<tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet object at 0x7f02df872518>, test=<tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet object at 0x7f02e1188a90>)
print(np.shape(mnist.test.images))
print(np.shape(mnist.test.labels))
print(np.shape(mnist.train.images))
print(np.shape(mnist.train.labels))
print(np.shape(mnist.validation.images))
print(np.shape(mnist.validation.labels))
### The corresponding shapes:
(10000, 784)
(10000, 10)
(55000, 784)
(55000, 10)
(5000, 784)
(5000, 10)
- As you can see, the test images have shape (10000, 784) and the corresponding labels (10000, 10); likewise, the training set holds 55000 examples and the validation set 5000.
- Each image has 28*28 pixels, so we represent it as an array of numbers, flattened into a vector of length 28x28 = 784.
- The corresponding MNIST labels are digits between 0 and 9, stored as "one-hot vectors": a one-hot vector has a 1 in exactly one dimension and 0 everywhere else. For example, the label 0 is represented as [1,0,0,0,0,0,0,0,0,0].
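As a quick sanity check, here is a minimal sketch (assuming matplotlib is installed and the mnist object loaded above) that reshapes one flattened row back into its 28x28 image and recovers the digit from its one-hot label:

import numpy as np
import matplotlib.pyplot as plt

img = mnist.train.images[0]                    # flat vector of length 784
label = mnist.train.labels[0]                  # one-hot vector of length 10

plt.imshow(img.reshape(28, 28), cmap="gray")   # restore the 28x28 pixel grid
plt.title("label = %d" % np.argmax(label))     # argmax turns one-hot back into a digit
plt.show()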
Goal: given an input X, predict which of the 10 classes (0 through 9) its label belongs to.
To decide which of several classes an input belongs to, the first tool that comes to mind is softmax.
II. Build the model
Softmax regression has two steps:
- convert the input into evidence for each class
- convert the evidence into probabilities
1. Convert the input into evidence for each class
- The evidence for class $i$ is a weighted sum of the pixel intensities plus a bias for that class: $\text{evidence}_i = \sum_j W_{i,j} x_j + b_i$.
- If a pixel serves as evidence that the image does not belong to a class, its weight is negative; otherwise its weight is positive. In such a weight visualization, red marks negative values and blue marks positive values.
2. Convert the evidence into probabilities
Put simply, softmax exponentiates the input and then normalizes it:
- Why normalize: straightforwardly, it gives the outputs the properties of a probability distribution (non-negative, summing to 1).
- Why exponentiate: see 《常用激活函數比較》 (Comparison of Common Activation Functions), http://www.jianshu.com/p/22d9720dbf1a. The first reason is to mimic the behavior of max, making large values even larger; the second is that we need a differentiable function.
Expressed as a formula:
$\text{softmax}(x)_i = \frac{e^{x_i}}{\sum_j e^{x_j}}$
In code:
## Implement the regression model
y = tf.nn.softmax(tf.matmul(x,w) + b)
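For intuition, here is the same two-step computation as a small numpy sketch (an illustration of softmax itself, not TensorFlow's internal implementation):

import numpy as np

def softmax(z):
    z = z - np.max(z)     # standard stability trick; shifting by a constant does not change the result
    e = np.exp(z)         # step 1: exponentiate, so larger evidence grows disproportionately
    return e / e.sum()    # step 2: normalize, so the outputs sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))
# -> approximately [0.659 0.242 0.099]; the largest input dominates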
III. Define tensors and variables:
x = tf.placeholder(tf.float32, [None, 784])  # input images; None means any batch size
w = tf.Variable(tf.zeros([784, 10]))         # weights, initialized to zeros
b = tf.Variable(tf.zeros([10]))              # biases, one per class
IV. Define the loss function and optimizer
The cost function used here is cross-entropy: $H_{y'}(y) = -\sum_i y'_i \log(y_i)$.
Here y is the predicted probability distribution and y' (written y_ in the code) is the true distribution, the one-hot vectors we feed in.
## Train the model
y_ = tf.placeholder("float", [None, 10])        # placeholder for the true one-hot labels
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))  # cross-entropy, summed over the batch
Then we train the model with backpropagation, using gradient descent as the optimizer, to drive the loss to a minimum:
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
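One caveat: tf.log(y) can hit log(0) when a predicted probability underflows to zero. A more numerically stable variant (my sketch, not part of the original post) feeds the pre-softmax logits to TensorFlow's combined op:

# Stable alternative: let TensorFlow fuse softmax and cross-entropy into one op.
logits = tf.matmul(x, w) + b   # the pre-softmax "evidence"
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
# reduce_mean averages over the batch instead of summing, so a larger
# learning rate such as 0.5 is typically paired with it.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)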
V. Train the model
# Assumes the variables have been initialized and a Session created (see the complete code below).
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)  # random mini-batch of 100 examples
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
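Each step trains on a random batch of 100 examples (stochastic gradient descent), so 1000 steps cover about 100,000 examples, roughly two passes over the 55,000 training images. To confirm the loss is actually falling, the same loop can print it periodically (a sketch assuming the cross_entropy op defined above):

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    _, loss = sess.run([train_step, cross_entropy],
                       feed_dict={x: batch_xs, y_: batch_ys})
    if i % 100 == 0:
        print("step %d, batch loss %.3f" % (i, loss))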
VI. Evaluate the model
### Evaluate the model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))  # does the predicted class match the true class?
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))   # fraction of correct predictions
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
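To see what these three lines compute, here is the same logic in plain numpy, on hypothetical toy values:

import numpy as np

pred = np.array([[0.1, 0.8, 0.1],   # predicted distributions, one row per example
                 [0.3, 0.3, 0.4]])
true = np.array([[0, 1, 0],         # one-hot ground truth
                 [1, 0, 0]])

correct = np.argmax(pred, 1) == np.argmax(true, 1)  # [True, False]
print(correct.astype(np.float32).mean())            # 0.5, i.e. the accuracy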
Complete code:
import tensorflow as tf
## Load the data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
## Implement the regression model
x = tf.placeholder(tf.float32, [None, 784])  # input images, flattened to 784 values
w = tf.Variable(tf.zeros([784, 10]))         # weights
b = tf.Variable(tf.zeros([10]))              # biases
y = tf.nn.softmax(tf.matmul(x, w) + b)       # predicted class probabilities
## Train the model
y_ = tf.placeholder("float", [None, 10])  # true one-hot labels
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
init = tf.global_variables_initializer()  # tf.initialize_all_variables() is the deprecated spelling
sess = tf.Session()
sess.run(init)
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
### Evaluate the model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
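If everything runs, the printed test accuracy should land around 0.91 to 0.92 for this simple softmax model, which is the ballpark figure the original TensorFlow tutorial reports.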