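[TensorFlow] Keras Machine Learning Basics: Basic Image Classification

This section uses tf.keras, a high-level API for building and training models in TensorFlow, to train a neural network model that classifies images of clothing such as sneakers and shirts.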

from __future__ import absolute_import, division, print_function, unicode_literals

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt

print(tf.__version__)

The code above prints:
2.0.0

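1 Importing the Fashion MNIST dataset

The Fashion MNIST dataset contains 70,000 grayscale images of clothing in 10 categories, each at a resolution of 28×28 pixels, as shown below: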

Figure 1. Fashion-MNIST samples (by Zalando, MIT License).

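Fashion MNIST is intended as a replacement for the classic MNIST dataset, which plays a role similar to “Hello World” in learning to program. The classic MNIST dataset contains images of handwritten digits (0, 1, 2, etc.) in the same format as the clothing dataset used here.

Here, 60,000 images are used to train the network and 10,000 images are used to evaluate how accurately the network classifies images. The Fashion MNIST data can be accessed, imported and loaded directly from TensorFlow with the following statements: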

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

The code above prints:
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
8192/5148 [===============================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step

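The statements above load the dataset from TensorFlow and return four NumPy arrays:

The train_images and train_labels arrays are the training set, the data the model learns from;
The test_images and test_labels arrays are the test set, against which the model is tested.

Each image is a 28x28 NumPy array with pixel values ranging from 0 to 255. The labels are an array of integers from 0 to 9, corresponding to the class of clothing each image represents: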

Label	Class
0	T-shirt/top
1	Trouser
2	Pullover
3	Dress
4	Coat
5	Sandal
6	Shirt
7	Sneaker
8	Bag
9	Ankle boot

Each image maps to a single label. Since the class names are not included in the dataset, store them in class_names for use later when plotting the images:

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
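As a small illustration of how the integer labels map onto class_names, the sketch below looks up the names for a handful of labels and counts how many fall in each class. The sample_labels array is made up for the example; in this tutorial the labels would come from train_labels.

```python
import numpy as np

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Hypothetical labels standing in for a slice of train_labels.
sample_labels = np.array([9, 0, 0, 3, 5], dtype=np.uint8)

# Index the list of names with each integer label.
sample_names = [class_names[label] for label in sample_labels]
print(sample_names)

# Count how many examples fall in each class (one slot per class).
counts = np.bincount(sample_labels, minlength=len(class_names))
print(dict(zip(class_names, counts.tolist())))
```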

2 Exploring the data

Before training the model, let's explore the format of the dataset. The following commands print some of its details:

train_images.shape 
len(train_labels)
train_labels
test_images.shape
len(test_labels)

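The corresponding output is:

(60000, 28, 28)                                  - the training set has 60,000 images, each 28x28 pixels
60000                                            - the training set has 60,000 labels
array([9, 0, 0, ..., 3, 0, 5], dtype=uint8)      - each label is an integer between 0 and 9
(10000, 28, 28)                                  - the test set has 10,000 images, again each 28x28 pixels
10000                                            - the test set has 10,000 image labels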
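The checks above can also be wrapped in a small helper that confirms images and labels line up. This is only a sketch: describe_split is a hypothetical name, and the arrays below are synthetic stand-ins with the same layout as Fashion MNIST, not the real data.

```python
import numpy as np

def describe_split(images, labels):
    """Return basic facts about one split and check that images and labels line up."""
    assert images.shape[0] == len(labels), "one label per image is expected"
    return {
        "num_images": images.shape[0],
        "image_shape": images.shape[1:],
        "dtype": str(images.dtype),
        "label_range": (int(labels.min()), int(labels.max())),
    }

# Synthetic stand-in data (assumption: same dtype and layout as the dataset).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = (np.arange(100) % 10).astype(np.uint8)

summary = describe_split(fake_images, fake_labels)
print(summary)
```

With the real data, describe_split(train_images, train_labels) would be expected to report 60,000 images of shape (28, 28).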

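3 Preprocessing the data

The data must be preprocessed before training the network. If you inspect the first image in the training set, you will see that the pixel values fall in the range 0 to 255. The image can be displayed with the following code: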

plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.grid(False)
plt.show()

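This produces a plot of the first training image with a colorbar.

Before feeding these values to the neural network model, scale the pixel values of the images in both the training set and the test set to the range 0 to 1. This can be done with the following code: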

train_images = train_images / 255.0
test_images = test_images / 255.0
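To see what this division does to the data type and value range, here is a minimal sketch on a tiny made-up array with the same uint8 pixel type as the dataset.

```python
import numpy as np

# Tiny synthetic "image" with uint8 pixels, standing in for a real one.
raw = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by the float 255.0 promotes the result to floating point.
scaled = raw / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```

The same scaling has to be applied to the training set and the test set so that the model sees inputs in the same range in both cases.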

To verify that the data is in the correct format, display the first 25 images from the training set with the class name below each image:

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()

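4 Building (configuring) the model

Building the neural network requires configuring the layers of the model and then compiling the model. Both steps are covered below.

4.1 Setting up the layers

The layer is the basic building block of a neural network. Layers extract representations from the data fed into them.

Most of deep learning consists of chaining together simple layers. Most layers, such as tf.keras.layers.Dense, have parameters that are learned during training. The model for this section is built with the following code: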

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

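tf.keras.layers.Flatten: this is the first layer of the network. It transforms the format of the images from a two-dimensional array (28×28 pixels) to a one-dimensional array (28 * 28 = 784 pixels). Think of this layer as unstacking the rows of pixels in the image and lining them up. This layer has no parameters to learn; it only reformats the data.
After the pixels are flattened, the network consists of a sequence of two tf.keras.layers.Dense layers. These are densely connected, or fully connected, layers:
- The first Dense layer has 128 nodes (or neurons);
- The second Dense layer, which is also the last layer, is a softmax layer with 10 nodes. It returns an array of 10 probability values that sum to 1. Each node contains a score indicating the probability that the current image belongs to one of the 10 classes.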
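To make the shapes concrete, the following NumPy sketch reproduces the forward pass of the three layers described above. It is not how Keras implements them: the weights are random rather than trained, and the names w1, b1, w2, b2 and forward are chosen only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialised parameters; a trained model would have learned values.
w1 = rng.normal(scale=0.05, size=(784, 128))
b1 = np.zeros(128)
w2 = rng.normal(scale=0.05, size=(128, 10))
b2 = np.zeros(10)

def forward(images):
    """Forward pass for a batch of images with shape (batch, 28, 28)."""
    x = images.reshape(images.shape[0], -1)               # Flatten: (batch, 784)
    h = np.maximum(x @ w1 + b1, 0.0)                      # Dense(128) with ReLU
    logits = h @ w2 + b2                                  # Dense(10), pre-softmax
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)           # softmax: rows sum to 1

batch = rng.random((4, 28, 28))      # stand-in for four scaled images
probs = forward(batch)
num_params = w1.size + b1.size + w2.size + b2.size

print(probs.shape)   # (4, 10)
print(num_params)    # 101770
```

The parameter count is 784 * 128 + 128 for the first Dense layer plus 128 * 10 + 10 for the second; the Flatten step contributes none.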

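4.2 Compiling the model

Before the model is ready for training, it needs a few more settings. These are added during the model's compile step:

Loss function: measures how far the model's predictions are from the true labels during training. Training minimizes this function to “steer” the model in the right direction;
Optimizer: how the model is updated based on the data it sees and its loss function;
Metrics: used to monitor the training and testing steps. The example below uses accuracy, the fraction of images that are correctly classified.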

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
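For intuition about the loss and metric named in the compile call, here is a NumPy sketch of sparse categorical cross-entropy and accuracy on a few made-up predictions. The eps clipping is an assumption of this sketch to avoid log(0), and the functions are illustrative rather than the Keras implementations.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(np.clip(picked, eps, 1.0)).mean())

def accuracy(labels, probs):
    """Fraction of rows whose highest-probability class equals the label."""
    return float((probs.argmax(axis=1) == labels).mean())

# Made-up predictions over 3 classes for 4 examples.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 1])   # integer labels, hence "sparse"

loss_value = sparse_categorical_crossentropy(labels, probs)
acc_value = accuracy(labels, probs)
print(loss_value)
print(acc_value)   # 0.75
```

The "sparse" variant takes integer labels such as those in train_labels directly, so the labels do not need to be one-hot encoded first.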

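5 Training the model

Training the neural network model requires the following steps:

Feed the training data to the model. In this example, the training data is in the train_images and train_labels arrays;
The model learns to associate images and labels;
Ask the model to make predictions about the test set, the test_images array in this example, and verify that the predictions match the labels in the test_labels array.

To start training, call the model.fit method: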

model.fit(train_images, train_labels, epochs=10)

The output is:

Train on 60000 samples
Epoch 1/10
60000/60000 [==============================] - 5s 85us/sample - loss: 0.4978 - accuracy: 0.8245
Epoch 2/10
60000/60000 [==============================] - 4s 69us/sample - loss: 0.3798 - accuracy: 0.8624
Epoch 3/10
60000/60000 [==============================] - 4s 62us/sample - loss: 0.3411 - accuracy: 0.8762
Epoch 4/10
60000/60000 [==============================] - 4s 61us/sample - loss: 0.3164 - accuracy: 0.8838
Epoch 5/10
60000/60000 [==============================] - 4s 61us/sample - loss: 0.2956 - accuracy: 0.8902
Epoch 6/10
60000/60000 [==============================] - 4s 64us/sample - loss: 0.2815 - accuracy: 0.8955
Epoch 7/10
60000/60000 [==============================] - 4s 65us/sample - loss: 0.2691 - accuracy: 0.9009
Epoch 8/10
60000/60000 [==============================] - 4s 62us/sample - loss: 0.2579 - accuracy: 0.9029
Epoch 9/10
60000/60000 [==============================] - 4s 63us/sample - loss: 0.2485 - accuracy: 0.9062
Epoch 10/10
60000/60000 [==============================] - 4s 60us/sample - loss: 0.2388 - accuracy: 0.9100

<tensorflow.python.keras.callbacks.History at 0x7fefe642a860>

As the model trains, the loss and accuracy metrics are displayed. This model reaches an accuracy of about 0.91 (or 91%) on the training data.
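During each epoch, model.fit works through the training data in mini-batches; Keras uses a batch size of 32 when batch_size is not passed. The sketch below shows the general idea of shuffled mini-batch iteration on synthetic data. The iterate_minibatches helper is hypothetical and is not part of Keras.

```python
import math
import numpy as np

def iterate_minibatches(images, labels, batch_size, rng):
    """Yield shuffled (images, labels) mini-batches covering the data once."""
    order = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        idx = order[start:start + batch_size]
        yield images[idx], labels[idx]

rng = np.random.default_rng(0)
images = rng.random((100, 28, 28))     # synthetic stand-in data
labels = np.arange(100) % 10

batch_sizes = [len(y) for _, y in iterate_minibatches(images, labels, 32, rng)]
print(batch_sizes)                     # [32, 32, 32, 4]

# With 60,000 training images and a batch size of 32:
steps_per_epoch = math.ceil(60000 / 32)
print(steps_per_epoch)                 # 1875
```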

6 Evaluating accuracy

Next, compare how the model performs on the test dataset:

test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)
print('\nTest accuracy:', test_acc)

The output is:

10000/1 - 1s - loss: 0.2934 - accuracy: 0.8830
Test accuracy: 0.883

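The results show that accuracy on the test dataset is slightly lower than accuracy on the training dataset. This gap between training accuracy and test accuracy indicates overfitting. Overfitting occurs when a machine learning model performs worse on new, previously unseen inputs than on the training data.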
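The size of the gap can be read off directly from the two logs. A minimal calculation, using the final-epoch training accuracy and the test accuracy reported in the output above:

```python
# Final-epoch training accuracy and test accuracy as reported in the logs.
train_acc = 0.9100
test_acc = 0.8830

gap = train_acc - test_acc
print("accuracy gap: {:.3f}".format(gap))   # accuracy gap: 0.027
```

A gap of under three percentage points is modest; a gap that keeps growing as more epochs are run would be a stronger sign of overfitting.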

7 Making predictions

With the model trained, you can use it to make predictions about some images:

predictions = model.predict(test_images)

Here, the model has predicted the label for each image in the test set. Let's take a look at the first prediction:

predictions[0]

The output is:

array([1.06123218e-06, 8.76374884e-09, 4.13958730e-07, 9.93547733e-09,
   2.39135318e-07, 2.61428091e-03, 2.91701099e-07, 6.94991834e-03,
   1.02351805e-07, 9.90433693e-01], dtype=float32)

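A prediction is an array of 10 numbers. They represent the model's “confidence” that the image corresponds to each of the 10 different articles of clothing. You can see which label has the highest confidence value: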

np.argmax(predictions[0])

The output is:

9

So the model is most confident that this image is an ankle boot, or class_names[9].
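np.argmax returns only the single best class. To also see the runners-up, one option is to sort the probabilities. The sketch below does this for the prediction vector printed above; the variable names first_prediction and top3 are chosen for this example.

```python
import numpy as np

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# The prediction vector printed above for the first test image.
first_prediction = np.array([
    1.06123218e-06, 8.76374884e-09, 4.13958730e-07, 9.93547733e-09,
    2.39135318e-07, 2.61428091e-03, 2.91701099e-07, 6.94991834e-03,
    1.02351805e-07, 9.90433693e-01], dtype=np.float32)

# Indices of the three largest probabilities, highest first.
top3 = np.argsort(first_prediction)[::-1][:3]
for idx in top3:
    print("{:<12s} {:.4f}".format(class_names[idx], first_prediction[idx]))
```

For this image the three most likely classes are all footwear (Ankle boot, Sneaker, Sandal), and the ten values sum to approximately 1, as expected from a softmax output.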

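To see the confidence for all 10 class predictions, plot them. The two functions below are defined for plotting the image and the prediction data: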

def plot_image(i, predictions_array, true_label, img):
  predictions_array, true_label, img = predictions_array, true_label[i], img[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])

  plt.imshow(img, cmap=plt.cm.binary)

  predicted_label = np.argmax(predictions_array)
  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'

  plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
  predictions_array, true_label = predictions_array, true_label[i]
  plt.grid(False)
  plt.xticks(range(10))
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions_array, color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions_array)

  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')

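Let's look at the 0th image, its prediction, and the prediction array. Correct prediction labels are blue and incorrect prediction labels are red. The number gives the percentage (out of 100) for the predicted label. Call the functions defined above to plot the data: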

i = 0
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i],  test_labels)
plt.show()

i = 12
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i],  test_labels)
plt.show()

Let's plot several images with their predictions. Note that the model can be wrong even when it is very confident.

# Plot the first X test images, their predicted labels, and the true labels.
# Color correct predictions in blue and incorrect predictions in red.
num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
  plt.subplot(num_rows, 2*num_cols, 2*i+1)
  plot_image(i, predictions[i], test_labels, test_images)
  plt.subplot(num_rows, 2*num_cols, 2*i+2)
  plot_value_array(i, predictions[i], test_labels)
plt.tight_layout()
plt.show()
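Besides inspecting the plots, confident mistakes can be located programmatically. The following is a minimal sketch with made-up predictions over three classes; confident_mistakes and its 0.9 threshold are assumptions of this example. In the tutorial it would be called with predictions and test_labels.

```python
import numpy as np

def confident_mistakes(predictions, labels, threshold=0.9):
    """Indices of examples predicted wrongly with confidence >= threshold."""
    predicted = predictions.argmax(axis=1)
    confidence = predictions.max(axis=1)
    return np.flatnonzero((predicted != labels) & (confidence >= threshold))

# Made-up predictions and labels for four examples.
fake_predictions = np.array([[0.95, 0.03, 0.02],   # correct, confident
                             [0.05, 0.92, 0.03],   # wrong, confident
                             [0.40, 0.35, 0.25],   # wrong, not confident
                             [0.01, 0.01, 0.98]])  # correct, confident
fake_labels = np.array([0, 2, 1, 2])

mistakes = confident_mistakes(fake_predictions, fake_labels)
print(mistakes)   # [1]
```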

Finally, use the trained model to make a prediction about a single image.

# Grab an image from the test dataset.
img = test_images[1]
print(img.shape)

The output is:

(28, 28)

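tf.keras models are optimized to make predictions on a batch, or collection, of examples at once. So even though you are using a single image, you need to add it to a batch by giving it a leading batch dimension: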

# Add the image to a batch where it's the only member.
img = (np.expand_dims(img,0))
print(img.shape)

The output is:

(1, 28, 28)
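np.expand_dims is one of several equivalent ways to add the batch dimension. The sketch below compares three of them on a zero-filled stand-in image, not a real test image.

```python
import numpy as np

img = np.zeros((28, 28))            # stand-in for one scaled test image

batch_a = np.expand_dims(img, 0)    # the call used in this section
batch_b = img[np.newaxis, ...]      # indexing with a new leading axis
batch_c = img.reshape(1, 28, 28)    # explicit reshape

print(batch_a.shape, batch_b.shape, batch_c.shape)

# The leading axis is the batch dimension; index 0 recovers the image.
recovered = batch_a[0]
print(recovered.shape)
```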

Now predict the label for this image:

predictions_single = model.predict(img)
print(predictions_single)

The output is:

[[1.4175281e-04 8.5218921e-14 9.9798274e-01 1.7262291e-11 1.3707496e-03
  1.4081123e-14 5.0472573e-04 6.2876434e-17 3.6248435e-09 1.8519042e-13]]

plot_value_array(1, predictions_single[0], test_labels)
_ = plt.xticks(range(10), class_names, rotation=45)

model.predict returns one list of predictions for each image in the batch of data. Grab the prediction for our (only) image in the batch:

np.argmax(predictions_single[0])

The output is:

2
