Background
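This article shows how to use TensorFlow Serving to quickly deploy a trained model and expose it as a service. For online serving, the officially recommended export format is SavedModel, while models deployed to phones and other mobile devices generally use the FrozenGraphDef format.
We train a neural network to classify images of clothing, such as sneakers and shirts, and then deploy it online with TensorFlow Serving.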
Model training
Import the dependencies:
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras
# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import subprocess
tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)
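Loading the data
This article uses the Fashion MNIST dataset, which contains 70,000 grayscale images in 10 categories, each at a resolution of 28 × 28 pixels. A sample of the data is shown in the figure below:
Load the data: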
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0
# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))
Output:
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
40960/29515 [=========================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
26435584/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
16384/5148 [===============================================================================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
4431872/4422102 [==============================] - 0s 0us/step
train_images.shape: (60000, 28, 28, 1), of float64
test_images.shape: (10000, 28, 28, 1), of float64
Training the model
model = keras.Sequential([
    keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3,
                        strides=2, activation='relu', name='Conv1'),
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation=tf.nn.softmax, name='Softmax')
])
model.summary()
testing = False
epochs = 5
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=epochs)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Conv1 (Conv2D) (None, 13, 13, 8) 80
_________________________________________________________________
flatten (Flatten) (None, 1352) 0
_________________________________________________________________
Softmax (Dense) (None, 10) 13530
=================================================================
Total params: 13,610
Trainable params: 13,610
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
60000/60000 [==============================] - 8s 140us/sample - loss: 0.5265 - acc: 0.8204
Epoch 2/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.3753 - acc: 0.8688
Epoch 3/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3423 - acc: 0.8788
Epoch 4/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3207 - acc: 0.8856
Epoch 5/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3069 - acc: 0.8906
10000/10000 [==============================] - 1s 70us/sample - loss: 0.3464 - acc: 0.8772
Test accuracy: 0.877200007439
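The output shapes and parameter counts in the summary above can be checked by hand. The following is a minimal sketch of that arithmetic, assuming the Keras default of 'valid' padding for Conv1; the helper name conv_output_size is introduced here for illustration only.

```python
def conv_output_size(input_size, kernel_size, stride):
    """Spatial output size of a convolution with 'valid' (no) padding."""
    return (input_size - kernel_size) // stride + 1

# Conv1: 28x28x1 input, 8 filters, 3x3 kernel, stride 2
side = conv_output_size(28, kernel_size=3, stride=2)
conv_params = (3 * 3 * 1) * 8 + 8   # weights per filter * filters + biases

# Flatten has no parameters; it reshapes (side, side, 8) into a vector
flat_units = side * side * 8

# Softmax: fully connected layer from flat_units to 10 classes
dense_params = flat_units * 10 + 10

total_params = conv_params + dense_params
print(side, conv_params, flat_units, dense_params, total_params)
```

The printed values (13, 80, 1352, 13530, 13610) agree with the Output Shape and Param # columns reported by model.summary().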
Saving the model
To load the trained model into TensorFlow Serving, it must first be saved in the SavedModel format.
# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors,
# and stored with the default serving key
import tempfile
MODEL_DIR = tempfile.gettempdir()
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print('export_path = {}\n'.format(export_path))
if os.path.isdir(export_path):
    print('\nAlready saved a model, cleaning up\n')
    !rm -r {export_path}
tf.saved_model.simple_save(
    keras.backend.get_session(),
    export_path,
    inputs={'input_image': model.input},
    outputs={t.name:t for t in model.outputs})
print('\nSaved model:')
!ls -l {export_path}
The model is saved to /tmp/1, which contains:
saved_model.pb variables
The variables directory holds the following two files:
variables.data-00000-of-00001 variables.index
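The export path ends in a version number because TensorFlow Serving treats each numeric subdirectory of the model base path as one version of the model and, by default, serves the highest one. The sketch below only illustrates that directory convention on a throwaway directory tree; latest_version is a hypothetical helper, not TensorFlow Serving's own code.

```python
import os
import tempfile

def latest_version(model_base_path):
    """Return the highest numeric subdirectory name as an int, or None."""
    versions = [
        int(name)
        for name in os.listdir(model_base_path)
        if name.isdigit() and os.path.isdir(os.path.join(model_base_path, name))
    ]
    return max(versions) if versions else None

# Demonstration on a temporary directory tree (not the real export)
base = tempfile.mkdtemp()
for name in ("1", "2", "10", "notes"):
    os.makedirs(os.path.join(base, name))
print(latest_version(base))
```

Versions are compared as integers, so 10 ranks above 2, and the non-numeric notes directory is ignored.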
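Inspecting the saved model
Use the saved_model_cli command to inspect the MetaGraphDefs and SignatureDefs in the SavedModel. A SavedModel contains one or more MetaGraphDefs, each identified by its tag set. Before serving a model, it helps to know what kind of SignatureDef each one holds and what its inputs and outputs are. The show command lets you examine the contents of a SavedModel hierarchically.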
!saved_model_cli show --dir {export_path} --all
Output:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['input_image'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 28, 28, 1)
name: Conv1_input:0
The given SavedModel SignatureDef contains the following output(s):
outputs['Softmax/Softmax:0'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 10)
name: Softmax/Softmax:0
Method name is: tensorflow/serving/predict
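The signature says the model expects input of shape (-1, 28, 28, 1), where -1 stands for an arbitrary batch size. As a rough sketch, a client could check a nested-list batch against that shape before sending it. The helpers nested_shape and matches_signature are hypothetical names, and the check assumes a rectangular list because it inspects only the first element at each level.

```python
def nested_shape(value):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)

def matches_signature(instances, signature_shape=(-1, 28, 28, 1)):
    """True if the batch shape fits the signature; -1 matches any size."""
    shape = nested_shape(instances)
    return len(shape) == len(signature_shape) and all(
        expected in (-1, actual)
        for expected, actual in zip(signature_shape, shape)
    )

# One all-zero 28x28x1 image, shaped like test_images[0:1].tolist()
blank = [[[[0.0] for _ in range(28)] for _ in range(28)]]
print(matches_signature(blank))            # batch of one image
print(matches_signature([[0.0] * 784]))    # flattened image, wrong rank
```

The first call prints True and the second prints False, since a flattened 784-element vector does not have the four dimensions the signature requires.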
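Deploying the service with TensorFlow Serving
Installing tensorflow-model-server
The simplest way to install it is with Docker. This article uses a native installation instead.
First, add the apt source: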
# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo
# You would instead do:
# echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \
# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -
!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt update
Output of the update:
deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2943 100 2943 0 0 7827 0 --:--:-- --:--:-- --:--:-- 7827
OK
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3,012 B]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]
Ign:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease
Get:4 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ Packages [70.5 kB]
Get:5 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server-universal amd64 Packages [365 B]
Ign:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease
Get:7 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release [564 B]
Get:8 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release [564 B]
Get:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release.gpg [819 B]
Get:10 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release.gpg [833 B]
Get:11 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease [21.3 kB]
Get:12 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:13 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 Packages [357 B]
Get:14 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Packages [113 kB]
Hit:15 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:16 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Packages [19.8 kB]
Get:17 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:18 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic InRelease [15.4 kB]
Get:19 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [9,585 B]
Get:20 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic/main amd64 Packages [31.7 kB]
Get:21 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [769 kB]
Get:22 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:23 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main Sources [1,686 kB]
Get:24 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [959 kB]
Get:25 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [662 kB]
Get:26 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main amd64 Packages [810 kB]
Get:27 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1,288 kB]
Get:28 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [5,230 B]
Get:29 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [8,284 B]
Get:30 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [20.3 kB]
Get:31 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [4,227 B]
Fetched 6,755 kB in 9s (735 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
107 packages can be upgraded. Run 'apt list --upgradable' to see them.
Installing TensorFlow Serving
Install it with apt:
!apt-get install tensorflow-model-server
Installation output:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
tensorflow-model-server
0 upgraded, 1 newly installed, 0 to remove and 107 not upgraded.
Need to get 151 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 1.14.0 [151 MB]
Fetched 151 MB in 3s (46.3 MB/s)
Selecting previously unselected package tensorflow-model-server.
(Reading database ... 131183 files and directories currently installed.)
Preparing to unpack .../tensorflow-model-server_1.14.0_all.deb ...
Unpacking tensorflow-model-server (1.14.0) ...
Setting up tensorflow-model-server (1.14.0) ...
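Starting TensorFlow Serving
The server below is started with the REST API enabled on port 8501 (gRPC is the other option). The cell reads the model directory from the MODEL_DIR environment variable, so set it in an earlier cell with os.environ["MODEL_DIR"] = MODEL_DIR.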
%%bash --bg
nohup tensorflow_model_server \
--rest_api_port=8501 \
--model_name=fashion_model \
--model_base_path="${MODEL_DIR}" >server.log 2>&1
Here --bg runs the process in the background.
View the log with: tail server.log
Log output:
2019-09-19 02:18:22.039966: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-09-19 02:18:22.041055: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-19 02:18:22.054399: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2019-09-19 02:18:22.066228: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 27522 microseconds.
2019-09-19 02:18:22.066280: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:103] No warmup data file found at /tmp/1/assets.extra/tf_serving_warmup_requests
2019-09-19 02:18:22.066354: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: fashion_model version: 1}
2019-09-19 02:18:22.067453: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2019-09-19 02:18:22.068189: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 239] RAW: Entering the event loop ...
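Instead of reading the log, a client can also ask the server whether the model has finished loading. TensorFlow Serving's REST API exposes a model status endpoint at /v1/models/<model_name>, which reports a state for each loaded version. The following is a minimal sketch assuming Python 3 and the port and model name used above; is_available and wait_until_ready are hypothetical helper names.

```python
import json
import time
import urllib.request

def is_available(status):
    """True if any version in a model-status response is in state AVAILABLE."""
    return any(
        entry.get("state") == "AVAILABLE"
        for entry in status.get("model_version_status", [])
    )

def wait_until_ready(url, attempts=10, delay=1.0):
    """Poll the status URL until the model is available or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if is_available(json.loads(response.read().decode("utf-8"))):
                    return True
        except OSError:
            pass  # server is not accepting connections yet
        time.sleep(delay)
    return False

# Example (requires the server started above):
# wait_until_ready("http://localhost:8501/v1/models/fashion_model")
```

Polling like this is useful in scripts, where the first prediction request might otherwise be sent before the server has finished loading the model.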
Building the request
(1) Look at the data to be classified:
def show(idx, title):
    plt.figure()
    plt.imshow(test_images[idx].reshape(28,28))
    plt.axis('off')
    plt.title('\n\n{}'.format(title), fontdict={'size': 16})
import random
rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))
The result is shown below:
(2) Packaging the data
Package the images into a JSON request body; the code below packs three images:
import json
data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
Output:
Data: {"instances": [[[[0.0], [0.0], [0.0], [0.0], [0.0] ... 0.0], [0.0]]]], "signature_name": "serving_default"}
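(3) Creating the REST request
If requests is not installed, install it first with pip install -q requests.
Send the request to the server's REST endpoint with POST. If no particular servable version is specified, the request goes to the latest version.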
import requests
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))
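Note: the v1 in the URI is the version of the REST API, not the version of the model, so it stays v1 no matter how many model versions exist. To target a particular model version, add /versions/<number> after the model name.
The models segment of the URI is fixed, and fashion_model is the value passed to the --model_name parameter when tensorflow_model_server was started.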
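The pieces of the predict URI can be assembled by a small helper. This is an illustrative sketch that keeps the /v1 prefix fixed; predict_url and its default host and port are assumptions chosen to match the server started earlier.

```python
def predict_url(model_name, version=None, host="localhost", port=8501):
    """Build a TensorFlow Serving REST predict URL."""
    path = "/v1/models/{}".format(model_name)
    if version is not None:
        # Pin the request to one model version instead of the latest
        path += "/versions/{}".format(version)
    return "http://{}:{}{}:predict".format(host, port, path)

print(predict_url("fashion_model"))
print(predict_url("fashion_model", version=1))
```

The first call yields the URL used in the request above; the second adds the /versions/1 segment that pins the request to version 1.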
Response:
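Requesting a specific version
Now let's specify a particular version of the servable. Since there is only one, we choose version 1. We will also look at all three results.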
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']
for i in range(0,3):
    show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
        class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))
Output:
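Each entry of predictions is a list of 10 softmax probabilities, and the image titles are built by taking the index of the largest one. The sketch below shows that post-processing step without NumPy; top_class is a hypothetical helper, and the probability vector is made up for illustration rather than taken from a real server response.

```python
def top_class(probabilities, class_names):
    """Return (index, name, probability) of the most likely class."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index, class_names[index], probabilities[index]

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Made-up softmax output for illustration, not a real server response
fake_prediction = [0.01, 0.0, 0.02, 0.0, 0.01, 0.05, 0.0, 0.1, 0.01, 0.8]
print(top_class(fake_prediction, class_names))
```

Returning the probability alongside the label makes it easy to flag low-confidence predictions on the client side.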