Background

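This article shows how to quickly deploy a trained model with TensorFlow Serving so that it can serve external clients. TensorFlow Serving is commonly run from its Docker image, although this walkthrough installs it natively with apt. For online serving, the official recommendation is the SavedModel format, whereas models deployed to phones and other mobile devices generally use the FrozenGraphDef format.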

We train a neural network to classify images of clothing, such as sneakers and shirts, and then deploy it online with TensorFlow Serving.

Model training

Import the dependencies:

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import subprocess

tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)

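Importing the data

This article uses the Fashion MNIST dataset. It contains 70,000 grayscale images in 10 categories, each at a resolution of 28 × 28 pixels. A sample of the data is shown in the figure below:

Load the data: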

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))

Output:

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
40960/29515 [=========================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
26435584/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
16384/5148 [===============================================================================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step
4431872/4422102 [==============================] - 0s 0us/step

train_images.shape: (60000, 28, 28, 1), of float64
test_images.shape: (10000, 28, 28, 1), of float64
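The scaling and reshaping steps above can be tried in isolation without downloading the dataset. The following is a minimal sketch under stated assumptions: `preprocess` is a hypothetical helper name, the input batch is random synthetic data, and the cast to float32 is an addition (the article's own code yields float64, while the exported signature shown later expects DT_FLOAT).

```python
import numpy as np


def preprocess(images):
    """Scale uint8 pixels to [0, 1] and add a trailing channel axis.

    Hypothetical helper; the float32 cast is an assumption, not part of
    the article's code, which keeps NumPy's default float64.
    """
    scaled = images.astype(np.float32) / 255.0
    # (N, 28, 28) -> (N, 28, 28, 1): Conv2D expects an explicit channel axis.
    return scaled.reshape(scaled.shape[0], 28, 28, 1)


# Synthetic stand-in for a batch of Fashion MNIST images (random pixels).
fake_batch = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)
batch = preprocess(fake_batch)
print(batch.shape, batch.dtype)
```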

Training the model

model = keras.Sequential([
  keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3, 
                      strides=2, activation='relu', name='Conv1'),
  keras.layers.Flatten(),
  keras.layers.Dense(10, activation=tf.nn.softmax, name='Softmax')
])
model.summary()

testing = False
epochs = 5

model.compile(optimizer=tf.train.AdamOptimizer(), 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=epochs)

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))

Output:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Conv1 (Conv2D)               (None, 13, 13, 8)         80        
_________________________________________________________________
flatten (Flatten)            (None, 1352)              0         
_________________________________________________________________
Softmax (Dense)              (None, 10)                13530     
=================================================================
Total params: 13,610
Trainable params: 13,610
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
60000/60000 [==============================] - 8s 140us/sample - loss: 0.5265 - acc: 0.8204
Epoch 2/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.3753 - acc: 0.8688
Epoch 3/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3423 - acc: 0.8788
Epoch 4/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3207 - acc: 0.8856
Epoch 5/5
60000/60000 [==============================] - 6s 94us/sample - loss: 0.3069 - acc: 0.8906
10000/10000 [==============================] - 1s 70us/sample - loss: 0.3464 - acc: 0.8772

Test accuracy: 0.877200007439
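The shapes and parameter counts in the summary above follow from simple arithmetic. As a sketch, assuming the Keras default of 'valid' (no) padding for Conv1, the numbers can be re-derived as follows; `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(size, kernel, stride):
    """Output length of a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1


filters, kernel, stride, in_channels, classes = 8, 3, 2, 1, 10

side = conv_output_size(28, kernel, stride)              # spatial size after Conv1
conv_params = (kernel * kernel * in_channels + 1) * filters  # weights + one bias per filter
flat_units = side * side * filters                       # size after Flatten
dense_params = flat_units * classes + classes            # weights + one bias per class
total_params = conv_params + dense_params

print(side, conv_params, flat_units, dense_params, total_params)
```

These values match the Conv1 output shape (13, 13, 8), the 80 and 13,530 parameter counts, and the 13,610 total reported by model.summary().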

Saving the model

To load the trained model into TensorFlow Serving, it must first be saved in the SavedModel format.

# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors,
# and stored with the default serving key
import tempfile

MODEL_DIR = tempfile.gettempdir()
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print('export_path = {}\n'.format(export_path))
if os.path.isdir(export_path):
  print('\nAlready saved a model, cleaning up\n')
  !rm -r {export_path}

tf.saved_model.simple_save(
    keras.backend.get_session(),
    export_path,
    inputs={'input_image': model.input},
    outputs={t.name:t for t in model.outputs})

print('\nSaved model:')
!ls -l {export_path}

The model is saved to /tmp/1. That directory contains: saved_model.pb variables
The variables directory contains the following 2 files:

variables.data-00000-of-00001  variables.index
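The export path matters because TensorFlow Serving treats each numeric subdirectory of the model base path as one model version and, by default, serves the highest one. The sketch below mimics that discovery step on a throwaway directory; `list_versions` is a hypothetical helper, not part of TensorFlow, and the directory names are made up for illustration.

```python
import os
import tempfile


def list_versions(model_base_path):
    """Return the numeric version subdirectories in ascending order."""
    versions = []
    for name in os.listdir(model_base_path):
        full_path = os.path.join(model_base_path, name)
        # Only directories whose names are plain integers count as versions.
        if name.isdigit() and os.path.isdir(full_path):
            versions.append(int(name))
    return sorted(versions)


with tempfile.TemporaryDirectory() as base:
    # Two fake versions plus an unrelated directory that should be ignored.
    for name in ("1", "2", "notes"):
        os.makedirs(os.path.join(base, name))
    found = list_versions(base)

print(found)
```

Exporting a retrained model to a new directory such as version 2 under the same base path is therefore how a newer version would be published.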

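Inspecting the saved model

Use the saved_model_cli command to inspect the MetaGraphDefs and SignatureDefs in the SavedModel. A SavedModel contains one or more MetaGraphDefs, each identified by its tag-set. To serve a model, you need to know what kind of SignatureDefs each MetaGraphDef contains and what their inputs and outputs are. The show command examines the contents of a SavedModel in hierarchical order.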

!saved_model_cli show --dir {export_path} --all

Result:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_image'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: Conv1_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['Softmax/Softmax:0'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: Softmax/Softmax:0
  Method name is: tensorflow/serving/predict

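Deploying the service with TensorFlow Serving

Installing tensorflow-model-server

The simplest way to install it is with Docker. This article uses a native installation instead.
First add the TensorFlow Serving apt source and update the package list: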

# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo
# You would instead do:
# echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \
# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -

!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt update

Update output:

deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2943  100  2943    0     0   7827      0 --:--:-- --:--:-- --:--:--  7827
OK
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3,012 B]
Get:2 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]
Ign:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
Get:4 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ Packages [70.5 kB]
Get:5 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server-universal amd64 Packages [365 B]
Ign:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
Get:7 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Release [564 B]
Get:8 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release [564 B]
Get:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Release.gpg [819 B]
Get:10 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release.gpg [833 B]
Get:11 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease [21.3 kB]
Get:12 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:13 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 Packages [357 B]
Get:14 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  Packages [113 kB]
Hit:15 http://archive.ubuntu.com/ubuntu bionic InRelease
Get:16 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Packages [19.8 kB]
Get:17 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:18 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic InRelease [15.4 kB]
Get:19 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [9,585 B]
Get:20 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic/main amd64 Packages [31.7 kB]
Get:21 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [769 kB]
Get:22 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:23 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main Sources [1,686 kB]
Get:24 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [959 kB]
Get:25 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [662 kB]
Get:26 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main amd64 Packages [810 kB]
Get:27 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1,288 kB]
Get:28 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [5,230 B]
Get:29 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [8,284 B]
Get:30 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [20.3 kB]
Get:31 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [4,227 B]
Fetched 6,755 kB in 9s (735 kB/s)
Reading package lists... Done
Building dependency tree       
Reading state information... Done
107 packages can be upgraded. Run 'apt list --upgradable' to see them.

Installing TensorFlow Serving

Install the package with apt:

!apt-get install tensorflow-model-server

Installation output:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  tensorflow-model-server
0 upgraded, 1 newly installed, 0 to remove and 107 not upgraded.
Need to get 151 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 1.14.0 [151 MB]
Fetched 151 MB in 3s (46.3 MB/s)
Selecting previously unselected package tensorflow-model-server.
(Reading database ... 131183 files and directories currently installed.)
Preparing to unpack .../tensorflow-model-server_1.14.0_all.deb ...
Unpacking tensorflow-model-server (1.14.0) ...
Setting up tensorflow-model-server (1.14.0) ...

Starting TensorFlow Serving

The server is started below with its REST API enabled (the other interface is gRPC).

%%bash --bg 
nohup tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=fashion_model \
  --model_base_path="${MODEL_DIR}" >server.log 2>&1

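Here --bg runs the cell's process in the background. Note that ${MODEL_DIR} is expanded by the shell, so MODEL_DIR must be exported to the environment beforehand, for example with os.environ["MODEL_DIR"] = MODEL_DIR in the notebook.

View the log with: tail server.log
Log output: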

2019-09-19 02:18:22.039966: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-09-19 02:18:22.041055: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-09-19 02:18:22.054399: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:202] Restoring SavedModel bundle.
2019-09-19 02:18:22.066228: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:311] SavedModel load for tags { serve }; Status: success. Took 27522 microseconds.
2019-09-19 02:18:22.066280: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:103] No warmup data file found at /tmp/1/assets.extra/tf_serving_warmup_requests
2019-09-19 02:18:22.066354: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: fashion_model version: 1}
2019-09-19 02:18:22.067453: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2019-09-19 02:18:22.068189: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 239] RAW: Entering the event loop ...
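Besides reading the log, readiness can also be checked over HTTP: the TensorFlow Serving REST API exposes a model status endpoint at GET /v1/models/<model_name>. The sketch below polls it until a version reports the AVAILABLE state. The helper names, retry count, and delay are assumptions for illustration, and the `fetch` parameter exists only so the logic can be exercised without a running server.

```python
import json
import time
import urllib.request


def model_status_url(name, host="localhost", port=8501):
    """URL of the model status endpoint for the given model name."""
    return "http://{}:{}/v1/models/{}".format(host, port, name)


def fetch_json(url):
    """GET a URL and parse the body as JSON."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_available(status):
    """True if any version in the status payload is in state AVAILABLE."""
    versions = status.get("model_version_status", [])
    return any(v.get("state") == "AVAILABLE" for v in versions)


def wait_until_available(name, fetch=fetch_json, attempts=10, delay=1.0):
    """Poll the status endpoint; return True once the model is served."""
    url = model_status_url(name)
    for _ in range(attempts):
        try:
            if is_available(fetch(url)):
                return True
        except OSError:
            pass  # connection refused or timed out: server not ready yet
        time.sleep(delay)
    return False


# Usage against the server started above (not executed here):
# wait_until_available("fashion_model")
```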

Building the request

(1) Look at the data to be classified:

def show(idx, title):
  plt.figure()
  plt.imshow(test_images[idx].reshape(28,28))
  plt.axis('off')
  plt.title('\n\n{}'.format(title), fontdict={'size': 16})

import random
rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))

Result:

(2) Package the data
Wrap the images above into a request body; the code below packages 3 images:

import json
data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))

Output:

Data: {"instances": [[[[0.0], [0.0], [0.0], [0.0], [0.0] ... 0.0], [0.0]]]], "signature_name": "serving_default"}
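The body above uses the row format of the predict API, in which "instances" holds one entry per example. As a rough sketch, the structure can be built and checked against the (-1, 28, 28, 1) input shape from the signature without NumPy; `build_predict_body` and `has_image_shape` are hypothetical helper names, and the all-zero image is a placeholder rather than real data.

```python
import json


def has_image_shape(instance, height=28, width=28, channels=1):
    """Check that one nested-list instance has shape (height, width, channels)."""
    return (
        len(instance) == height
        and all(len(row) == width for row in instance)
        and all(len(pixel) == channels for row in instance for pixel in row)
    )


def build_predict_body(instances, signature_name="serving_default"):
    """Serialize a batch of nested-list images in the row ("instances") format."""
    for index, instance in enumerate(instances):
        if not has_image_shape(instance):
            raise ValueError("instance {} is not 28x28x1".format(index))
    return json.dumps({"signature_name": signature_name, "instances": instances})


# Placeholder input: a single all-black 28x28x1 image.
blank = [[[0.0] for _ in range(28)] for _ in range(28)]
body = build_predict_body([blank])
parsed = json.loads(body)
print(sorted(parsed.keys()), len(parsed["instances"]))
```

Validating the shape on the client side is optional; the server also rejects mismatched inputs, but a local check gives a clearer error message.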

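(3) Create the REST request
If requests is not installed, install it first with pip install -q requests.
Send the request as a POST to the server's REST endpoint. If no particular servable version is specified, the request goes to the latest version by default.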

import requests
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

# Report the predicted class (name and index) followed by the true class (name and index).
show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
  class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))
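The response to a row-format request is a JSON object whose "predictions" list holds one probability vector per instance; on failure the server returns an object with an "error" key instead. A minimal sketch of decoding such a response follows. `decode_predictions` is a hypothetical helper, and the response text is hand-written for illustration, not real server output.

```python
import json

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def decode_predictions(response_text, names):
    """Return (class index, class name, probability) for each prediction row."""
    payload = json.loads(response_text)
    if "error" in payload:
        raise RuntimeError(payload["error"])
    decoded = []
    for probs in payload["predictions"]:
        # argmax without NumPy: index of the largest probability.
        best = max(range(len(probs)), key=lambda i: probs[i])
        decoded.append((best, names[best], probs[best]))
    return decoded


# Hand-written example response (assumption, not actual model output).
fake_response = json.dumps({"predictions": [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.01, 0.0, 0.04, 0.0, 0.95],
    [0.0, 0.0, 0.9, 0.0, 0.05, 0.0, 0.05, 0.0, 0.0, 0.0],
]})
decoded = decode_predictions(fake_response, class_names)
print(decoded)
```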

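Note: the v1 in the URI is the version of the TensorFlow Serving REST API, not the model version, so it stays v1 no matter how many model versions exist. A specific model version is selected by adding /versions/<N> to the path, as in the next section.
In the URI, models is fixed, and fashion_model is the value passed to --model_name when tensorflow_model_server was started.
Result: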

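Serving a specific version

Now let's specify a particular version of the servable. Since there is only one version, we choose version 1. We also look at all three results.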
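The versioned and unversioned predict URLs differ only in an optional /versions/<N> path segment. A small sketch of assembling both forms is shown here; `predict_url` is a hypothetical helper, and the host and port defaults simply mirror the values used in this article.

```python
def predict_url(model_name, version=None, host="localhost", port=8501):
    """Build the REST predict URL, optionally pinned to one model version."""
    path = "/v1/models/{}".format(model_name)
    if version is not None:
        # Without this segment the server answers with the latest version.
        path += "/versions/{}".format(version)
    return "http://{}:{}{}:predict".format(host, port, path)


latest_url = predict_url("fashion_model")
pinned_url = predict_url("fashion_model", version=1)
print(latest_url)
print(pinned_url)
```

The request that follows writes the versioned URL out by hand.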

headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

# For each image, report the predicted class followed by the true class.
for i in range(0,3):
  show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))

Output:
