This is the tenth article in the tf.keras series. It covers how to save and serialize models with Keras.

Code environment:

python version: 3.7.6
tensorflow version: 2.1.0

Import the required packages:

import numpy as np
import tensorflow as tf
from tensorflow import keras

Note: all code in this article was written and tested in Jupyter Notebook.

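A Keras model consists of multiple components:

An architecture, or configuration, which specifies what layers the model contains and how they are connected.
A set of weight values (the "state of the model").
An optimizer (defined by compiling the model).
A set of losses and metrics (defined by compiling the model, or by calling add_loss() or add_metric()).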

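The Keras API makes it possible to save all of these pieces to disk at once, or to save only some of them selectively:

Saving everything into a single archive in the TensorFlow SavedModel format (or in the older Keras H5 format). This is the standard practice.
Saving only the architecture/configuration, typically as a JSON file.
Saving only the weight values. This is generally used when training the model.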

Saving a Keras model:

model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')

Loading the model back:

from tensorflow import keras
model = keras.models.load_model('path/to/location')

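1. Saving and loading the whole model

You can save an entire model to a single artifact. It will include:

The model's architecture/configuration
The model's weight values
The model's compilation information
The optimizer and its state (so that training can restart where it left off)

Common APIs: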

model.save()
tf.keras.models.save_model()
tf.keras.models.load_model()

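There are two formats for saving an entire model to disk:

TensorFlow SavedModel format (the default, recommended by the official documentation)
Keras H5 format (selected by giving the file name the suffix '.h5')

1.1 TensorFlow SavedModel format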

def get_model():
    # Create a simple model
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

model = get_model()

# Train the model
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Without a file suffix, the TensorFlow SavedModel format is used by default,
# and a folder named my_model is created to hold the saved model
model.save('my_model')

# Load the saved model
reconstructed_model = keras.models.load_model('my_model')

Inspecting the saved model:

! dir my_model  # use this command on Windows
# ! ls my_model  # use this command on Linux

Output:

 Volume in drive D is Local Disk
 Volume Serial Number is A094-EFBB

 Directory of D:\01 TF.Keras Tutorial\my_model

2020/04/30  15:36    <DIR>          .
2020/04/30  15:36    <DIR>          ..
2020/04/30  15:36    <DIR>          assets
2020/04/30  15:36            42,168 saved_model.pb
2020/04/30  15:36    <DIR>          variables
               1 File(s)         42,168 bytes
               4 Dir(s)  223,605,207,040 bytes free

saved_model.pb stores the model architecture and the training configuration (including the optimizer, losses, and metrics); the weights are stored in the variables/ folder.

1.2 Keras H5 format

Keras also supports saving a single HDF5 file that contains the model's architecture, weight values, and compile() information.

model = get_model()

test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

model.save('my_h5_model.h5')
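
The snippet above only saves the file. As a minimal sketch of the full round trip, the standalone example below saves a small model to an H5 file, reloads it, and checks that both models predict the same values. The temporary directory and the variable names are illustrative assumptions, and the h5py package must be installed.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Build and briefly train a small model (same layout as get_model() above).
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
h5_model = keras.Model(inputs, outputs)
h5_model.compile(optimizer='adam', loss='mean_squared_error')

x = np.random.random((128, 32))
y = np.random.random((128, 1))
h5_model.fit(x, y, verbose=0)

# The '.h5' suffix selects the Keras H5 format.
h5_path = os.path.join(tempfile.mkdtemp(), 'my_h5_model.h5')
h5_model.save(h5_path)

# Reload the model and compare predictions on the same inputs.
reloaded = keras.models.load_model(h5_path)
np.testing.assert_allclose(
    h5_model.predict(x, verbose=0),
    reloaded.predict(x, verbose=0),
    rtol=1e-5,
    atol=1e-6,
)
```

If the assertion passes, the architecture and weights survived the save and load.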

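1.3 Comparing the two formats

Compared with the SavedModel format, an H5 file lacks two things:

External losses and metrics added via model.add_loss() and model.add_metric() are not saved. If the model has such losses and metrics and you want to resume training, you need to add them back yourself after loading the model. Note: this does not apply to losses or metrics created inside a layer via self.add_loss() and self.add_metric(). As long as the layer gets loaded, these losses and metrics are kept, because they are part of the layer's call method.
The computation graph of custom objects (such as custom layers) is not included in the saved file. At loading time, Keras needs access to the Python classes/functions of these objects in order to reconstruct the model.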

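2. Saving the model architecture

The model's configuration (or architecture) specifies what layers the model contains and how these layers are connected. If you have the configuration of a model, the model can be re-created with freshly initialized weights and no compilation information.

Note: this only applies to models defined with the Functional API or the Sequential API, not to subclassed models.

2.1 Configuration of a model defined with the Functional API or the Sequential API

APIs:

get_config() and from_config()
model.to_json() and tf.keras.models.model_from_json()

get_config() and from_config()
Calling config = model.get_config() returns a Python dict containing the configuration of the model. The same model can then be reconstructed via Sequential.from_config(config) (for a Sequential model) or Model.from_config(config) (for a Functional API model).

The same workflow also works for any serializable layer.

1. Layer example: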

layer = keras.layers.Dense(3, activation='relu')
layer_config = layer.get_config()
new_layer = keras.layers.Dense.from_config(layer_config)

2. Sequential model example:

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()
new_model = keras.Sequential.from_config(config)

3. Functional model example:

inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config)
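
The examples above rebuild the architecture only. The following sketch, with illustrative variable names, shows what that means in practice: the rebuilt model has weights of the same shapes but freshly initialized values, and the original values have to be copied over explicitly if they are needed.

```python
import numpy as np
from tensorflow import keras

inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
original = keras.Model(inputs, outputs)

rebuilt = keras.Model.from_config(original.get_config())

# Same architecture: every weight array has the same shape.
original_shapes = [w.shape for w in original.get_weights()]
rebuilt_shapes = [w.shape for w in rebuilt.get_weights()]
assert original_shapes == rebuilt_shapes

# The kernel is re-initialized at random, so it generally differs.
kernels_differ = not np.allclose(
    original.get_weights()[0], rebuilt.get_weights()[0]
)

# Copy the values explicitly when the rebuilt model should match.
rebuilt.set_weights(original.get_weights())
np.testing.assert_allclose(
    original.get_weights()[0], rebuilt.get_weights()[0]
)
```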

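APIs:

to_json() and tf.keras.models.model_from_json()

This is similar to get_config / from_config, except that it turns the model into a JSON string, which can then be loaded without the original model class. It is also specific to models and is not meant for layers.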

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()
new_model = keras.models.model_from_json(json_config)
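
Because to_json() returns an ordinary string, it can be inspected or written to a file with the standard json module. The sketch below parses the string and reads the top-level class name; the variable names are illustrative.

```python
import json

from tensorflow import keras

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()

# The result is plain JSON text, so the standard library can parse it.
parsed = json.loads(json_config)
print(parsed['class_name'])

# Rebuild the model from the string (weights are freshly initialized).
new_model = keras.models.model_from_json(json_config)
```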

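2.2 Custom objects

The architecture of subclassed models and layers is defined in the __init__ and call methods. These are treated as Python bytecode, which cannot be serialized into a JSON-compatible config.

To save/load a model with custom layers, or a subclassed model, you should override get_config and, optionally, from_config. In addition, you should register the custom object so that Keras is aware of it.

You could try serializing the bytecode (for example via pickle), but this is completely unsafe and means the model cannot be loaded on a different system.

Custom functions:
Custom functions (for example an activation, loss, or initializer) do not need a get_config method. The function name is enough for loading, as long as it is registered as a custom object.

1. Defining the config methods:

get_config should return a JSON-serializable dict so that it is compatible with the Keras architecture-saving and model-saving APIs.
from_config(config) (a classmethod) should return a new layer or model object created from the config. The default implementation returns cls(**config).

Example: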

class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        # Initialize the base Layer before setting any attributes.
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name='var_a')

    def call(self, inputs, training=False):
        if training:
            return inputs * self.var
        else:
            return inputs
    
    def get_config(self):
        return {'a': self.var.numpy()}

    # There's actually no need to define `from_config` here, since returning
    # `cls(**config)` is the default behavior.
    @classmethod
    def from_config(cls, config):
        return cls(**config)

layer = CustomLayer(5)
layer.var.assign(2)

serialized_layer = keras.layers.serialize(layer)
new_layer = keras.layers.deserialize(serialized_layer, custom_objects={'CustomLayer': CustomLayer})

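2. Registering the custom object
Keras keeps a note of which class generated the config. In the example above, tf.keras.layers.serialize generates a serialized form of the custom layer: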

{'class_name': 'CustomLayer', 'config': {'a': 2} }

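Keras keeps a list of all built-in layer, model, optimizer, and metric classes, which is used to find the correct class on which to call from_config. If the class cannot be found, an error is raised (ValueError: Unknown layer). There are a few ways to register custom classes to this list:

Setting the custom_objects argument in the loading function.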
tf.keras.utils.custom_object_scope 或 tf.keras.utils.CustomObjectScope
tf.keras.utils.register_keras_serializable
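
The lookup described above can be pictured with a small pure-Python sketch. Everything in it (the Scale class, KNOWN_CLASSES, and the deserialize function) is a hypothetical stand-in written for illustration. It is not Keras's actual implementation; it only mirrors the idea of finding a class by name and then calling its from_config.

```python
# Conceptual sketch only: a hypothetical registry, not Keras's real code.
class Scale:
    def __init__(self, factor):
        self.factor = factor

    def get_config(self):
        return {'factor': self.factor}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


KNOWN_CLASSES = {}  # stands in for the list of built-in classes


def deserialize(serialized, custom_objects=None):
    """Look up the class by name, then delegate to its from_config."""
    lookup = dict(KNOWN_CLASSES)
    lookup.update(custom_objects or {})
    class_name = serialized['class_name']
    if class_name not in lookup:
        raise ValueError('Unknown layer: ' + class_name)
    return lookup[class_name].from_config(serialized['config'])


serialized = {'class_name': 'Scale', 'config': Scale(2).get_config()}

try:
    deserialize(serialized)  # fails: 'Scale' is not a known class
except ValueError as err:
    error_message = str(err)

# Supplying the class by name makes the lookup succeed.
restored = deserialize(serialized, custom_objects={'Scale': Scale})
```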

Custom layer and function example:

class CustomLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"units": self.units})
        return config

def custom_activation(x):
  return tf.nn.tanh(x) ** 2


# Make a model with the CustomLayer and custom_activation
inputs = keras.Input((32,))
x = CustomLayer(32)(inputs)
outputs = keras.layers.Activation(custom_activation)(x)
model = keras.Model(inputs, outputs)

# Retrieve the config
config = model.get_config()

# At loading time, register the custom objects with a `custom_object_scope`:
custom_objects = {'CustomLayer': CustomLayer,
                  'custom_activation': custom_activation}
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.Model.from_config(config)
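
The example above registers the objects at loading time with a scope. As a minimal sketch of the decorator route, the standalone example below uses tf.keras.utils.register_keras_serializable so that no custom_objects argument is needed when deserializing. The ScaleLayer class and the package name 'Demo' are illustrative assumptions.

```python
from tensorflow import keras


# Registering the class makes it discoverable by name at loading time.
@keras.utils.register_keras_serializable(package='Demo')
class ScaleLayer(keras.layers.Layer):
    def __init__(self, factor=1.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        config = super().get_config()
        config.update({'factor': self.factor})
        return config


layer = ScaleLayer(factor=3.0)
serialized = keras.layers.serialize(layer)

# No custom_objects argument or scope is passed here.
restored_layer = keras.layers.deserialize(serialized)
```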

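2.3 In-memory model cloning

You can clone a model in memory via tf.keras.models.clone_model(). This is equivalent to getting the config and then re-creating the model from that config (so it preserves neither the compilation information nor the layer weight values).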

with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.models.clone_model(model)
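
Since the clone starts from fresh weights, an exact replica needs one more step. A minimal standalone sketch, with illustrative names and no custom objects involved:

```python
import numpy as np
from tensorflow import keras

inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
source_model = keras.Model(inputs, outputs)

# clone_model re-creates the layers with freshly initialized weights.
cloned = keras.models.clone_model(source_model)

# Copy the weight values explicitly to obtain an exact replica.
cloned.set_weights(source_model.get_weights())

for a, b in zip(source_model.get_weights(), cloned.get_weights()):
    np.testing.assert_allclose(a, b)
```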

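3. Saving and loading model weights

You can choose to save and load only a model's weights. This can be useful when:

You only need the model for inference: in this case you won't need to restart training, so you don't need the compilation information or the optimizer state.
You are doing transfer learning: in this case you will train a new model that reuses the state of an existing model, so you don't need the compilation information of the previous model.

3.1 APIs for in-memory weight transfer

Weights can be copied between different objects by using get_weights and set_weights:

tf.keras.layers.Layer.get_weights(): returns a list of NumPy arrays.
tf.keras.layers.Layer.set_weights(): sets the layer's weights to the values in the weights argument.

1. Transferring weights from one layer to another, in memory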

def create_layer():
    layer = keras.layers.Dense(64, activation='relu', name='dense_2')
    layer.build((None, 784))
    return layer

layer_1 = create_layer()
layer_2 = create_layer()

# Copy the weights from the first layer to the second layer
layer_2.set_weights(layer_1.get_weights())

2. Transferring weights from one model to another model with a compatible architecture, in memory

# Create a simple functional model
inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

# Define a subclassed model with the same architecture
class SubclassedModel(keras.Model):
    def __init__(self, output_dim, name=None):
        super(SubclassedModel, self).__init__(name=name)
        self.output_dim = output_dim
        self.dense_1 = keras.layers.Dense(64, activation='relu', name='dense_1')
        self.dense_2 = keras.layers.Dense(64, activation='relu', name='dense_2')
        self.dense_3 = keras.layers.Dense(output_dim, name='predictions')

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.dense_2(x)
        x = self.dense_3(x)
        return x
    
    def get_config(self):
        return {'output_dim': self.output_dim, 'name': self.name}

subclassed_model = SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(tf.ones((1, 784)))

# Copy weights from functional_model to subclassed_model.
subclassed_model.set_weights(functional_model.get_weights())

assert len(functional_model.weights) == len(subclassed_model.weights)
for a, b in zip(functional_model.weights, subclassed_model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())

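3. The case of stateless layers
Because stateless layers do not change the order or number of weights, models can have compatible architectures even if there are extra or missing stateless layers.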

inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)

# Add a dropout layer, which does not contain any weights.
x = keras.layers.Dropout(.5)(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model_with_dropout = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

functional_model_with_dropout.set_weights(functional_model.get_weights())

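3.2 APIs for saving and loading weights

Weights can be saved to disk by calling model.save_weights, in one of the following formats:

TensorFlow checkpoint
HDF5

The default format for model.save_weights is TensorFlow checkpoint. There are two ways to specify the save format:

save_format argument: set the value to save_format="tf" or save_format="h5".
path argument: if the path ends with .h5 or .hdf5, the HDF5 format is used. Other suffixes result in a TensorFlow checkpoint.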
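
The path rule can be written down as a tiny function. The helper below, infer_weights_format, is hypothetical and not part of Keras; it only restates the suffix rule described above and does not cover how an explicit save_format argument interacts with the path.

```python
# Hypothetical helper that mirrors the suffix rule; not a Keras function.
def infer_weights_format(path):
    """Return 'h5' for .h5/.hdf5 paths, otherwise 'tf' (checkpoint)."""
    if path.endswith('.h5') or path.endswith('.hdf5'):
        return 'h5'
    return 'tf'


formats = {
    name: infer_weights_format(name)
    for name in ['ckpt', 'weights.h5', 'weights.hdf5', 'weights.ckpt']
}
```

With these inputs, only the two names ending in .h5 and .hdf5 map to the HDF5 format.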

There is also the option of retrieving weights as in-memory NumPy arrays. Each API has its pros and cons, which are detailed below.

3.2.1 TF Checkpoint format

sequential_model = keras.Sequential(
    [keras.Input(shape=(784,), name='digits'),
     keras.layers.Dense(64, activation='relu', name='dense_1'), 
     keras.layers.Dense(64, activation='relu', name='dense_2'),
     keras.layers.Dense(10, name='predictions')])

sequential_model.save_weights('ckpt')
load_status = sequential_model.load_weights('ckpt')

# assert_consumed can be used to verify that all variable values have been
# restored from the checkpoint. See tf.train.Checkpoint.restore for the other
# methods of the Status object.
load_status.assert_consumed()

Transfer learning example:
Essentially, as long as two models have the same architecture, they are able to share the same checkpoint.

inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

# Extract a portion of the functional model defined above.
# The following lines produce a new model that excludes the final output
# layer of the functional model.
pretrained = keras.Model(functional_model.inputs, 
                            functional_model.layers[-1].input,
                            name='pretrained_model')
# Randomly assign "trained" weights.
for w in pretrained.weights:
    w.assign(tf.random.normal(w.shape))
pretrained.save_weights('pretrained_ckpt')
pretrained.summary()

# Assume this is a separate program where only 'pretrained_ckpt' exists.
# Create a new functional model with a different output dimension.
inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(5, name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name='new_model')

# Load the weights from pretrained_ckpt into model. 
model.load_weights('pretrained_ckpt')

# Check that all of the pretrained weights have been loaded.
for a, b in zip(pretrained.weights, model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())

print('\n','-'*50)
model.summary()

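It is generally recommended to stick to the same API for building models. If you switch between Sequential and Functional, or between Functional and subclassed, you should always rebuild the pretrained model and load the pretrained weights into that model.

What if the model architectures are quite different? How can weights be saved and loaded into different models? The solution is to use tf.train.Checkpoint to save and restore the exact layers/variables.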
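
tf.train.Checkpoint matches saved values by the attribute names it is given rather than by the overall model structure. The following is a minimal sketch under stated assumptions: the weights of one Dense layer are copied into plain tf.Variable objects, saved under the illustrative names kernel and bias, restored into fresh variables, and then loaded into a layer that could belong to an unrelated model. The layer names and the temporary directory are also illustrative.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorflow import keras

# Layer whose weights should be reused elsewhere.
source_layer = keras.layers.Dense(4, name='source_dense')
source_layer.build((None, 8))
kernel, bias = source_layer.get_weights()

# Save the two arrays under explicit names.
ckpt = tf.train.Checkpoint(kernel=tf.Variable(kernel), bias=tf.Variable(bias))
save_path = ckpt.save(os.path.join(tempfile.mkdtemp(), 'dense_ckpt'))

# Restore into fresh variables; matching is by the names 'kernel' and 'bias'.
restored = tf.train.Checkpoint(
    kernel=tf.Variable(tf.zeros((8, 4))),
    bias=tf.Variable(tf.zeros((4,))),
)
restored.restore(save_path)

# Load the restored values into a layer from a different model.
target_layer = keras.layers.Dense(4, name='target_dense')
target_layer.build((None, 8))
target_layer.set_weights([restored.kernel.numpy(), restored.bias.numpy()])

np.testing.assert_allclose(target_layer.get_weights()[0], kernel)
```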

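3.2.2 HDF5 format

The HDF5 format contains weights grouped by layer name. The weights are lists ordered by concatenating the list of trainable weights with the list of non-trainable weights (the same as layer.weights). Thus, a model can use an HDF5 checkpoint if it has the same layers and trainable statuses as those saved in the checkpoint.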

sequential_model = keras.Sequential(
    [keras.Input(shape=(784,), name='digits'),
     keras.layers.Dense(64, activation='relu', name='dense_1'), 
     keras.layers.Dense(64, activation='relu', name='dense_2'),
     keras.layers.Dense(10, name='predictions')])
sequential_model.save_weights('weights.h5')
sequential_model.load_weights('weights.h5')

Note that changing layer.trainable may result in a different layer.weights ordering when the model contains nested layers.

class NestedDenseLayer(keras.layers.Layer):
    def __init__(self, units, name=None):
        super(NestedDenseLayer, self).__init__(name=name)
        self.dense_1 = keras.layers.Dense(units, name='dense_1')
        self.dense_2 = keras.layers.Dense(units, name='dense_2')

    def call(self, inputs):
        return self.dense_2(self.dense_1(inputs))

nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, 'nested')])
variable_names = [v.name for v in nested_model.weights]
print('variables: {}'.format(variable_names))

print('\nChanging trainable status of one of the nested layers...')
nested_model.get_layer('nested').dense_1.trainable = False

variable_names_2 = [v.name for v in nested_model.weights]
print('\nvariables: {}'.format(variable_names_2))
print('variable ordering changed:', variable_names != variable_names_2)

Output:

variables: ['nested/dense_1/kernel:0', 'nested/dense_1/bias:0', 'nested/dense_2/kernel:0', 'nested/dense_2/bias:0']

Changing trainable status of one of the nested layers...

variables: ['nested/dense_2/kernel:0', 'nested/dense_2/bias:0', 'nested/dense_1/kernel:0', 'nested/dense_1/bias:0']
variable ordering changed: True

Transfer learning example:

def create_functional_model():
    inputs = keras.Input(shape=(784,), name='digits')
    x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
    x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
    outputs = keras.layers.Dense(10, name='predictions')(x)
    return keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

functional_model = create_functional_model()  
functional_model.save_weights('pretrained_weights.h5')

pretrained_model = create_functional_model()
pretrained_model.load_weights('pretrained_weights.h5')

extracted_layers = pretrained_model.layers[:-1]
extracted_layers.append(keras.layers.Dense(5, name='dense_3'))
model = keras.Sequential(extracted_layers)
model.summary()

Reference: https://www.tensorflow.org/guide/keras/save_and_serialize#introduction