TensorFlow 篇 | TensorFlow 2.x 基于 Keras 模型的本地训练与评估

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2e/2e4058b36141b9f93901b5367662fbbe.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"「"},{"type":"text","marks":[{"type":"strong"}],"text":"导语"},{"type":"text","text":"」模型的训练与评估是整个机器学习任务流程的核心环节。只有掌握了正确的训练与评估方法,并灵活使用,才能使我们更加快速地进行实验分析与验证,从而对模型有更加深刻的理解。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"前言"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上一篇 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型构建的文章中,我们介绍了在 "},{"type":"codeinline","content":[{"type":"text","text":"TensorFlow 2.x"}]},{"type":"text","text":" 版本中使用 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 构建模型的三种方法,那么本篇将在上一篇的基础上着重介绍使用 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型进行本地训练、评估以及预测的流程和方法。 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型有两种训练评估的方式,一种方式是使用模型内置 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" ,如 "},{"type":"codeinline","content":[{"type":"text","text":"model.fit()"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"model.evaluate()"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"model.predict()"}]},{"type":"text","text":" 等分别执行不同的操作;另一种方式是利用即时执行策略 ("},{"type":"codeinline","content":[{"type":"text","text":"eager execution"}]},{"type":"text","text":") 以及 "},{"type":"codeinline","content":[{"type":"text","text":"GradientTape"}]},{"type":"text","text":" 对象自定义训练和评估流程。对所有 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型来说这两种方式都是按照相同的原理来工作的,没有本质上的区别。在一般情况下,我们更愿意使用第一种训练评估方式,因为它更为简单,更易于使用,而在一些特殊的情况下,我们可能会考虑使用自定义的方式来完成训练与评估。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"内置 API 进行训练评估"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"端到端完整示例"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下面介绍使用模型内置 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" 实现的一个端到端的训练评估示例,可以认为要使用该模型去解决一个多分类问题。这里使用了"},{"type":"codeinline","content":[{"type":"text","text":"函数式 API"}]},{"type":"text","text":" 来构建 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型,当然也可以使用 "},{"type":"codeinline","content":[{"type":"text","text":"Sequential"}]},{"type":"text","text":" 方式以及子类化方式去定义模型。示例代码如下所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import tensorflow as tf\nfrom tensorflow import keras\nfrom tensorflow.keras import layers\nimport numpy as np\n\n# Train and Test data from numpy array.\nx_train, y_train = (\n np.random.random((60000, 784)),\n np.random.randint(10, size=(60000, 1)),\n)\nx_test, y_test = (\n np.random.random((10000, 784)),\n np.random.randint(10, size=(10000, 1)),\n)\n\n# Reserve 10,000 samples for validation.\nx_val = x_train[-10000:]\ny_val = y_train[-10000:]\nx_train = x_train[:-10000]\ny_train = y_train[:-10000]\n\n# Model Create\ninputs = keras.Input(shape=(784, ), name='digits')\nx = layers.Dense(64, activation='relu', name='dense_1')(inputs)\nx = layers.Dense(64, activation='relu', name='dense_2')(x)\noutputs = layers.Dense(10, name='predictions')(x)\nmodel = keras.Model(inputs=inputs, outputs=outputs)\n\n# Model Compile.\nmodel.compile(\n # Optimizer\n optimizer=keras.optimizers.RMSprop(),\n # Loss function to minimize\n loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n # List of metrics to monitor\n metrics=['sparse_categorical_accuracy'],\n)\n\n# Model Training.\nprint('# Fit model on training data')\nhistory = model.fit(\n x_train,\n y_train,\n batch_size=64,\n epochs=3,\n # We pass some validation for monitoring validation loss and metrics\n # at the end of each epoch\n validation_data=(x_val, y_val),\n)\nprint('\\nhistory dict:', history.history)\n\n# Model Evaluate.\nprint('\\n# Evaluate on test data')\nresults = model.evaluate(x_test, y_test, batch_size=128)\nprint('test loss, test acc:', results)\n\n# Generate predictions (probabilities -- the output of the last layer)\n# Model Predict.\nprint('\\n# Generate predictions for 3 samples')\npredictions = model.predict(x_test[:3])\nprint('predictions shape:', predictions.shape)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"从代码中可以看到,要完成模型的训练与评估的整体流程,首先要构建好模型;然后要对模型进行编译 ("},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":"),目的是指定模型训练过程中需要用到的优化器 ("},{"type":"codeinline","content":[{"type":"text","text":"optimizer"}]},{"type":"text","text":"),损失函数 ("},{"type":"codeinline","content":[{"type":"text","text":"losses"}]},{"type":"text","text":") 以及评估指标 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":") ;接着开始进行模型的训练与交叉验证 ("},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":"),此步骤需要提前指定好训练数据和验证数据,并设置好一些参数如 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 等才能继续,交叉验证操作会在每轮 ("},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":") 训练结束后自动触发;最后是模型评估 ("},{"type":"codeinline","content":[{"type":"text","text":"evaluate"}]},{"type":"text","text":") 与预测 ("},{"type":"codeinline","content":[{"type":"text","text":"predict"}]},{"type":"text","text":"),我们会根据评估与预测结果来判断模型的好坏。这样一个完整的模型训练与评估流程就完成了,下面来对示例里的一些实现细节进行展开讲解。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"模型编译 (compile)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"在模型训练之前首先要进行模型编译,因为只有知道了要优化什么目标,如何进行优化以及要关注什么指标,模型才能被正确的训练与调整。 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法包含三个主要参数,一个是待优化的损失 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":") ,它指明了要优化的目标,一个是优化器 ("},{"type":"codeinline","content":[{"type":"text","text":"optimizer"}]},{"type":"text","text":"),它指明了目标优化的方向,还有一个可选的指标项 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":"),它指明了训练过程中要关注的模型指标。 "},{"type":"codeinline","content":[{"type":"text","text":"Keras API"}]},{"type":"text","text":" 中已经包含了许多内置的损失函数,优化器以及指标,可以拿来即用,能够满足大多数的训练需要。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"损失函数类主要在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.losses"}]},{"type":"text","text":" 模块下,其中包含了多种预定义的损失,比如我们常用的二分类损失 "},{"type":"codeinline","content":[{"type":"text","text":"BinaryCrossentropy"}]},{"type":"text","text":" ,多分类损失 "},{"type":"codeinline","content":[{"type":"text","text":"CategoricalCrossentropy"}]},{"type":"text","text":" 以及均方根损失 "},{"type":"codeinline","content":[{"type":"text","text":"MeanSquaredError"}]},{"type":"text","text":" 等。传递给 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 的参数既可以是一个字符串如 "},{"type":"codeinline","content":[{"type":"text","text":"binary_crossentropy"}]},{"type":"text","text":" 也可以是对应的 "},{"type":"codeinline","content":[{"type":"text","text":"losses"}]},{"type":"text","text":" 实例如 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.losses.BinaryCrossentropy()"}]},{"type":"text","text":" ,当我们需要设置损失函数的一些参数时(比如上例中 "},{"type":"codeinline","content":[{"type":"text","text":"from_logits=True"}]},{"type":"text","text":"),则需要使用实例参数。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"优化器类主要在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.optimizers"}]},{"type":"text","text":" 模块下,一些常用的优化器如 "},{"type":"codeinline","content":[{"type":"text","text":"SGD"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"Adam"}]},{"type":"text","text":" 以及 "},{"type":"codeinline","content":[{"type":"text","text":"RMSprop"}]},{"type":"text","text":" 等均包含在内。同样它也可以通过字符串或者实例的方式传给 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法,一般我们需要设置的优化器参数主要为学习率 ("},{"type":"codeinline","content":[{"type":"text","text":"learning rate"}]},{"type":"text","text":") ,其他的参数可以参考每个优化器的具体实现来动态设置,或者直接使用其默认值即可。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"指标类主要在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.metrics"}]},{"type":"text","text":" 模块下,二分类里常用的 "},{"type":"codeinline","content":[{"type":"text","text":"AUC"}]},{"type":"text","text":" 指标以及 "},{"type":"codeinline","content":[{"type":"text","text":"lookalike"}]},{"type":"text","text":" 里常用的召回率 ("},{"type":"codeinline","content":[{"type":"text","text":"Recall"}]},{"type":"text","text":") 指标等均有包含。同理,它也可以以字符串或者实例的形式传递给 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法,注意 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法接收的是一个 "},{"type":"codeinline","content":[{"type":"text","text":"metric"}]},{"type":"text","text":" 列表,所以可以传递多个指标信息。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"当然如果 "},{"type":"codeinline","content":[{"type":"text","text":"losses"}]},{"type":"text","text":" 模块下的损失或 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":" 模块下的指标不满足你的需求,也可以自定义它们的实现。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.1. 对于自定义损失,有两种方式,一种是定义一个损失函数,它接收两个输入参数 "},{"type":"codeinline","content":[{"type":"text","text":"y_true"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"y_pred"}]},{"type":"text","text":" ,然后在函数内部计算损失并返回。代码如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"def basic_loss_function(y_true, y_pred):\n return tf.math.reduce_mean(tf.abs(y_true - y_pred))\n\nmodel.compile(optimizer=keras.optimizers.Adam(), loss=basic_loss_function)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.2. 如果你需要的损失函数不仅仅包含上述两个参数,则可以采用另外一种子类化的方式来实现。定义一个类继承自 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.losses.Loss"}]},{"type":"text","text":" 类,并实现其 "},{"type":"codeinline","content":[{"type":"text","text":"__init__(self)"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"call(self, y"},{"type":"text","marks":[{"type":"italic"}],"text":"true, y"},{"type":"text","text":"pred)"}]},{"type":"text","text":" 方法,这种实现方式与子类化层和模型比较相似。比如要实现一个加权的二分类交叉熵损失,其代码如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"class WeightedBinaryCrossEntropy(keras.losses.Loss):\n \"\"\"\n Args:\n pos_weight: Scalar to affect the positive labels of the loss function.\n weight: Scalar to affect the entirety of the loss function.\n from_logits: Whether to compute loss from logits or the probability.\n reduction: Type of tf.keras.losses.Reduction to apply to loss.\n name: Name of the loss function.\n \"\"\"\n def __init__(self,\n pos_weight,\n weight,\n from_logits=False,\n reduction=keras.losses.Reduction.AUTO,\n name='weighted_binary_crossentropy'):\n super().__init__(reduction=reduction, name=name)\n self.pos_weight = pos_weight\n self.weight = weight\n self.from_logits = from_logits\n\n def call(self, y_true, y_pred):\n ce = tf.losses.binary_crossentropy(\n y_true,\n y_pred,\n from_logits=self.from_logits,\n )[:, None]\n ce = self.weight * (ce * (1 - y_true) + self.pos_weight * ce * y_true)\n return ce\n\nmodel.compile(\n optimizer=keras.optimizers.Adam(),\n loss=WeightedBinaryCrossEntropy(\n pos_weight=0.5,\n weight=2,\n from_logits=True,\n ),\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.3. 对于自定义指标,也可以通过子类化的方式来实现,首先定义一个指标类继承自 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.metrics.Metric"}]},{"type":"text","text":" 类并实现其四个方法,分别是 "},{"type":"codeinline","content":[{"type":"text","text":"__init__(self)"}]},{"type":"text","text":" 方法,用来创建状态变量, "},{"type":"codeinline","content":[{"type":"text","text":"update_state(self, y_true, y_pred, sample_weight=None)"}]},{"type":"text","text":" 方法,用来更新状态变量, "},{"type":"codeinline","content":[{"type":"text","text":"result(self)"}]},{"type":"text","text":" 方法,用来返回状态变量的最终结果, 以及 "},{"type":"codeinline","content":[{"type":"text","text":"reset_states(self)"}]},{"type":"text","text":" 方法,用来重新初始化状态变量。比如要实现一个多分类中"},{"type":"codeinline","content":[{"type":"text","text":"真正例 (True Positives) 数量"}]},{"type":"text","text":"的统计指标,其代码如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"class CategoricalTruePositives(keras.metrics.Metric):\n def __init__(self, name='categorical_true_positives', **kwargs):\n super().__init__(name=name, **kwargs)\n self.true_positives = self.add_weight(name='tp', initializer='zeros')\n\n def update_state(self, y_true, y_pred, sample_weight=None):\n y_pred = tf.reshape(tf.argmax(y_pred, axis=1), shape=(-1, 1))\n values = tf.cast(y_true, 'int32') == tf.cast(y_pred, 'int32')\n values = tf.cast(values, 'float32')\n if sample_weight is not None:\n sample_weight = tf.cast(sample_weight, 'float32')\n values = tf.multiply(values, sample_weight)\n self.true_positives.assign_add(tf.reduce_sum(values))\n\n def result(self):\n return self.true_positives\n\n def reset_states(self):\n # The state of the metric will be reset at the start of each epoch.\n self.true_positives.assign(0.)\n\nmodel.compile(\n optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),\n loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n metrics=[CategoricalTruePositives()],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.4. 对于一些在层 ("},{"type":"codeinline","content":[{"type":"text","text":"layers"}]},{"type":"text","text":") 内部定义的损失,可以通过在自定义层的 "},{"type":"codeinline","content":[{"type":"text","text":"call"}]},{"type":"text","text":" 方法里调用 "},{"type":"codeinline","content":[{"type":"text","text":"self.add_loss()"}]},{"type":"text","text":" 来实现,而且在模型训练时,它会自动添加到整体的损失中,不用人为干预。通过对比加入自定义损失前后模型训练输出的 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 值的变化来确认这部分损失是否被加入到了整体的损失中。还可以在 "},{"type":"codeinline","content":[{"type":"text","text":"build"}]},{"type":"text","text":" 模型后,打印 "},{"type":"codeinline","content":[{"type":"text","text":"model.losses"}]},{"type":"text","text":" 来查看该模型的所有损失。注意正则化损失是内置在 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 的所有层中的,只需要在调用层时加入相应正则化参数即可,无需在 "},{"type":"codeinline","content":[{"type":"text","text":"call"}]},{"type":"text","text":" 方法中 "},{"type":"codeinline","content":[{"type":"text","text":"add_loss()"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.5. 对于指标信息来说,可以在自定义层的 "},{"type":"codeinline","content":[{"type":"text","text":"call"}]},{"type":"text","text":" 方法里调用 "},{"type":"codeinline","content":[{"type":"text","text":"self.add_metric()"}]},{"type":"text","text":" 来新增指标,同样的,它也会自动出现在整体的指标中,无需人为干预。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.6. "},{"type":"codeinline","content":[{"type":"text","text":"函数式 API"}]},{"type":"text","text":" 实现的模型,可以通过调用 "},{"type":"codeinline","content":[{"type":"text","text":"model.add_loss()"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"model.add_metric()"}]},{"type":"text","text":" 来实现与自定义模型同样的效果。示例代码如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import tensorflow as tf\nfrom tensorflow import keras\nfrom tensorflow.keras import layers\n\ninputs = keras.Input(shape=(784, ), name='digits')\nx1 = layers.Dense(64, activation='relu', name='dense_1')(inputs)\nx2 = layers.Dense(64, activation='relu', name='dense_2')(x1)\noutputs = layers.Dense(10, name='predictions')(x2)\nmodel = keras.Model(inputs=inputs, outputs=outputs)\n\nmodel.add_loss(tf.reduce_sum(x1) * 0.1)\n\nmodel.add_metric(\n keras.backend.std(x1),\n name='std_of_activation',\n aggregation='mean',\n)\n\nmodel.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n)\nmodel.fit(x_train, y_train, batch_size=64, epochs=1)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"6","normalizeStart":"6"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"如果要编译的是多输入多输出模型,则可以为每一个输出指定不同的损失函数以及不同的指标,后面会详细介绍。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"模型训练与验证 (fit)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"模型的训练通过调用 "},{"type":"codeinline","content":[{"type":"text","text":"model.fit()"}]},{"type":"text","text":" 方法来实现, "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法包括训练数据与验证数据参数,它们可以是 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型数据,也可以是 "},{"type":"codeinline","content":[{"type":"text","text":"tf.data"}]},{"type":"text","text":" 模块下 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型的数据。另外 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法还包括 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 以及 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch"}]},{"type":"text","text":" 等控制训练流程的参数,并且还可以通过 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 参数控制模型在训练过程中执行一些其它的操作,如 "},{"type":"codeinline","content":[{"type":"text","text":"Tensorboard"}]},{"type":"text","text":" 日志记录等。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"模型的训练和验证数据可以是 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型数据,最开始的端到端示例即是采用 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 数组作为输入。一般在数据量较小且内存能容下的情况下采用 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 数据作为训练和评估的数据输入。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.1. 对于 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型数据来说,如果指定了 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 参数,则训练数据的总量为"},{"type":"codeinline","content":[{"type":"text","text":"原始样本数量 * epochs"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.2. 默认情况下一轮训练 ("},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":") 所有的原始样本都会被训练一遍,下一轮训练还会使用这些样本数据进行训练,每一轮执行的步数 ("},{"type":"codeinline","content":[{"type":"text","text":"steps"}]},{"type":"text","text":") 为"},{"type":"codeinline","content":[{"type":"text","text":"原始样本数量/batch_size"}]},{"type":"text","text":" ,如果 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 不指定,默认为 "},{"type":"codeinline","content":[{"type":"text","text":"32"}]},{"type":"text","text":" 。交叉验证在"},{"type":"codeinline","content":[{"type":"text","text":"每一轮训练结束后"}]},{"type":"text","text":"触发,并且也会在所有验证样本上执行一遍,可以指定 "},{"type":"codeinline","content":[{"type":"text","text":"validation"},{"type":"text","marks":[{"type":"italic"}],"text":"batch"},{"type":"text","text":"size"}]},{"type":"text","text":" 来控制验证数据的 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 大小,如果不指定默认同 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.3. 对于 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型数据来说,如果设置了 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch"}]},{"type":"text","text":" 参数,表示一轮要训练指定的步数,下一轮会在上轮基础上使用下一个 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 的数据继续进行训练,直到所有的 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 结束或者"},{"type":"codeinline","content":[{"type":"text","text":"训练数据的总量"}]},{"type":"text","text":"被耗尽。要想训练流程不因数据耗尽而结束,则需要保证"},{"type":"codeinline","content":[{"type":"text","text":"数据的总量"}]},{"type":"text","text":"要大于 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch "},{"type":"text","marks":[{"type":"italic"}],"text":" epochs "},{"type":"text","text":" batch_size"}]},{"type":"text","text":"。同理也可以设置 "},{"type":"codeinline","content":[{"type":"text","text":"validation_steps"}]},{"type":"text","text":" ,表示交叉验证所需步数,此时要注意验证集的数据总量要大于 "},{"type":"codeinline","content":[{"type":"text","text":"validation"},{"type":"text","marks":[{"type":"italic"}],"text":"steps * validation"},{"type":"text","text":"batch_size"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.4. "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法还提供了另外一个参数 "},{"type":"codeinline","content":[{"type":"text","text":"validation_split"}]},{"type":"text","text":" 来自动从训练数据集中保留一定比例的数据作为验证,该参数取值为 "},{"type":"codeinline","content":[{"type":"text","text":"0-1"}]},{"type":"text","text":" 之间,比如 "},{"type":"codeinline","content":[{"type":"text","text":"0.2"}]},{"type":"text","text":" 代表 "},{"type":"codeinline","content":[{"type":"text","text":"20%"}]},{"type":"text","text":" 的训练集用来做验证, "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法会默认保留 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 数组最后面 "},{"type":"codeinline","content":[{"type":"text","text":"20%"}]},{"type":"text","text":" 的样本作为验证集。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"TensorFlow 2.0"}]},{"type":"text","text":" 之后,更为推荐的是使用 "},{"type":"codeinline","content":[{"type":"text","text":"tf.data"}]},{"type":"text","text":" 模块下 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型的数据作为训练和验证的数据输入,它能以更加快速以及可扩展的方式加载和预处理数据。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.1. 使用 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 进行训练的代码如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))\n# Shuffle and slice the dataset.\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)\n\n# Prepare the validation dataset\nval_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))\nval_dataset = val_dataset.batch(64)\n\n# Now we get a test dataset.\ntest_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))\ntest_dataset = test_dataset.batch(64)\n\n# Since the dataset already takes care of batching,\n# we don't pass a `batch_size` argument.\nmodel.fit(train_dataset, epochs=3, validation_data=val_dataset)\nresult = model.evaluate(test_dataset)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.2. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 一般是一个二元组,第一个元素为模型的输入特征,如果为多输入就是多个特征的字典 ("},{"type":"codeinline","content":[{"type":"text","text":"dict"}]},{"type":"text","text":") 或元组 ("},{"type":"codeinline","content":[{"type":"text","text":"tuple"}]},{"type":"text","text":"),第二个元素是真实的数据标签 ("},{"type":"codeinline","content":[{"type":"text","text":"label"}]},{"type":"text","text":") ,即 "},{"type":"codeinline","content":[{"type":"text","text":"ground truth"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.3. 使用 "},{"type":"codeinline","content":[{"type":"text","text":"from_tensor_slices"}]},{"type":"text","text":" 方法可以从 "},{"type":"codeinline","content":[{"type":"text","text":"nunpy"}]},{"type":"text","text":" 数组直接生成 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型数据,是一种比较方便快捷的生成方式,一般在测试时使用。其它较为常用的生成方式,比如从 "},{"type":"codeinline","content":[{"type":"text","text":"TFRecord"}]},{"type":"text","text":" 文件或文本文件 ("},{"type":"codeinline","content":[{"type":"text","text":"TextLine"}]},{"type":"text","text":") 中生成 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,可以参考 "},{"type":"codeinline","content":[{"type":"text","text":"tf.data"}]},{"type":"text","text":" 模块下的相关类的具体实现。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.4. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 可以调用内置方法提前对数据进行预处理,比如数据打乱 ("},{"type":"codeinline","content":[{"type":"text","text":"shuffle"}]},{"type":"text","text":"), "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 以及 "},{"type":"codeinline","content":[{"type":"text","text":"repeat"}]},{"type":"text","text":" 等操作。"},{"type":"codeinline","content":[{"type":"text","text":"shuffle"}]},{"type":"text","text":" 操作是为了减小模型过拟合的机率,它仅为小范围打乱,需要借助于一个缓存区,先将数据填满,然后在每次训练时从缓存区里随机抽取 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 条数据,产生的空缺用后面的数据填充,从而实现了局部打乱的效果。"},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 是对数据进行分批次,常用于控制和调节模型的训练速度以及训练效果,因为在 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 中已经 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 过,所以 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法中的 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 就无需再提供了。"},{"type":"codeinline","content":[{"type":"text","text":"repeat"}]},{"type":"text","text":" 用来对数据进行复制,以解决数据量不足的问题,如果指定了其参数 "},{"type":"codeinline","content":[{"type":"text","text":"count"}]},{"type":"text","text":",则表示整个数据集要复制 "},{"type":"codeinline","content":[{"type":"text","text":"count"}]},{"type":"text","text":" 次,不指定就会"},{"type":"codeinline","content":[{"type":"text","text":"无限次复制"}]},{"type":"text","text":" ,此时必须要设置 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch"}]},{"type":"text","text":" 参数,不然训练无法终止。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.5. 上述例子中, "},{"type":"codeinline","content":[{"type":"text","text":"train dataset"}]},{"type":"text","text":" 的全部数据在每一轮都会被训练到,因为一轮训练结束后, "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 会重置,然后被用来重新训练。但是当指定了 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch"}]},{"type":"text","text":" 之后, "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 在每轮训练后不会被重置,一直到所有 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 结束或所有的训练数据被消耗完之后终止,要想训练正常结束,须保证提供的训练数据总量要大于 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch "},{"type":"text","marks":[{"type":"italic"}],"text":" epochs "},{"type":"text","text":" batch_size"}]},{"type":"text","text":"。同理也可以指定 "},{"type":"codeinline","content":[{"type":"text","text":"validation_steps"}]},{"type":"text","text":" ,此时数据验证会执行指定的步数,在下次验证开始时, "},{"type":"codeinline","content":[{"type":"text","text":"validation dataset"}]},{"type":"text","text":" 会被重置,以保证每次交叉验证使用的都是相同的数据。"},{"type":"codeinline","content":[{"type":"text","text":"validation_split"}]},{"type":"text","text":" 参数不适用于 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型数据,因为它需要知道每个数据样本的索引,这在 "},{"type":"codeinline","content":[{"type":"text","text":"dataset API"}]},{"type":"text","text":" 下很难实现。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.6. 当不指定 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch"}]},{"type":"text","text":" 参数时, "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型数据与 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型数据的处理流程完全一致。但当指定之后,要注意它们之间在处理上的差异。对于 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型数据而言,在处理时,它会被转为 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型数据,只不过这个 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 被 "},{"type":"codeinline","content":[{"type":"text","text":"repeat"}]},{"type":"text","text":" 了 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 次,而且每轮训练结束后,这个 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 不会被重置,会在上次的 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 之后继续训练。假设原始数据量为 "},{"type":"codeinline","content":[{"type":"text","text":"n"}]},{"type":"text","text":" ,指定 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch"}]},{"type":"text","text":" 参数之后,两者的差异主要体现在真实的训练数据量上, "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 为 "},{"type":"codeinline","content":[{"type":"text","text":"n * epochs"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 为 "},{"type":"codeinline","content":[{"type":"text","text":"n"}]},{"type":"text","text":"。具体细节可以参考源码实现。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.7. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 还有 "},{"type":"codeinline","content":[{"type":"text","text":"map"}]},{"type":"text","text":" 与 "},{"type":"codeinline","content":[{"type":"text","text":"prefetch"}]},{"type":"text","text":" 方法比较实用。 "},{"type":"codeinline","content":[{"type":"text","text":"map"}]},{"type":"text","text":" 方法接收一个"},{"type":"codeinline","content":[{"type":"text","text":"函数"}]},{"type":"text","text":"作为参数,用来对 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 中的每一条数据进行处理并返回一个新的 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,比如我们在使用 "},{"type":"codeinline","content":[{"type":"text","text":"TextLineDataset"}]},{"type":"text","text":" 读取文本文件后生成了一个 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,而我们要抽取输入数据中的某些列作为特征 ("},{"type":"codeinline","content":[{"type":"text","text":"features"}]},{"type":"text","text":"),某些列作为标签 ("},{"type":"codeinline","content":[{"type":"text","text":"labels"}]},{"type":"text","text":"),此时就会用到 "},{"type":"codeinline","content":[{"type":"text","text":"map"}]},{"type":"text","text":" 方法。"},{"type":"codeinline","content":[{"type":"text","text":"prefetch"}]},{"type":"text","text":" 方法预先从 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 中准备好下次训练所需的数据并放于内存中,这样可以减少每轮训练之间的延迟等待时间。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"除了训练数据和验证数据外,还可以向 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法传递样本权重 ("},{"type":"codeinline","content":[{"type":"text","text":"sample_weight"}]},{"type":"text","text":") 以及类别权重 ("},{"type":"codeinline","content":[{"type":"text","text":"class_weight"}]},{"type":"text","text":") 参数。这两个参数通常被用于处理分类不平衡问题,通过给类别少的样本赋予更高的权重,使得各个类别对整体损失的贡献趋于一致。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 4.1. 对于 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型的输入数据,可以使用上述两个参数,以上面的多分类问题为例,如果要给分类 "},{"type":"codeinline","content":[{"type":"text","text":"5"}]},{"type":"text","text":" 一个更高的权重,可以使用如下代码来实现:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import numpy as np\n\n# Here's the same example using `class_weight`\nclass_weight = {0: 1., 1: 1., 2: 1., 3: 1., 4: 1.,\n # Set weight \"2\" for class \"5\",\n # making this class 2x more important\n 5: 2.,\n 6: 1., 7: 1., 8: 1., 9: 1.}\nprint('Fit with class weight')\nmodel.fit(x_train, y_train, class_weight=class_weight, batch_size=64, epochs=4)\n\n# Here's the same example using `sample_weight` instead:\nsample_weight = np.ones(shape=(len(y_train), ))\nsample_weight[y_train == 5] = 2.\nprint('\\nFit with sample weight')\n\nmodel.fit(\n x_train,\n y_train,\n sample_weight=sample_weight,\n batch_size=64,\n epochs=4,\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 4.2. 而对于 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型的输入数据来说,不能直接使用上述两个参数,需要在构建 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 时将 "},{"type":"codeinline","content":[{"type":"text","text":"sample_weight"}]},{"type":"text","text":" 加入其中,返回一个三元组的 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,格式为 "},{"type":"codeinline","content":[{"type":"text","text":"(input_batch, target_batch, sample"},{"type":"text","marks":[{"type":"italic"}],"text":"weight"},{"type":"text","text":"batch)"}]},{"type":"text","text":" 。示例代码如下所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"sample_weight = np.ones(shape=(len(y_train), ))\nsample_weight[y_train == 5] = 2.\n\n# Create a Dataset that includes sample weights\n# (3rd element in the return tuple).\ntrain_dataset = tf.data.Dataset.from_tensor_slices((\n x_train,\n y_train,\n sample_weight,\n))\n\n# Shuffle and slice the dataset.\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)\n\nmodel.fit(train_dataset, epochs=3)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"在模型的训练过程中有一些特殊时间点,比如在一个 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 结束或者一个 "},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":" 结束时,一般都会做一些额外的处理操作来辅助我们进行训练,上面介绍过的模型交叉验证就是其中之一。还有一些其它的操作,比如当模型训练停滞不前时 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 值在某一值附近不断波动),自动减小其学习速率 ("},{"type":"codeinline","content":[{"type":"text","text":"learning rate"}]},{"type":"text","text":") 以使损失继续下降,从而得到更好的收敛效果;在训练过程中保存模型的权重信息,以备重启模型时可以在已有权重的基础上继续训练,从而减少训练时间;还有在每轮的训练结束后记录模型的损失 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":") 和指标 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":") 信息,以供 "},{"type":"codeinline","content":[{"type":"text","text":"Tensorboard"}]},{"type":"text","text":" 分析使用等等,这些操作都是模型训练过程中不可或缺的部分。它们都可以通过回调函数 ("},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":") 的方式来实现,这些回调函数都在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.callbacks"}]},{"type":"text","text":" 模块下,可以将它们作为列表参数传递给 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法以达到不同的操作目的。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.1. 下面以 "},{"type":"codeinline","content":[{"type":"text","text":"EarlyStopping"}]},{"type":"text","text":" 为例说明 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 的使用方式。本例中,当交叉验证损失 "},{"type":"codeinline","content":[{"type":"text","text":"val_loss"}]},{"type":"text","text":" 至少在 "},{"type":"codeinline","content":[{"type":"text","text":"2"}]},{"type":"text","text":" 轮 ("},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":") 训练中的减少值都低于 "},{"type":"codeinline","content":[{"type":"text","text":"1e-2"}]},{"type":"text","text":" 时,我们会提前停止训练。其示例代码如下所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"callbacks = [\n keras.callbacks.EarlyStopping(\n # Stop training when `val_loss` is no longer improving\n monitor='val_loss',\n # \"no longer improving\" being defined as \"no better than 1e-2 less\"\n min_delta=1e-2,\n # \"no longer improving\" being further defined as \"for at least 2 epochs\"\n patience=2,\n verbose=1,\n )\n]\n\nmodel.fit(\n x_train,\n y_train,\n epochs=20,\n batch_size=64,\n callbacks=callbacks,\n validation_split=0.2,\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.2. 一些比较常用的 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 需要了解并掌握, 如 "},{"type":"codeinline","content":[{"type":"text","text":"ModelCheckpoint"}]},{"type":"text","text":" 用来保存模型权重信息, "},{"type":"codeinline","content":[{"type":"text","text":"TensorBoard"}]},{"type":"text","text":" 用来记录一些指标信息, "},{"type":"codeinline","content":[{"type":"text","text":"ReduceLROnPlateau"}]},{"type":"text","text":" 用来在模型停滞时减小学习率。更多的 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 函数可以参考 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.callbacks"}]},{"type":"text","text":" 模块下的实现。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.3. 当然也可以自定义 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 类,该子类需要继承自 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.callbacks.Callback"}]},{"type":"text","text":" 类,并按需实现其内置的方法,比如如果需要在每个 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 训练结束后记录 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 的值,则可以使用如下代码实现:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"class LossHistory(keras.callbacks.Callback):\n def on_train_begin(self, logs):\n self.losses = []\n\n def on_batch_end(self, batch, logs):\n self.losses.append(logs.get('loss'))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.4. 在 "},{"type":"codeinline","content":[{"type":"text","text":"TensorFlow 2.0"}]},{"type":"text","text":" 之前, "},{"type":"codeinline","content":[{"type":"text","text":"ModelCheckpoint"}]},{"type":"text","text":" 内容和 "},{"type":"codeinline","content":[{"type":"text","text":"TensorBoard"}]},{"type":"text","text":" 内容是同时记录的,保存在相同的文件夹下,而在 "},{"type":"codeinline","content":[{"type":"text","text":"2.0"}]},{"type":"text","text":" 之后的 "},{"type":"codeinline","content":[{"type":"text","text":"keras API"}]},{"type":"text","text":" 中它们可以通过不同的回调函数分开指定。记录的日志文件中,含有 "},{"type":"codeinline","content":[{"type":"text","text":"checkpoint"}]},{"type":"text","text":" 关键字的文件一般为检查点文件,含有 "},{"type":"codeinline","content":[{"type":"text","text":"events.out.tfevents"}]},{"type":"text","text":" 关键字的文件一般为 "},{"type":"codeinline","content":[{"type":"text","text":"Tensorboard"}]},{"type":"text","text":" 相关文件。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"多输入输出模型"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/70/70d09c1e0c3a300efbbe93340db10261.png","alt":"多输入输出模型图","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"考虑如图所示的多输入多输出模型,该模型包括两个输入和两个输出, "},{"type":"codeinline","content":[{"type":"text","text":"score_output"}]},{"type":"text","text":" 输出表示分值, "},{"type":"codeinline","content":[{"type":"text","text":"class_output"}]},{"type":"text","text":" 输出表示分类,其示例代码如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"from tensorflow import keras\nfrom tensorflow.keras import layers\n\nimage_input = keras.Input(shape=(32, 32, 3), name='img_input')\ntimeseries_input = keras.Input(shape=(None, 10), name='ts_input')\n\nx1 = layers.Conv2D(3, 3)(image_input)\nx1 = layers.GlobalMaxPooling2D()(x1)\n\nx2 = layers.Conv1D(3, 3)(timeseries_input)\nx2 = layers.GlobalMaxPooling1D()(x2)\n\nx = layers.concatenate([x1, x2])\n\nscore_output = layers.Dense(1, name='score_output')(x)\nclass_output = layers.Dense(5, name='class_output')(x)\n\nmodel = keras.Model(\n inputs=[image_input, timeseries_input],\n outputs=[score_output, class_output],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"在进行模型编译时,如果只指定一个 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 明显不能满足不同输出的损失计算方式,所以此时可以指定 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 为一个列表 ("},{"type":"codeinline","content":[{"type":"text","text":"list"}]},{"type":"text","text":"),其中每个元素分别对应于不同的输出。示例代码如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=[\n keras.losses.MeanSquaredError(),\n keras.losses.CategoricalCrossentropy(from_logits=True)\n ],\n loss_weights=[1, 1],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"此时模型的优化目标为所有单个损失值的总和,如果想要为不同的损失指定不同的权重,可以设置 "},{"type":"codeinline","content":[{"type":"text","text":"loss_weights"}]},{"type":"text","text":" 参数,该参数接收一个标量系数列表 ("},{"type":"codeinline","content":[{"type":"text","text":"list"}]},{"type":"text","text":"),用以对模型不同输出的损失值进行加权。如果仅为模型指定一个 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" ,则该 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 会应用到每一个输出,在模型的多个输出损失计算方式相同时,可以采用这种方式。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"同样的对于模型的指标 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":"),也可以指定为多个,注意因为 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":" 参数本身即为一个列表,所以为多个输出指定 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":" 应该使用二维列表。示例代码如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=[\n keras.losses.MeanSquaredError(),\n keras.losses.CategoricalCrossentropy(from_logits=True),\n ],\n metrics=[\n [\n keras.metrics.MeanAbsolutePercentageError(),\n keras.metrics.MeanAbsoluteError()\n ],\n [keras.metrics.CategoricalAccuracy()],\n ],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"对于有明确名称的输出,可以通过字典的方式来设置其 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":"。示例代码如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss={\n 'score_output': keras.losses.MeanSquaredError(),\n 'class_output': keras.losses.CategoricalCrossentropy(from_logits=True),\n },\n metrics={\n 'score_output': [\n keras.metrics.MeanAbsolutePercentageError(),\n keras.metrics.MeanAbsoluteError()\n ],\n 'class_output': [\n keras.metrics.CategoricalAccuracy(),\n ]\n },\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"对于仅被用来预测的输出,也可以不指定其 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":"。示例代码如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=[\n None,\n keras.losses.CategoricalCrossentropy(from_logits=True),\n ],\n)\n\n# Or dict loss version\nmodel.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss={\n 'class_output': keras.losses.CategoricalCrossentropy(from_logits=True),\n },\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"6","normalizeStart":"6"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"对于多输入输出模型的训练来说,也可以采用和其 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法相同的方式来提供数据输入,也就是说既可以使用列表的方式,也可以使用字典的方式来指定多个输入。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 6.1. "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 类型数据示例代码如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Generate dummy Numpy data\nimg_data = np.random.random_sample(size=(100, 32, 32, 3))\nts_data = np.random.random_sample(size=(100, 20, 10))\nscore_targets = np.random.random_sample(size=(100, 1))\nclass_targets = np.random.random_sample(size=(100, 5))\n\n# Fit on lists\nmodel.fit(\n x=[img_data, ts_data],\n y=[score_targets, class_targets],\n batch_size=32,\n epochs=3,\n)\n\n# Alternatively, fit on dicts\nmodel.fit(\n x={\n 'img_input': img_data,\n 'ts_input': ts_data,\n },\n y={\n 'score_output': score_targets,\n 'class_output': class_targets,\n },\n batch_size=32,\n epochs=3,\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 6.2. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 类型数据示例代码如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Generate dummy dataset data from numpy\ntrain_dataset = tf.data.Dataset.from_tensor_slices((\n (img_data, ts_data),\n (score_targets, class_targets),\n))\n\n# Alternatively generate with dict\ntrain_dataset = tf.data.Dataset.from_tensor_slices((\n {\n 'img_input': img_data,\n 'ts_input': ts_data,\n },\n {\n 'score_output': score_targets,\n 'class_output': class_targets,\n },\n))\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)\n\nmodel.fit(train_dataset, epochs=3)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"自定义训练流程"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"如果你不想使用 "},{"type":"codeinline","content":[{"type":"text","text":"model"}]},{"type":"text","text":" 内置提供的 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"evaluate"}]},{"type":"text","text":" 方法,而想使用低阶 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" 自定义模型的训练和评估的流程,则可以借助于 "},{"type":"codeinline","content":[{"type":"text","text":"GradientTape"}]},{"type":"text","text":" 来实现。深度神经网络在后向传播过程中需要计算损失 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":") 关于权重矩阵的导数(也称为梯度),以更新权重矩阵并获得最优解,而 "},{"type":"codeinline","content":[{"type":"text","text":"GradientTape"}]},{"type":"text","text":" 能自动提供求导帮助,无需我们手动求导,它本质上是一个"},{"type":"codeinline","content":[{"type":"text","text":"求导记录器"}]},{"type":"text","text":" ,能够记录前项传播的过程,并据此计算导数。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"模型的构建过程与之前相比没有什么不同,主要体现在训练的部分,示例代码如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import tensorflow as tf\nfrom tensorflow import keras\nfrom tensorflow.keras import layers\nimport numpy as np\n\n# Get the model.\ninputs = keras.Input(shape=(784, ), name='digits')\nx = layers.Dense(64, activation='relu', name='dense_1')(inputs)\nx = layers.Dense(64, activation='relu', name='dense_2')(x)\noutputs = layers.Dense(10, name='predictions')(x)\nmodel = keras.Model(inputs=inputs, outputs=outputs)\n\n# Instantiate an optimizer.\noptimizer = keras.optimizers.SGD(learning_rate=1e-3)\n# Instantiate a loss function.\nloss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n\n# Prepare the metrics.\ntrain_acc_metric = keras.metrics.SparseCategoricalAccuracy()\nval_acc_metric = keras.metrics.SparseCategoricalAccuracy()\n\n# Prepare the training dataset.\nbatch_size = 64\ntrain_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)\n\n# Prepare the validation dataset.\nval_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))\nval_dataset = val_dataset.batch(64)\n\nepochs = 3\nfor epoch in range(epochs):\n print('Start of epoch %d' % (epoch, ))\n\n # Iterate over the batches of the dataset.\n for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):\n\n # Open a GradientTape to record the operations run\n # during the forward pass, which enables autodifferentiation.\n with tf.GradientTape() as tape:\n\n # Run the forward pass of the layer.\n # The operations that the layer applies\n # to its inputs are going to be recorded\n # on the GradientTape.\n logits = model(x_batch_train,\n training=True) # Logits for this minibatch\n\n # Compute the loss value for this minibatch.\n loss_value = loss_fn(y_batch_train, logits)\n\n # Use the gradient tape to automatically retrieve\n # the gradients of the trainable variables with respect to the loss.\n grads = tape.gradient(loss_value, model.trainable_weights)\n\n # Run one step of gradient descent by updating\n # the value of the variables to minimize the loss.\n optimizer.apply_gradients(zip(grads, model.trainable_weights))\n\n # Update training metric.\n train_acc_metric(y_batch_train, logits)\n\n # Log every 200 batches.\n if step % 200 == 0:\n print('Training loss (for one batch) at step %s: %s' %\n (step, float(loss_value)))\n print('Seen so far: %s samples' % ((step + 1) * 64))\n\n # Display metrics at the end of each epoch.\n train_acc = train_acc_metric.result()\n print('Training acc over epoch: %s' % (float(train_acc), ))\n # Reset training metrics at the end of each epoch\n train_acc_metric.reset_states()\n\n # Run a validation loop at the end of each epoch.\n for x_batch_val, y_batch_val in val_dataset:\n val_logits = model(x_batch_val)\n # Update val metrics\n val_acc_metric(y_batch_val, val_logits)\n val_acc = val_acc_metric.result()\n val_acc_metric.reset_states()\n print('Validation acc: %s' % (float(val_acc), ))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"注意 "},{"type":"codeinline","content":[{"type":"text","text":"with tf.GradientTape() as tape"}]},{"type":"text","text":" 部分的实现,它记录了前向传播的过程,然后使用 "},{"type":"codeinline","content":[{"type":"text","text":"tape.gradient"}]},{"type":"text","text":" 方法计算出 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 关于模型所有权重矩阵 ("},{"type":"codeinline","content":[{"type":"text","text":"model.trainable_weights"}]},{"type":"text","text":") 的导数(也称作梯度),接着利用优化器 ("},{"type":"codeinline","content":[{"type":"text","text":"optimizer"}]},{"type":"text","text":") 去更新所有的权重矩阵。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"在上述训练流程中,模型的训练指标在每个 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 的训练中进行更新操作 ("},{"type":"codeinline","content":[{"type":"text","text":"update_state()"}]},{"type":"text","text":") ,在一个 "},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":" 训练结束后打印指标的结果 ("},{"type":"codeinline","content":[{"type":"text","text":"result()"}]},{"type":"text","text":") ,然后重置该指标 ("},{"type":"codeinline","content":[{"type":"text","text":"reset_states()"}]},{"type":"text","text":") 并进行下一轮的指标记录,交叉验证的指标也是同样的操作。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"注意与使用模型内置 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" 进行训练不同,在自定义训练中,模型中定义的损失,比如正则化损失以及通过 "},{"type":"codeinline","content":[{"type":"text","text":"add_loss"}]},{"type":"text","text":" 添加的损失,是不会自动累加在 "},{"type":"codeinline","content":[{"type":"text","text":"loss_fn"}]},{"type":"text","text":" 之内的。如果要包含这部分损失,则需要修改自定义训练的流程,通过调用 "},{"type":"codeinline","content":[{"type":"text","text":"model.losses"}]},{"type":"text","text":" 来将模型的全部损失加入到要优化的损失中去。示例代码如下所示:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"with tf.GradientTape() as tape:\n logits = model(x_batch_train)\n loss_value = loss_fn(y_batch_train, logits)\n\n # Add extra losses created during this forward pass:\n loss_value += sum(model.losses)\ngrads = tape.gradient(loss_value, model.trainable_weights)\noptimizer.apply_gradients(zip(grads, model.trainable_weights))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"参考资料"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.tensorflow.org/guide/keras/trainandevaluate","title":"Keras 模型训练与评估"},"content":[{"type":"text","text":"Keras 模型训练与评估"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit","title":"Keras 模型 fit 方法"},"content":[{"type":"text","text":"Keras 模型 fit 方法"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.tensorflow.org/api_docs/python/tf/data/Dataset","title":"tf.data.Dataset"},"content":[{"type":"text","text":"tf.data.Dataset"}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章