TensorFlow 篇 | TensorFlow 2.x 基於 Keras 模型的本地訓練與評估

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2e/2e4058b36141b9f93901b5367662fbbe.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"「"},{"type":"text","marks":[{"type":"strong"}],"text":"導語"},{"type":"text","text":"」模型的訓練與評估是整個機器學習任務流程的核心環節。只有掌握了正確的訓練與評估方法,並靈活使用,才能使我們更加快速地進行實驗分析與驗證,從而對模型有更加深刻的理解。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"前言"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上一篇 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型構建的文章中,我們介紹了在 "},{"type":"codeinline","content":[{"type":"text","text":"TensorFlow 2.x"}]},{"type":"text","text":" 版本中使用 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 構建模型的三種方法,那麼本篇將在上一篇的基礎上着重介紹使用 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型進行本地訓練、評估以及預測的流程和方法。 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型有兩種訓練評估的方式,一種方式是使用模型內置 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" ,如 "},{"type":"codeinline","content":[{"type":"text","text":"model.fit()"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"model.evaluate()"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"model.predict()"}]},{"type":"text","text":" 等分別執行不同的操作;另一種方式是利用即時執行策略 ("},{"type":"codeinline","content":[{"type":"text","text":"eager execution"}]},{"type":"text","text":") 以及 "},{"type":"codeinline","content":[{"type":"text","text":"GradientTape"}]},{"type":"text","text":" 對象自定義訓練和評估流程。對所有 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型來說這兩種方式都是按照相同的原理來工作的,沒有本質上的區別。在一般情況下,我們更願意使用第一種訓練評估方式,因爲它更爲簡單,更易於使用,而在一些特殊的情況下,我們可能會考慮使用自定義的方式來完成訓練與評估。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"內置 API 進行訓練評估"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"端到端完整示例"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下面介紹使用模型內置 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" 實現的一個端到端的訓練評估示例,可以認爲要使用該模型去解決一個多分類問題。這裏使用了"},{"type":"codeinline","content":[{"type":"text","text":"函數式 API"}]},{"type":"text","text":" 來構建 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 模型,當然也可以使用 "},{"type":"codeinline","content":[{"type":"text","text":"Sequential"}]},{"type":"text","text":" 方式以及子類化方式去定義模型。示例代碼如下所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import tensorflow as tf\nfrom tensorflow import keras\nfrom tensorflow.keras import layers\nimport numpy as np\n\n# Train and Test data from numpy array.\nx_train, y_train = (\n np.random.random((60000, 784)),\n np.random.randint(10, size=(60000, 1)),\n)\nx_test, y_test = (\n np.random.random((10000, 784)),\n np.random.randint(10, size=(10000, 1)),\n)\n\n# Reserve 10,000 samples for validation.\nx_val = x_train[-10000:]\ny_val = y_train[-10000:]\nx_train = x_train[:-10000]\ny_train = y_train[:-10000]\n\n# Model Create\ninputs = keras.Input(shape=(784, ), name='digits')\nx = layers.Dense(64, activation='relu', name='dense_1')(inputs)\nx = layers.Dense(64, activation='relu', name='dense_2')(x)\noutputs = layers.Dense(10, name='predictions')(x)\nmodel = keras.Model(inputs=inputs, outputs=outputs)\n\n# Model Compile.\nmodel.compile(\n # Optimizer\n optimizer=keras.optimizers.RMSprop(),\n # Loss function to minimize\n loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n # List of metrics to monitor\n metrics=['sparse_categorical_accuracy'],\n)\n\n# Model Training.\nprint('# Fit model on training data')\nhistory = model.fit(\n x_train,\n y_train,\n batch_size=64,\n epochs=3,\n # We pass some validation for monitoring validation loss and metrics\n # at the end of each epoch\n validation_data=(x_val, y_val),\n)\nprint('\\nhistory dict:', history.history)\n\n# Model Evaluate.\nprint('\\n# Evaluate on test data')\nresults = model.evaluate(x_test, y_test, batch_size=128)\nprint('test loss, test acc:', results)\n\n# Generate predictions (probabilities -- the output of the last layer)\n# Model Predict.\nprint('\\n# Generate predictions for 3 samples')\npredictions = model.predict(x_test[:3])\nprint('predictions shape:', predictions.shape)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從代碼中可以看到,要完成模型的訓練與評估的整體流程,首先要構建好模型;然後要對模型進行編譯 ("},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":"),目的是指定模型訓練過程中需要用到的優化器 ("},{"type":"codeinline","content":[{"type":"text","text":"optimizer"}]},{"type":"text","text":"),損失函數 ("},{"type":"codeinline","content":[{"type":"text","text":"losses"}]},{"type":"text","text":") 以及評估指標 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":") ;接着開始進行模型的訓練與交叉驗證 ("},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":"),此步驟需要提前指定好訓練數據和驗證數據,並設置好一些參數如 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 等才能繼續,交叉驗證操作會在每輪 ("},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":") 訓練結束後自動觸發;最後是模型評估 ("},{"type":"codeinline","content":[{"type":"text","text":"evaluate"}]},{"type":"text","text":") 與預測 ("},{"type":"codeinline","content":[{"type":"text","text":"predict"}]},{"type":"text","text":"),我們會根據評估與預測結果來判斷模型的好壞。這樣一個完整的模型訓練與評估流程就完成了,下面來對示例裏的一些實現細節進行展開講解。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"模型編譯 (compile)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"在模型訓練之前首先要進行模型編譯,因爲只有知道了要優化什麼目標,如何進行優化以及要關注什麼指標,模型才能被正確的訓練與調整。 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法包含三個主要參數,一個是待優化的損失 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":") ,它指明瞭要優化的目標,一個是優化器 ("},{"type":"codeinline","content":[{"type":"text","text":"optimizer"}]},{"type":"text","text":"),它指明瞭目標優化的方向,還有一個可選的指標項 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":"),它指明瞭訓練過程中要關注的模型指標。 "},{"type":"codeinline","content":[{"type":"text","text":"Keras API"}]},{"type":"text","text":" 中已經包含了許多內置的損失函數,優化器以及指標,可以拿來即用,能夠滿足大多數的訓練需要。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"損失函數類主要在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.losses"}]},{"type":"text","text":" 模塊下,其中包含了多種預定義的損失,比如我們常用的二分類損失 "},{"type":"codeinline","content":[{"type":"text","text":"BinaryCrossentropy"}]},{"type":"text","text":" ,多分類損失 "},{"type":"codeinline","content":[{"type":"text","text":"CategoricalCrossentropy"}]},{"type":"text","text":" 以及均方根損失 "},{"type":"codeinline","content":[{"type":"text","text":"MeanSquaredError"}]},{"type":"text","text":" 等。傳遞給 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 的參數既可以是一個字符串如 "},{"type":"codeinline","content":[{"type":"text","text":"binary_crossentropy"}]},{"type":"text","text":" 也可以是對應的 "},{"type":"codeinline","content":[{"type":"text","text":"losses"}]},{"type":"text","text":" 實例如 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.losses.BinaryCrossentropy()"}]},{"type":"text","text":" ,當我們需要設置損失函數的一些參數時(比如上例中 "},{"type":"codeinline","content":[{"type":"text","text":"from_logits=True"}]},{"type":"text","text":"),則需要使用實例參數。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"優化器類主要在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.optimizers"}]},{"type":"text","text":" 模塊下,一些常用的優化器如 "},{"type":"codeinline","content":[{"type":"text","text":"SGD"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"Adam"}]},{"type":"text","text":" 以及 "},{"type":"codeinline","content":[{"type":"text","text":"RMSprop"}]},{"type":"text","text":" 等均包含在內。同樣它也可以通過字符串或者實例的方式傳給 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法,一般我們需要設置的優化器參數主要爲學習率 ("},{"type":"codeinline","content":[{"type":"text","text":"learning rate"}]},{"type":"text","text":") ,其他的參數可以參考每個優化器的具體實現來動態設置,或者直接使用其默認值即可。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"指標類主要在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.metrics"}]},{"type":"text","text":" 模塊下,二分類裏常用的 "},{"type":"codeinline","content":[{"type":"text","text":"AUC"}]},{"type":"text","text":" 指標以及 "},{"type":"codeinline","content":[{"type":"text","text":"lookalike"}]},{"type":"text","text":" 裏常用的召回率 ("},{"type":"codeinline","content":[{"type":"text","text":"Recall"}]},{"type":"text","text":") 指標等均有包含。同理,它也可以以字符串或者實例的形式傳遞給 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法,注意 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法接收的是一個 "},{"type":"codeinline","content":[{"type":"text","text":"metric"}]},{"type":"text","text":" 列表,所以可以傳遞多個指標信息。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"當然如果 "},{"type":"codeinline","content":[{"type":"text","text":"losses"}]},{"type":"text","text":" 模塊下的損失或 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":" 模塊下的指標不滿足你的需求,也可以自定義它們的實現。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.1. 對於自定義損失,有兩種方式,一種是定義一個損失函數,它接收兩個輸入參數 "},{"type":"codeinline","content":[{"type":"text","text":"y_true"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"y_pred"}]},{"type":"text","text":" ,然後在函數內部計算損失並返回。代碼如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"def basic_loss_function(y_true, y_pred):\n return tf.math.reduce_mean(tf.abs(y_true - y_pred))\n\nmodel.compile(optimizer=keras.optimizers.Adam(), loss=basic_loss_function)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.2. 如果你需要的損失函數不僅僅包含上述兩個參數,則可以採用另外一種子類化的方式來實現。定義一個類繼承自 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.losses.Loss"}]},{"type":"text","text":" 類,並實現其 "},{"type":"codeinline","content":[{"type":"text","text":"__init__(self)"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"call(self, y"},{"type":"text","marks":[{"type":"italic"}],"text":"true, y"},{"type":"text","text":"pred)"}]},{"type":"text","text":" 方法,這種實現方式與子類化層和模型比較相似。比如要實現一個加權的二分類交叉熵損失,其代碼如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"class WeightedBinaryCrossEntropy(keras.losses.Loss):\n \"\"\"\n Args:\n pos_weight: Scalar to affect the positive labels of the loss function.\n weight: Scalar to affect the entirety of the loss function.\n from_logits: Whether to compute loss from logits or the probability.\n reduction: Type of tf.keras.losses.Reduction to apply to loss.\n name: Name of the loss function.\n \"\"\"\n def __init__(self,\n pos_weight,\n weight,\n from_logits=False,\n reduction=keras.losses.Reduction.AUTO,\n name='weighted_binary_crossentropy'):\n super().__init__(reduction=reduction, name=name)\n self.pos_weight = pos_weight\n self.weight = weight\n self.from_logits = from_logits\n\n def call(self, y_true, y_pred):\n ce = tf.losses.binary_crossentropy(\n y_true,\n y_pred,\n from_logits=self.from_logits,\n )[:, None]\n ce = self.weight * (ce * (1 - y_true) + self.pos_weight * ce * y_true)\n return ce\n\nmodel.compile(\n optimizer=keras.optimizers.Adam(),\n loss=WeightedBinaryCrossEntropy(\n pos_weight=0.5,\n weight=2,\n from_logits=True,\n ),\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.3. 對於自定義指標,也可以通過子類化的方式來實現,首先定義一個指標類繼承自 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.metrics.Metric"}]},{"type":"text","text":" 類並實現其四個方法,分別是 "},{"type":"codeinline","content":[{"type":"text","text":"__init__(self)"}]},{"type":"text","text":" 方法,用來創建狀態變量, "},{"type":"codeinline","content":[{"type":"text","text":"update_state(self, y_true, y_pred, sample_weight=None)"}]},{"type":"text","text":" 方法,用來更新狀態變量, "},{"type":"codeinline","content":[{"type":"text","text":"result(self)"}]},{"type":"text","text":" 方法,用來返回狀態變量的最終結果, 以及 "},{"type":"codeinline","content":[{"type":"text","text":"reset_states(self)"}]},{"type":"text","text":" 方法,用來重新初始化狀態變量。比如要實現一個多分類中"},{"type":"codeinline","content":[{"type":"text","text":"真正例 (True Positives) 數量"}]},{"type":"text","text":"的統計指標,其代碼如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"class CategoricalTruePositives(keras.metrics.Metric):\n def __init__(self, name='categorical_true_positives', **kwargs):\n super().__init__(name=name, **kwargs)\n self.true_positives = self.add_weight(name='tp', initializer='zeros')\n\n def update_state(self, y_true, y_pred, sample_weight=None):\n y_pred = tf.reshape(tf.argmax(y_pred, axis=1), shape=(-1, 1))\n values = tf.cast(y_true, 'int32') == tf.cast(y_pred, 'int32')\n values = tf.cast(values, 'float32')\n if sample_weight is not None:\n sample_weight = tf.cast(sample_weight, 'float32')\n values = tf.multiply(values, sample_weight)\n self.true_positives.assign_add(tf.reduce_sum(values))\n\n def result(self):\n return self.true_positives\n\n def reset_states(self):\n # The state of the metric will be reset at the start of each epoch.\n self.true_positives.assign(0.)\n\nmodel.compile(\n optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),\n loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n metrics=[CategoricalTruePositives()],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.4. 對於一些在層 ("},{"type":"codeinline","content":[{"type":"text","text":"layers"}]},{"type":"text","text":") 內部定義的損失,可以通過在自定義層的 "},{"type":"codeinline","content":[{"type":"text","text":"call"}]},{"type":"text","text":" 方法裏調用 "},{"type":"codeinline","content":[{"type":"text","text":"self.add_loss()"}]},{"type":"text","text":" 來實現,而且在模型訓練時,它會自動添加到整體的損失中,不用人爲干預。通過對比加入自定義損失前後模型訓練輸出的 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 值的變化來確認這部分損失是否被加入到了整體的損失中。還可以在 "},{"type":"codeinline","content":[{"type":"text","text":"build"}]},{"type":"text","text":" 模型後,打印 "},{"type":"codeinline","content":[{"type":"text","text":"model.losses"}]},{"type":"text","text":" 來查看該模型的所有損失。注意正則化損失是內置在 "},{"type":"codeinline","content":[{"type":"text","text":"Keras"}]},{"type":"text","text":" 的所有層中的,只需要在調用層時加入相應正則化參數即可,無需在 "},{"type":"codeinline","content":[{"type":"text","text":"call"}]},{"type":"text","text":" 方法中 "},{"type":"codeinline","content":[{"type":"text","text":"add_loss()"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.5. 對於指標信息來說,可以在自定義層的 "},{"type":"codeinline","content":[{"type":"text","text":"call"}]},{"type":"text","text":" 方法裏調用 "},{"type":"codeinline","content":[{"type":"text","text":"self.add_metric()"}]},{"type":"text","text":" 來新增指標,同樣的,它也會自動出現在整體的指標中,無需人爲干預。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.6. "},{"type":"codeinline","content":[{"type":"text","text":"函數式 API"}]},{"type":"text","text":" 實現的模型,可以通過調用 "},{"type":"codeinline","content":[{"type":"text","text":"model.add_loss()"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"model.add_metric()"}]},{"type":"text","text":" 來實現與自定義模型同樣的效果。示例代碼如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import tensorflow as tf\nfrom tensorflow import keras\nfrom tensorflow.keras import layers\n\ninputs = keras.Input(shape=(784, ), name='digits')\nx1 = layers.Dense(64, activation='relu', name='dense_1')(inputs)\nx2 = layers.Dense(64, activation='relu', name='dense_2')(x1)\noutputs = layers.Dense(10, name='predictions')(x2)\nmodel = keras.Model(inputs=inputs, outputs=outputs)\n\nmodel.add_loss(tf.reduce_sum(x1) * 0.1)\n\nmodel.add_metric(\n keras.backend.std(x1),\n name='std_of_activation',\n aggregation='mean',\n)\n\nmodel.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n)\nmodel.fit(x_train, y_train, batch_size=64, epochs=1)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"6","normalizeStart":"6"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"如果要編譯的是多輸入多輸出模型,則可以爲每一個輸出指定不同的損失函數以及不同的指標,後面會詳細介紹。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"模型訓練與驗證 (fit)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"模型的訓練通過調用 "},{"type":"codeinline","content":[{"type":"text","text":"model.fit()"}]},{"type":"text","text":" 方法來實現, "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法包括訓練數據與驗證數據參數,它們可以是 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型數據,也可以是 "},{"type":"codeinline","content":[{"type":"text","text":"tf.data"}]},{"type":"text","text":" 模塊下 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型的數據。另外 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法還包括 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 以及 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch"}]},{"type":"text","text":" 等控制訓練流程的參數,並且還可以通過 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 參數控制模型在訓練過程中執行一些其它的操作,如 "},{"type":"codeinline","content":[{"type":"text","text":"Tensorboard"}]},{"type":"text","text":" 日誌記錄等。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"模型的訓練和驗證數據可以是 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型數據,最開始的端到端示例即是採用 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 數組作爲輸入。一般在數據量較小且內存能容下的情況下采用 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 數據作爲訓練和評估的數據輸入。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.1. 對於 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型數據來說,如果指定了 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 參數,則訓練數據的總量爲"},{"type":"codeinline","content":[{"type":"text","text":"原始樣本數量 * epochs"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.2. 默認情況下一輪訓練 ("},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":") 所有的原始樣本都會被訓練一遍,下一輪訓練還會使用這些樣本數據進行訓練,每一輪執行的步數 ("},{"type":"codeinline","content":[{"type":"text","text":"steps"}]},{"type":"text","text":") 爲"},{"type":"codeinline","content":[{"type":"text","text":"原始樣本數量/batch_size"}]},{"type":"text","text":" ,如果 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 不指定,默認爲 "},{"type":"codeinline","content":[{"type":"text","text":"32"}]},{"type":"text","text":" 。交叉驗證在"},{"type":"codeinline","content":[{"type":"text","text":"每一輪訓練結束後"}]},{"type":"text","text":"觸發,並且也會在所有驗證樣本上執行一遍,可以指定 "},{"type":"codeinline","content":[{"type":"text","text":"validation"},{"type":"text","marks":[{"type":"italic"}],"text":"batch"},{"type":"text","text":"size"}]},{"type":"text","text":" 來控制驗證數據的 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 大小,如果不指定默認同 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.3. 對於 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型數據來說,如果設置了 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch"}]},{"type":"text","text":" 參數,表示一輪要訓練指定的步數,下一輪會在上輪基礎上使用下一個 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 的數據繼續進行訓練,直到所有的 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 結束或者"},{"type":"codeinline","content":[{"type":"text","text":"訓練數據的總量"}]},{"type":"text","text":"被耗盡。要想訓練流程不因數據耗盡而結束,則需要保證"},{"type":"codeinline","content":[{"type":"text","text":"數據的總量"}]},{"type":"text","text":"要大於 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch "},{"type":"text","marks":[{"type":"italic"}],"text":" epochs "},{"type":"text","text":" batch_size"}]},{"type":"text","text":"。同理也可以設置 "},{"type":"codeinline","content":[{"type":"text","text":"validation_steps"}]},{"type":"text","text":" ,表示交叉驗證所需步數,此時要注意驗證集的數據總量要大於 "},{"type":"codeinline","content":[{"type":"text","text":"validation"},{"type":"text","marks":[{"type":"italic"}],"text":"steps * validation"},{"type":"text","text":"batch_size"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 2.4. "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法還提供了另外一個參數 "},{"type":"codeinline","content":[{"type":"text","text":"validation_split"}]},{"type":"text","text":" 來自動從訓練數據集中保留一定比例的數據作爲驗證,該參數取值爲 "},{"type":"codeinline","content":[{"type":"text","text":"0-1"}]},{"type":"text","text":" 之間,比如 "},{"type":"codeinline","content":[{"type":"text","text":"0.2"}]},{"type":"text","text":" 代表 "},{"type":"codeinline","content":[{"type":"text","text":"20%"}]},{"type":"text","text":" 的訓練集用來做驗證, "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法會默認保留 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 數組最後面 "},{"type":"codeinline","content":[{"type":"text","text":"20%"}]},{"type":"text","text":" 的樣本作爲驗證集。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"TensorFlow 2.0"}]},{"type":"text","text":" 之後,更爲推薦的是使用 "},{"type":"codeinline","content":[{"type":"text","text":"tf.data"}]},{"type":"text","text":" 模塊下 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型的數據作爲訓練和驗證的數據輸入,它能以更加快速以及可擴展的方式加載和預處理數據。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.1. 使用 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 進行訓練的代碼如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))\n# Shuffle and slice the dataset.\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)\n\n# Prepare the validation dataset\nval_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))\nval_dataset = val_dataset.batch(64)\n\n# Now we get a test dataset.\ntest_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))\ntest_dataset = test_dataset.batch(64)\n\n# Since the dataset already takes care of batching,\n# we don't pass a `batch_size` argument.\nmodel.fit(train_dataset, epochs=3, validation_data=val_dataset)\nresult = model.evaluate(test_dataset)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.2. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 一般是一個二元組,第一個元素爲模型的輸入特徵,如果爲多輸入就是多個特徵的字典 ("},{"type":"codeinline","content":[{"type":"text","text":"dict"}]},{"type":"text","text":") 或元組 ("},{"type":"codeinline","content":[{"type":"text","text":"tuple"}]},{"type":"text","text":"),第二個元素是真實的數據標籤 ("},{"type":"codeinline","content":[{"type":"text","text":"label"}]},{"type":"text","text":") ,即 "},{"type":"codeinline","content":[{"type":"text","text":"ground truth"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.3. 使用 "},{"type":"codeinline","content":[{"type":"text","text":"from_tensor_slices"}]},{"type":"text","text":" 方法可以從 "},{"type":"codeinline","content":[{"type":"text","text":"nunpy"}]},{"type":"text","text":" 數組直接生成 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型數據,是一種比較方便快捷的生成方式,一般在測試時使用。其它較爲常用的生成方式,比如從 "},{"type":"codeinline","content":[{"type":"text","text":"TFRecord"}]},{"type":"text","text":" 文件或文本文件 ("},{"type":"codeinline","content":[{"type":"text","text":"TextLine"}]},{"type":"text","text":") 中生成 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,可以參考 "},{"type":"codeinline","content":[{"type":"text","text":"tf.data"}]},{"type":"text","text":" 模塊下的相關類的具體實現。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.4. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 可以調用內置方法提前對數據進行預處理,比如數據打亂 ("},{"type":"codeinline","content":[{"type":"text","text":"shuffle"}]},{"type":"text","text":"), "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 以及 "},{"type":"codeinline","content":[{"type":"text","text":"repeat"}]},{"type":"text","text":" 等操作。"},{"type":"codeinline","content":[{"type":"text","text":"shuffle"}]},{"type":"text","text":" 操作是爲了減小模型過擬合的機率,它僅爲小範圍打亂,需要藉助於一個緩存區,先將數據填滿,然後在每次訓練時從緩存區裏隨機抽取 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 條數據,產生的空缺用後面的數據填充,從而實現了局部打亂的效果。"},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 是對數據進行分批次,常用於控制和調節模型的訓練速度以及訓練效果,因爲在 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 中已經 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 過,所以 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法中的 "},{"type":"codeinline","content":[{"type":"text","text":"batch_size"}]},{"type":"text","text":" 就無需再提供了。"},{"type":"codeinline","content":[{"type":"text","text":"repeat"}]},{"type":"text","text":" 用來對數據進行復制,以解決數據量不足的問題,如果指定了其參數 "},{"type":"codeinline","content":[{"type":"text","text":"count"}]},{"type":"text","text":",則表示整個數據集要複製 "},{"type":"codeinline","content":[{"type":"text","text":"count"}]},{"type":"text","text":" 次,不指定就會"},{"type":"codeinline","content":[{"type":"text","text":"無限次複製"}]},{"type":"text","text":" ,此時必須要設置 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch"}]},{"type":"text","text":" 參數,不然訓練無法終止。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.5. 上述例子中, "},{"type":"codeinline","content":[{"type":"text","text":"train dataset"}]},{"type":"text","text":" 的全部數據在每一輪都會被訓練到,因爲一輪訓練結束後, "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 會重置,然後被用來重新訓練。但是當指定了 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch"}]},{"type":"text","text":" 之後, "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 在每輪訓練後不會被重置,一直到所有 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 結束或所有的訓練數據被消耗完之後終止,要想訓練正常結束,須保證提供的訓練數據總量要大於 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch "},{"type":"text","marks":[{"type":"italic"}],"text":" epochs "},{"type":"text","text":" batch_size"}]},{"type":"text","text":"。同理也可以指定 "},{"type":"codeinline","content":[{"type":"text","text":"validation_steps"}]},{"type":"text","text":" ,此時數據驗證會執行指定的步數,在下次驗證開始時, "},{"type":"codeinline","content":[{"type":"text","text":"validation dataset"}]},{"type":"text","text":" 會被重置,以保證每次交叉驗證使用的都是相同的數據。"},{"type":"codeinline","content":[{"type":"text","text":"validation_split"}]},{"type":"text","text":" 參數不適用於 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型數據,因爲它需要知道每個數據樣本的索引,這在 "},{"type":"codeinline","content":[{"type":"text","text":"dataset API"}]},{"type":"text","text":" 下很難實現。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.6. 當不指定 "},{"type":"codeinline","content":[{"type":"text","text":"steps_per_epoch"}]},{"type":"text","text":" 參數時, "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型數據與 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型數據的處理流程完全一致。但當指定之後,要注意它們之間在處理上的差異。對於 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型數據而言,在處理時,它會被轉爲 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型數據,只不過這個 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 被 "},{"type":"codeinline","content":[{"type":"text","text":"repeat"}]},{"type":"text","text":" 了 "},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":" 次,而且每輪訓練結束後,這個 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 不會被重置,會在上次的 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 之後繼續訓練。假設原始數據量爲 "},{"type":"codeinline","content":[{"type":"text","text":"n"}]},{"type":"text","text":" ,指定 "},{"type":"codeinline","content":[{"type":"text","text":"steps"},{"type":"text","marks":[{"type":"italic"}],"text":"per"},{"type":"text","text":"epoch"}]},{"type":"text","text":" 參數之後,兩者的差異主要體現在真實的訓練數據量上, "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 爲 "},{"type":"codeinline","content":[{"type":"text","text":"n * epochs"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 爲 "},{"type":"codeinline","content":[{"type":"text","text":"n"}]},{"type":"text","text":"。具體細節可以參考源碼實現。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 3.7. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 還有 "},{"type":"codeinline","content":[{"type":"text","text":"map"}]},{"type":"text","text":" 與 "},{"type":"codeinline","content":[{"type":"text","text":"prefetch"}]},{"type":"text","text":" 方法比較實用。 "},{"type":"codeinline","content":[{"type":"text","text":"map"}]},{"type":"text","text":" 方法接收一個"},{"type":"codeinline","content":[{"type":"text","text":"函數"}]},{"type":"text","text":"作爲參數,用來對 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 中的每一條數據進行處理並返回一個新的 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,比如我們在使用 "},{"type":"codeinline","content":[{"type":"text","text":"TextLineDataset"}]},{"type":"text","text":" 讀取文本文件後生成了一個 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,而我們要抽取輸入數據中的某些列作爲特徵 ("},{"type":"codeinline","content":[{"type":"text","text":"features"}]},{"type":"text","text":"),某些列作爲標籤 ("},{"type":"codeinline","content":[{"type":"text","text":"labels"}]},{"type":"text","text":"),此時就會用到 "},{"type":"codeinline","content":[{"type":"text","text":"map"}]},{"type":"text","text":" 方法。"},{"type":"codeinline","content":[{"type":"text","text":"prefetch"}]},{"type":"text","text":" 方法預先從 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 中準備好下次訓練所需的數據並放於內存中,這樣可以減少每輪訓練之間的延遲等待時間。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"除了訓練數據和驗證數據外,還可以向 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法傳遞樣本權重 ("},{"type":"codeinline","content":[{"type":"text","text":"sample_weight"}]},{"type":"text","text":") 以及類別權重 ("},{"type":"codeinline","content":[{"type":"text","text":"class_weight"}]},{"type":"text","text":") 參數。這兩個參數通常被用於處理分類不平衡問題,通過給類別少的樣本賦予更高的權重,使得各個類別對整體損失的貢獻趨於一致。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 4.1. 對於 "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型的輸入數據,可以使用上述兩個參數,以上面的多分類問題爲例,如果要給分類 "},{"type":"codeinline","content":[{"type":"text","text":"5"}]},{"type":"text","text":" 一個更高的權重,可以使用如下代碼來實現:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import numpy as np\n\n# Here's the same example using `class_weight`\nclass_weight = {0: 1., 1: 1., 2: 1., 3: 1., 4: 1.,\n # Set weight \"2\" for class \"5\",\n # making this class 2x more important\n 5: 2.,\n 6: 1., 7: 1., 8: 1., 9: 1.}\nprint('Fit with class weight')\nmodel.fit(x_train, y_train, class_weight=class_weight, batch_size=64, epochs=4)\n\n# Here's the same example using `sample_weight` instead:\nsample_weight = np.ones(shape=(len(y_train), ))\nsample_weight[y_train == 5] = 2.\nprint('\\nFit with sample weight')\n\nmodel.fit(\n x_train,\n y_train,\n sample_weight=sample_weight,\n batch_size=64,\n epochs=4,\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 4.2. 而對於 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型的輸入數據來說,不能直接使用上述兩個參數,需要在構建 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 時將 "},{"type":"codeinline","content":[{"type":"text","text":"sample_weight"}]},{"type":"text","text":" 加入其中,返回一個三元組的 "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" ,格式爲 "},{"type":"codeinline","content":[{"type":"text","text":"(input_batch, target_batch, sample"},{"type":"text","marks":[{"type":"italic"}],"text":"weight"},{"type":"text","text":"batch)"}]},{"type":"text","text":" 。示例代碼如下所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"sample_weight = np.ones(shape=(len(y_train), ))\nsample_weight[y_train == 5] = 2.\n\n# Create a Dataset that includes sample weights\n# (3rd element in the return tuple).\ntrain_dataset = tf.data.Dataset.from_tensor_slices((\n x_train,\n y_train,\n sample_weight,\n))\n\n# Shuffle and slice the dataset.\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)\n\nmodel.fit(train_dataset, epochs=3)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"在模型的訓練過程中有一些特殊時間點,比如在一個 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 結束或者一個 "},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":" 結束時,一般都會做一些額外的處理操作來輔助我們進行訓練,上面介紹過的模型交叉驗證就是其中之一。還有一些其它的操作,比如當模型訓練停滯不前時 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 值在某一值附近不斷波動),自動減小其學習速率 ("},{"type":"codeinline","content":[{"type":"text","text":"learning rate"}]},{"type":"text","text":") 以使損失繼續下降,從而得到更好的收斂效果;在訓練過程中保存模型的權重信息,以備重啓模型時可以在已有權重的基礎上繼續訓練,從而減少訓練時間;還有在每輪的訓練結束後記錄模型的損失 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":") 和指標 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":") 信息,以供 "},{"type":"codeinline","content":[{"type":"text","text":"Tensorboard"}]},{"type":"text","text":" 分析使用等等,這些操作都是模型訓練過程中不可或缺的部分。它們都可以通過回調函數 ("},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":") 的方式來實現,這些回調函數都在 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.callbacks"}]},{"type":"text","text":" 模塊下,可以將它們作爲列表參數傳遞給 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 方法以達到不同的操作目的。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.1. 下面以 "},{"type":"codeinline","content":[{"type":"text","text":"EarlyStopping"}]},{"type":"text","text":" 爲例說明 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 的使用方式。本例中,當交叉驗證損失 "},{"type":"codeinline","content":[{"type":"text","text":"val_loss"}]},{"type":"text","text":" 至少在 "},{"type":"codeinline","content":[{"type":"text","text":"2"}]},{"type":"text","text":" 輪 ("},{"type":"codeinline","content":[{"type":"text","text":"epochs"}]},{"type":"text","text":") 訓練中的減少值都低於 "},{"type":"codeinline","content":[{"type":"text","text":"1e-2"}]},{"type":"text","text":" 時,我們會提前停止訓練。其示例代碼如下所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"callbacks = [\n keras.callbacks.EarlyStopping(\n # Stop training when `val_loss` is no longer improving\n monitor='val_loss',\n # \"no longer improving\" being defined as \"no better than 1e-2 less\"\n min_delta=1e-2,\n # \"no longer improving\" being further defined as \"for at least 2 epochs\"\n patience=2,\n verbose=1,\n )\n]\n\nmodel.fit(\n x_train,\n y_train,\n epochs=20,\n batch_size=64,\n callbacks=callbacks,\n validation_split=0.2,\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.2. 一些比較常用的 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 需要了解並掌握, 如 "},{"type":"codeinline","content":[{"type":"text","text":"ModelCheckpoint"}]},{"type":"text","text":" 用來保存模型權重信息, "},{"type":"codeinline","content":[{"type":"text","text":"TensorBoard"}]},{"type":"text","text":" 用來記錄一些指標信息, "},{"type":"codeinline","content":[{"type":"text","text":"ReduceLROnPlateau"}]},{"type":"text","text":" 用來在模型停滯時減小學習率。更多的 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 函數可以參考 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.callbacks"}]},{"type":"text","text":" 模塊下的實現。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.3. 當然也可以自定義 "},{"type":"codeinline","content":[{"type":"text","text":"callbacks"}]},{"type":"text","text":" 類,該子類需要繼承自 "},{"type":"codeinline","content":[{"type":"text","text":"tf.keras.callbacks.Callback"}]},{"type":"text","text":" 類,並按需實現其內置的方法,比如如果需要在每個 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 訓練結束後記錄 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 的值,則可以使用如下代碼實現:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"class LossHistory(keras.callbacks.Callback):\n def on_train_begin(self, logs):\n self.losses = []\n\n def on_batch_end(self, batch, logs):\n self.losses.append(logs.get('loss'))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 5.4. 在 "},{"type":"codeinline","content":[{"type":"text","text":"TensorFlow 2.0"}]},{"type":"text","text":" 之前, "},{"type":"codeinline","content":[{"type":"text","text":"ModelCheckpoint"}]},{"type":"text","text":" 內容和 "},{"type":"codeinline","content":[{"type":"text","text":"TensorBoard"}]},{"type":"text","text":" 內容是同時記錄的,保存在相同的文件夾下,而在 "},{"type":"codeinline","content":[{"type":"text","text":"2.0"}]},{"type":"text","text":" 之後的 "},{"type":"codeinline","content":[{"type":"text","text":"keras API"}]},{"type":"text","text":" 中它們可以通過不同的回調函數分開指定。記錄的日誌文件中,含有 "},{"type":"codeinline","content":[{"type":"text","text":"checkpoint"}]},{"type":"text","text":" 關鍵字的文件一般爲檢查點文件,含有 "},{"type":"codeinline","content":[{"type":"text","text":"events.out.tfevents"}]},{"type":"text","text":" 關鍵字的文件一般爲 "},{"type":"codeinline","content":[{"type":"text","text":"Tensorboard"}]},{"type":"text","text":" 相關文件。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"多輸入輸出模型"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/70/70d09c1e0c3a300efbbe93340db10261.png","alt":"多輸入輸出模型圖","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"考慮如圖所示的多輸入多輸出模型,該模型包括兩個輸入和兩個輸出, "},{"type":"codeinline","content":[{"type":"text","text":"score_output"}]},{"type":"text","text":" 輸出表示分值, "},{"type":"codeinline","content":[{"type":"text","text":"class_output"}]},{"type":"text","text":" 輸出表示分類,其示例代碼如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"from tensorflow import keras\nfrom tensorflow.keras import layers\n\nimage_input = keras.Input(shape=(32, 32, 3), name='img_input')\ntimeseries_input = keras.Input(shape=(None, 10), name='ts_input')\n\nx1 = layers.Conv2D(3, 3)(image_input)\nx1 = layers.GlobalMaxPooling2D()(x1)\n\nx2 = layers.Conv1D(3, 3)(timeseries_input)\nx2 = layers.GlobalMaxPooling1D()(x2)\n\nx = layers.concatenate([x1, x2])\n\nscore_output = layers.Dense(1, name='score_output')(x)\nclass_output = layers.Dense(5, name='class_output')(x)\n\nmodel = keras.Model(\n inputs=[image_input, timeseries_input],\n outputs=[score_output, class_output],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"在進行模型編譯時,如果只指定一個 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 明顯不能滿足不同輸出的損失計算方式,所以此時可以指定 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 爲一個列表 ("},{"type":"codeinline","content":[{"type":"text","text":"list"}]},{"type":"text","text":"),其中每個元素分別對應於不同的輸出。示例代碼如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=[\n keras.losses.MeanSquaredError(),\n keras.losses.CategoricalCrossentropy(from_logits=True)\n ],\n loss_weights=[1, 1],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"此時模型的優化目標爲所有單個損失值的總和,如果想要爲不同的損失指定不同的權重,可以設置 "},{"type":"codeinline","content":[{"type":"text","text":"loss_weights"}]},{"type":"text","text":" 參數,該參數接收一個標量係數列表 ("},{"type":"codeinline","content":[{"type":"text","text":"list"}]},{"type":"text","text":"),用以對模型不同輸出的損失值進行加權。如果僅爲模型指定一個 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" ,則該 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 會應用到每一個輸出,在模型的多個輸出損失計算方式相同時,可以採用這種方式。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"同樣的對於模型的指標 ("},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":"),也可以指定爲多個,注意因爲 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":" 參數本身即爲一個列表,所以爲多個輸出指定 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":" 應該使用二維列表。示例代碼如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=[\n keras.losses.MeanSquaredError(),\n keras.losses.CategoricalCrossentropy(from_logits=True),\n ],\n metrics=[\n [\n keras.metrics.MeanAbsolutePercentageError(),\n keras.metrics.MeanAbsoluteError()\n ],\n [keras.metrics.CategoricalAccuracy()],\n ],\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"對於有明確名稱的輸出,可以通過字典的方式來設置其 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"metrics"}]},{"type":"text","text":"。示例代碼如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss={\n 'score_output': keras.losses.MeanSquaredError(),\n 'class_output': keras.losses.CategoricalCrossentropy(from_logits=True),\n },\n metrics={\n 'score_output': [\n keras.metrics.MeanAbsolutePercentageError(),\n keras.metrics.MeanAbsoluteError()\n ],\n 'class_output': [\n keras.metrics.CategoricalAccuracy(),\n ]\n },\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"對於僅被用來預測的輸出,也可以不指定其 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":"。示例代碼如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"model.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss=[\n None,\n keras.losses.CategoricalCrossentropy(from_logits=True),\n ],\n)\n\n# Or dict loss version\nmodel.compile(\n optimizer=keras.optimizers.RMSprop(1e-3),\n loss={\n 'class_output': keras.losses.CategoricalCrossentropy(from_logits=True),\n },\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"6","normalizeStart":"6"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","text":"對於多輸入輸出模型的訓練來說,也可以採用和其 "},{"type":"codeinline","content":[{"type":"text","text":"compile"}]},{"type":"text","text":" 方法相同的方式來提供數據輸入,也就是說既可以使用列表的方式,也可以使用字典的方式來指定多個輸入。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 6.1. "},{"type":"codeinline","content":[{"type":"text","text":"numpy"}]},{"type":"text","text":" 類型數據示例代碼如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Generate dummy Numpy data\nimg_data = np.random.random_sample(size=(100, 32, 32, 3))\nts_data = np.random.random_sample(size=(100, 20, 10))\nscore_targets = np.random.random_sample(size=(100, 1))\nclass_targets = np.random.random_sample(size=(100, 5))\n\n# Fit on lists\nmodel.fit(\n x=[img_data, ts_data],\n y=[score_targets, class_targets],\n batch_size=32,\n epochs=3,\n)\n\n# Alternatively, fit on dicts\nmodel.fit(\n x={\n 'img_input': img_data,\n 'ts_input': ts_data,\n },\n y={\n 'score_output': score_targets,\n 'class_output': class_targets,\n },\n batch_size=32,\n epochs=3,\n)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 6.2. "},{"type":"codeinline","content":[{"type":"text","text":"dataset"}]},{"type":"text","text":" 類型數據示例代碼如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"# Generate dummy dataset data from numpy\ntrain_dataset = tf.data.Dataset.from_tensor_slices((\n (img_data, ts_data),\n (score_targets, class_targets),\n))\n\n# Alternatively generate with dict\ntrain_dataset = tf.data.Dataset.from_tensor_slices((\n {\n 'img_input': img_data,\n 'ts_input': ts_data,\n },\n {\n 'score_output': score_targets,\n 'class_output': class_targets,\n },\n))\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)\n\nmodel.fit(train_dataset, epochs=3)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"自定義訓練流程"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"如果你不想使用 "},{"type":"codeinline","content":[{"type":"text","text":"model"}]},{"type":"text","text":" 內置提供的 "},{"type":"codeinline","content":[{"type":"text","text":"fit"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"evaluate"}]},{"type":"text","text":" 方法,而想使用低階 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" 自定義模型的訓練和評估的流程,則可以藉助於 "},{"type":"codeinline","content":[{"type":"text","text":"GradientTape"}]},{"type":"text","text":" 來實現。深度神經網絡在後向傳播過程中需要計算損失 ("},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":") 關於權重矩陣的導數(也稱爲梯度),以更新權重矩陣並獲得最優解,而 "},{"type":"codeinline","content":[{"type":"text","text":"GradientTape"}]},{"type":"text","text":" 能自動提供求導幫助,無需我們手動求導,它本質上是一個"},{"type":"codeinline","content":[{"type":"text","text":"求導記錄器"}]},{"type":"text","text":" ,能夠記錄前項傳播的過程,並據此計算導數。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"2","normalizeStart":"2"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"模型的構建過程與之前相比沒有什麼不同,主要體現在訓練的部分,示例代碼如下:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"import tensorflow as tf\nfrom tensorflow import keras\nfrom tensorflow.keras import layers\nimport numpy as np\n\n# Get the model.\ninputs = keras.Input(shape=(784, ), name='digits')\nx = layers.Dense(64, activation='relu', name='dense_1')(inputs)\nx = layers.Dense(64, activation='relu', name='dense_2')(x)\noutputs = layers.Dense(10, name='predictions')(x)\nmodel = keras.Model(inputs=inputs, outputs=outputs)\n\n# Instantiate an optimizer.\noptimizer = keras.optimizers.SGD(learning_rate=1e-3)\n# Instantiate a loss function.\nloss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n\n# Prepare the metrics.\ntrain_acc_metric = keras.metrics.SparseCategoricalAccuracy()\nval_acc_metric = keras.metrics.SparseCategoricalAccuracy()\n\n# Prepare the training dataset.\nbatch_size = 64\ntrain_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))\ntrain_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)\n\n# Prepare the validation dataset.\nval_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))\nval_dataset = val_dataset.batch(64)\n\nepochs = 3\nfor epoch in range(epochs):\n print('Start of epoch %d' % (epoch, ))\n\n # Iterate over the batches of the dataset.\n for step, (x_batch_train, y_batch_train) in enumerate(train_dataset):\n\n # Open a GradientTape to record the operations run\n # during the forward pass, which enables autodifferentiation.\n with tf.GradientTape() as tape:\n\n # Run the forward pass of the layer.\n # The operations that the layer applies\n # to its inputs are going to be recorded\n # on the GradientTape.\n logits = model(x_batch_train,\n training=True) # Logits for this minibatch\n\n # Compute the loss value for this minibatch.\n loss_value = loss_fn(y_batch_train, logits)\n\n # Use the gradient tape to automatically retrieve\n # the gradients of the trainable variables with respect to the loss.\n grads = tape.gradient(loss_value, model.trainable_weights)\n\n # Run one step of gradient descent by updating\n # the value of the variables to minimize the loss.\n optimizer.apply_gradients(zip(grads, model.trainable_weights))\n\n # Update training metric.\n train_acc_metric(y_batch_train, logits)\n\n # Log every 200 batches.\n if step % 200 == 0:\n print('Training loss (for one batch) at step %s: %s' %\n (step, float(loss_value)))\n print('Seen so far: %s samples' % ((step + 1) * 64))\n\n # Display metrics at the end of each epoch.\n train_acc = train_acc_metric.result()\n print('Training acc over epoch: %s' % (float(train_acc), ))\n # Reset training metrics at the end of each epoch\n train_acc_metric.reset_states()\n\n # Run a validation loop at the end of each epoch.\n for x_batch_val, y_batch_val in val_dataset:\n val_logits = model(x_batch_val)\n # Update val metrics\n val_acc_metric(y_batch_val, val_logits)\n val_acc = val_acc_metric.result()\n val_acc_metric.reset_states()\n print('Validation acc: %s' % (float(val_acc), ))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"3","normalizeStart":"3"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"注意 "},{"type":"codeinline","content":[{"type":"text","text":"with tf.GradientTape() as tape"}]},{"type":"text","text":" 部分的實現,它記錄了前向傳播的過程,然後使用 "},{"type":"codeinline","content":[{"type":"text","text":"tape.gradient"}]},{"type":"text","text":" 方法計算出 "},{"type":"codeinline","content":[{"type":"text","text":"loss"}]},{"type":"text","text":" 關於模型所有權重矩陣 ("},{"type":"codeinline","content":[{"type":"text","text":"model.trainable_weights"}]},{"type":"text","text":") 的導數(也稱作梯度),接着利用優化器 ("},{"type":"codeinline","content":[{"type":"text","text":"optimizer"}]},{"type":"text","text":") 去更新所有的權重矩陣。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"4","normalizeStart":"4"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"在上述訓練流程中,模型的訓練指標在每個 "},{"type":"codeinline","content":[{"type":"text","text":"batch"}]},{"type":"text","text":" 的訓練中進行更新操作 ("},{"type":"codeinline","content":[{"type":"text","text":"update_state()"}]},{"type":"text","text":") ,在一個 "},{"type":"codeinline","content":[{"type":"text","text":"epoch"}]},{"type":"text","text":" 訓練結束後打印指標的結果 ("},{"type":"codeinline","content":[{"type":"text","text":"result()"}]},{"type":"text","text":") ,然後重置該指標 ("},{"type":"codeinline","content":[{"type":"text","text":"reset_states()"}]},{"type":"text","text":") 並進行下一輪的指標記錄,交叉驗證的指標也是同樣的操作。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"5","normalizeStart":"5"},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"注意與使用模型內置 "},{"type":"codeinline","content":[{"type":"text","text":"API"}]},{"type":"text","text":" 進行訓練不同,在自定義訓練中,模型中定義的損失,比如正則化損失以及通過 "},{"type":"codeinline","content":[{"type":"text","text":"add_loss"}]},{"type":"text","text":" 添加的損失,是不會自動累加在 "},{"type":"codeinline","content":[{"type":"text","text":"loss_fn"}]},{"type":"text","text":" 之內的。如果要包含這部分損失,則需要修改自定義訓練的流程,通過調用 "},{"type":"codeinline","content":[{"type":"text","text":"model.losses"}]},{"type":"text","text":" 來將模型的全部損失加入到要優化的損失中去。示例代碼如下所示:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"python"},"content":[{"type":"text","text":"with tf.GradientTape() as tape:\n logits = model(x_batch_train)\n loss_value = loss_fn(y_batch_train, logits)\n\n # Add extra losses created during this forward pass:\n loss_value += sum(model.losses)\ngrads = tape.gradient(loss_value, model.trainable_weights)\noptimizer.apply_gradients(zip(grads, model.trainable_weights))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"參考資料"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":"1","normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.tensorflow.org/guide/keras/trainandevaluate","title":"Keras 模型訓練與評估"},"content":[{"type":"text","text":"Keras 模型訓練與評估"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit","title":"Keras 模型 fit 方法"},"content":[{"type":"text","text":"Keras 模型 fit 方法"}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://www.tensorflow.org/api_docs/python/tf/data/Dataset","title":"tf.data.Dataset"},"content":[{"type":"text","text":"tf.data.Dataset"}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章