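Keras Documentation Quick Q&A (translated from Keras FAQ: Frequently Asked Keras Questions)

This article covers some common questions about Keras and is translated from the Keras documentation. The official documentation is updated over time, so it may differ from this translation; see the original for details: https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer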

Should I cite Keras?

If Keras helped you in your research, please cite it in your publications. The BibTeX entry is as follows:

@misc{chollet2015keras,
  title={Keras},
  author={Chollet, Fran\c{c}ois and others},
  year={2015},
  publisher={GitHub},
  howpublished={\url{https://github.com/fchollet/keras}},
}

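How can I run Keras on a GPU?

If you are running on the TensorFlow or CNTK backend, your code will automatically run on the GPU whenever an available GPU is detected.
If you are running on the Theano backend, you can use one of the following methods:
Method 1: use Theano flags.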

THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py

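The name 'gpu' might have to be changed depending on your device's identifier (e.g. 'gpu0', 'gpu1', etc.).
Method 2: set up your .theanorc file.
Method 3: manually set theano.config.device and theano.config.floatX at the beginning of your code: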

import theano
theano.config.device = 'gpu'
theano.config.floatX = 'float32'

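What do "sample", "batch", and "epoch" mean?

Below are some common definitions that you need to know and understand in order to use Keras correctly:
Sample: one element of a dataset. For example, one image is a sample in a convolutional network, and one audio file is a sample for a speech recognition model.
Batch: a set of N samples. The samples in a batch are processed independently, in parallel. During training, a batch results in only one update to the model. A batch generally approximates the distribution of the input data better than a single input does, and the larger the batch, the better the approximation; however, a larger batch also takes longer to process and still results in only one update. For inference (evaluation or prediction), it is recommended to pick a batch size as large as you can afford without running out of memory, since larger batches usually result in faster evaluation or prediction.
Epoch: an arbitrary cutoff, generally defined as one pass over the entire dataset. It separates training into distinct phases, which is useful for logging and periodic evaluation. When using validation_data or validation_split with model.fit, evaluation is run at the end of every epoch. Keras also lets you add callbacks designed to run at the end of an epoch, for example to change the learning rate or to checkpoint (save) the model.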
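
To make the relationship between samples, batches, and epochs concrete, the following plain-Python sketch counts how many batches (and therefore weight updates) one epoch contains. The helper names `updates_per_epoch` and `iter_batches`, as well as the figures of 1000 samples and a batch size of 32, are illustrative assumptions and not part of Keras.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of batches, and hence weight updates, in one pass over the data."""
    return math.ceil(num_samples / batch_size)


def iter_batches(samples, batch_size):
    """Yield consecutive batches; the last one may be smaller than batch_size."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


samples = list(range(1000))  # stand-in for a dataset of 1000 samples
batches = list(iter_batches(samples, batch_size=32))

print(updates_per_epoch(len(samples), 32))  # batches per epoch
print(len(batches[-1]))                     # size of the final, partial batch
```

With these numbers, one epoch consists of 31 full batches followed by one smaller final batch.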

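How can I save a Keras model?

It is not recommended to use pickle or cPickle to save a Keras model.
You can use model.save(filepath) to save a Keras model into a single HDF5 file, which will contain:
the architecture of the model, allowing the model to be re-created;
the weights of the model;
the training configuration (loss, optimizer, etc.);
the state of the optimizer, allowing training to resume exactly where it left off.
You can then use keras.models.load_model(filepath) to reinstantiate the model. load_model also takes care of compiling the model using the saved training configuration.
Example: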

from keras.models import load_model

model.save('my_model.h5')  # creates a HDF5 file 'my_model.h5'
del model  # deletes the existing model

# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')

If you only need to save the architecture of a model, and not its weights or its training configuration, you can do:

# save as JSON
json_string = model.to_json()

# save as YAML
yaml_string = model.to_yaml()

The generated JSON/YAML files are human-readable and can be edited by hand if needed. You can then build a fresh model from this data:

# model reconstruction from JSON:
from keras.models import model_from_json
model = model_from_json(json_string)

# model reconstruction from YAML
from keras.models import model_from_yaml
model = model_from_yaml(yaml_string)

If you need to save the weights of a model, you can do so in HDF5 (you will need to install HDF5 and the Python library h5py first):

model.save_weights('my_model_weights.h5')

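Assuming you have code that instantiates a model with the same architecture, you can then load the saved weights into it: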

model.load_weights('my_model_weights.h5')

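If you need to load weights into a different architecture (with some layers in common), for instance for fine-tuning or transfer learning, you can load weights by layer name: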

model.load_weights('my_model_weights.h5', by_name=True)

For example:

"""
Assume original model looks like this:
    model = Sequential()
    model.add(Dense(2, input_dim=3, name='dense_1'))
    model.add(Dense(3, name='dense_2'))
    ...
    model.save_weights(fname)
"""

from keras.models import Sequential
from keras.layers import Dense

# new model
model = Sequential()
model.add(Dense(2, input_dim=3, name='dense_1'))  # will be loaded
model.add(Dense(10, name='new_dense'))  # will not be loaded

# load weights from first model; will only affect the first layer, dense_1.
model.load_weights(fname, by_name=True)
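
The name-matching idea behind by_name=True can be sketched without Keras. The snippet below is only a conceptual illustration: the dictionaries, the weight values, and the helper `load_by_name` are hypothetical stand-ins, and details that real weight loading handles (such as weight shapes) are ignored.

```python
# Hypothetical stand-in for a saved weights file: layer name -> weights.
saved_weights = {
    "dense_1": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "dense_2": [[0.7, 0.8, 0.9], [1.0, 1.1, 1.2]],
}

# Layer names of the new model; None means "still at its initial value".
new_model_weights = {"dense_1": None, "new_dense": None}


def load_by_name(target, saved):
    """Copy weights only for layers whose names exist in both mappings."""
    loaded = []
    for name in target:
        if name in saved:
            target[name] = saved[name]
            loaded.append(name)
    return loaded


loaded_layers = load_by_name(new_model_weights, saved_weights)
print(loaded_layers)  # names of the layers that received saved weights
```

Only `dense_1` appears in both mappings, so it is the only layer updated; `new_dense` keeps its initial value and the saved `dense_2` weights are simply unused.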

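Why is the training loss much higher than the testing loss?

A Keras model has two modes: training and testing. Regularization mechanisms, such as Dropout and L1/L2 weight regularization, are turned off at testing time.
In addition, the training loss is the average of the losses over each batch of training data. Because the model is changing over time, the loss over the first batches of an epoch is generally higher than over the last batches. The testing loss for an epoch, on the other hand, is computed using the model as it is after the last batch, which results in a lower loss.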
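
The averaging effect can be reproduced with a toy model in plain Python. This is a sketch under stated assumptions, not Keras code: a single weight `w` is fit to made-up data following y = 2x with one sample per batch, and the learning rate of 0.05 is arbitrary. The same data is used for both numbers so that only the averaging effect is visible.

```python
# Made-up data following y = 2 * x; each sample is treated as one batch.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

w = 0.0    # single trainable weight, deliberately initialised far from 2.0
lr = 0.05  # arbitrary learning rate chosen for this illustration

batch_losses = []
for x, y in data:
    pred = w * x
    batch_losses.append((pred - y) ** 2)  # loss measured before this update
    grad = 2 * (pred - y) * x             # d(loss)/dw for squared error
    w -= lr * grad                        # one SGD update per batch

# What a training log would report: the mean of the per-batch losses.
reported_training_loss = sum(batch_losses) / len(batch_losses)

# Loss of the final model, evaluated once the epoch is over.
end_of_epoch_loss = sum((w * x - y) ** 2 for x, y in data) / len(data)

print(reported_training_loss, end_of_epoch_loss)
```

The reported training loss is dominated by the early batches, when `w` was still far from its target, so it comes out much larger than the loss of the model as it stands at the end of the epoch.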

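How can I obtain the output of an intermediate layer?

One simple way is to create a new Model that outputs the layer you are interested in: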

from keras.models import Model

model = ...  # create the original model

layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)

Alternatively, you can build a Keras function that returns the output of a certain layer given a certain input, for example:

from keras import backend as K

# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
                                  [model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]

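Similarly, you could build a Theano or TensorFlow function directly.
Note that if your model behaves differently in the training and testing phases (e.g. if it uses Dropout or BatchNormalization), you need to pass the learning phase flag to your function: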

get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                                  [model.layers[3].output])

# output in test mode = 0
layer_output = get_3rd_layer_output([x, 0])[0]

# output in train mode = 1
layer_output = get_3rd_layer_output([x, 1])[0]

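How can I use Keras with datasets that don't fit in memory?

You can do batch training using model.train_on_batch(x, y) and model.test_on_batch(x, y).
Alternatively, you can write a generator that yields batches of training data and use the method model.fit_generator(data_generator, steps_per_epoch, epochs).
You can see batch training in action in the CIFAR10 example: https://github.com/fchollet/keras/blob/master/examples/cifar10_cnn.py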
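
A minimal sketch of such a generator, in plain Python, is shown below. The names `batch_generator` and `toy_loader` are hypothetical, and the loader stands in for whatever reads one sample from disk; in real use the lists would typically be converted to NumPy arrays before being yielded.

```python
import math


def batch_generator(load_sample, num_samples, batch_size):
    """Yield (x_batch, y_batch) tuples forever, loading one batch at a time."""
    while True:  # generators used for training are expected to loop indefinitely
        for start in range(0, num_samples, batch_size):
            stop = min(start + batch_size, num_samples)
            pairs = [load_sample(i) for i in range(start, stop)]
            x_batch = [x for x, _ in pairs]
            y_batch = [y for _, y in pairs]
            yield x_batch, y_batch


def toy_loader(index):
    """Hypothetical loader: pretend to read sample `index` from disk."""
    return [float(index)], index % 2


num_samples, batch_size = 10, 4
steps_per_epoch = math.ceil(num_samples / batch_size)

gen = batch_generator(toy_loader, num_samples, batch_size)
first_x, first_y = next(gen)
print(steps_per_epoch, first_x, first_y)
```

Only one batch is held in memory at a time, and `steps_per_epoch` tells the training loop how many batches make up one pass over the data.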

How can I interrupt training when the validation loss isn't decreasing anymore?

Use the EarlyStopping callback:

from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(monitor='val_loss', patience=2)
model.fit(x, y, validation_split=0.2, callbacks=[early_stopping])
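
The effect of patience=2 can be illustrated with a simplified plain-Python sketch. It is not the library's implementation: it assumes that "improvement" means any strict decrease of the monitored value, the function name `epochs_run` is hypothetical, and the validation losses are made-up numbers.

```python
def epochs_run(val_losses, patience=2):
    """Return how many epochs run before stopping on a stalled validation loss."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # training stops at the end of this epoch
    return len(val_losses)


# Made-up validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.60, 0.65, 0.62, 0.50]
print(epochs_run(val_losses, patience=2))
```

Here the loss stops improving after epoch 3, so training ends after epoch 5 and the later improvement at epoch 6 is never reached; a larger patience trades longer training for tolerance to such plateaus.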

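How is the validation split computed?

If you set the validation_split argument of model.fit to, say, 0.1, the validation data will be the last 10% of the data; if you set it to 0.25, it will be the last 25% of the data, and so on. Note that the data is not shuffled before the validation split is extracted, so the validation set is literally the last x% of the samples in the input you passed.
The same validation set is used for all epochs.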
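
As a rough sketch of this rule in plain Python (the function name `split_validation` and the toy list of eight samples are assumptions made for illustration):

```python
def split_validation(samples, validation_split):
    """Split off the last fraction of the samples, without shuffling."""
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


samples = list(range(8))  # stand-in for eight ordered samples
train, validation = split_validation(samples, validation_split=0.25)
print(train, validation)
```

Because the tail of the data becomes the validation set, data that is sorted (for example by class) should be shuffled before being passed in, otherwise the validation set may not be representative.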

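Is the data shuffled during training?

Yes: if the shuffle argument of model.fit is set to True (the default), the training data is randomly shuffled at each epoch.
Validation data is never shuffled.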
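
The behaviour described here can be pictured with a short plain-Python sketch. The helper `epoch_orders`, the fixed seed, and the toy sample lists are assumptions for illustration only.

```python
import random


def epoch_orders(train_samples, epochs, seed=0):
    """Return the order in which training samples are visited in each epoch."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    orders = []
    for _ in range(epochs):
        order = list(train_samples)
        rng.shuffle(order)  # a fresh permutation for every epoch
        orders.append(order)
    return orders


train = list(range(6))
validation = [6, 7]  # kept in its original order throughout
orders = epoch_orders(train, epochs=3)
for order in orders:
    print(order)
```

Each epoch sees every training sample exactly once, only in a different order, while the validation samples are left untouched.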

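How can I record the training / validation loss / accuracy at each epoch?

The model.fit method returns a History callback, whose history attribute contains the lists of successive losses and other metrics.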

hist = model.fit(x, y, validation_split=0.2)
print(hist.history)
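
The history attribute is a dictionary that maps each metric name to a list with one value per epoch. The sketch below works on a hand-written dictionary of that shape; the numbers are made up for illustration and do not come from a real training run.

```python
# Hand-written stand-in for hist.history after three epochs (made-up values).
history = {
    "loss": [0.90, 0.60, 0.45],
    "val_loss": [0.95, 0.70, 0.72],
}

val_losses = history["val_loss"]

# Epochs are numbered from 1, list indices from 0.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__) + 1

print("best epoch:", best_epoch)
print("training loss at that epoch:", history["loss"][best_epoch - 1])
```

The same pattern can be used to plot learning curves or to pick the epoch at which the validation loss was lowest.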

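How can I freeze Keras layers?

To freeze a layer means to exclude it from training, i.e. its weights will never be updated. This is useful when fine-tuning a model, or when using fixed embeddings for a text input.
You can pass a trainable argument (boolean) to a layer constructor to make the layer non-trainable: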

frozen_layer = Dense(32, trainable=False)

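Additionally, you can set the trainable property of a layer to True or False after instantiation. For this to take effect, you need to call compile() on your model after modifying the trainable property. For example: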

from keras.layers import Input, Dense
from keras.models import Model

x = Input(shape=(32,))
layer = Dense(32)
layer.trainable = False
y = layer(x)

frozen_model = Model(x, y)
# in the model below, the weights of `layer` will not be updated during training
frozen_model.compile(optimizer='rmsprop', loss='mse')

layer.trainable = True
trainable_model = Model(x, y)
# with this model the weights of the layer will be updated during training
# (which will also affect the above model since it uses the same layer instance)
trainable_model.compile(optimizer='rmsprop', loss='mse')

frozen_model.fit(data, labels)  # this does NOT update the weights of `layer`
trainable_model.fit(data, labels)  # this updates the weights of `layer`

How can I remove a layer from a Sequential model?

You can remove the last added layer in a Sequential model by calling .pop():

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(32, activation='relu'))

print(len(model.layers))  # "2"

model.pop()
print(len(model.layers))  # "1"

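How can I use pre-trained models in Keras?

Code and pre-trained weights are available for the following image classification models:
Xception
VGG16
VGG19
ResNet50
Inception v3
They can be imported from the keras.applications module: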

from keras.applications.xception import Xception
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.applications.resnet50 import ResNet50
from keras.applications.inception_v3 import InceptionV3

model = VGG16(weights='imagenet', include_top=True)

For a few simple usage examples, see the documentation for the Applications module: https://keras.io/applications/
For a detailed example of how to use such pre-trained models for feature extraction or for fine-tuning, see the following blog post: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
The VGG16 model is also the basis for several Keras example scripts:
Style transfer:
https://github.com/fchollet/keras/blob/master/examples/neural_style_transfer.py
Feature visualization:
https://github.com/fchollet/keras/blob/master/examples/conv_filter_visualization.py
Deep dream:
https://github.com/fchollet/keras/blob/master/examples/deep_dream.py

How can I use HDF5 inputs with Keras?

You can use the HDF5Matrix class from keras.utils.io_utils:
https://keras.io/utils/#hdf5matrix
You can also use an HDF5 dataset directly:

import h5py
with h5py.File('input/file.hdf5', 'r') as f:
    x_data = f['x_data']
    model.predict(x_data)
