Weight files from Keras multi-GPU training cannot be used on a CPU or single-GPU machine - Effective DeepLearning

Original

2020-06-24 17:32

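Our team recently got a multi-GPU machine. After the appropriate amount of joyful worship, we wasted no time putting it to work training a classification model. Training got faster and saved a lot of time, but in the middle of the celebration I suddenly found that the weight file it produced raised an error when loaded on my local CPU machine and could not be used for prediction. The error output was: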

2018-07-19 18:07:21.957595: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "/Users/ICD/Documents/source/mofang_UserActivity/UserActivityAnlyse/branches/tensorflow/server/vehicle_django/util/test.py", line 105, in <module>
    verify_h5model(args)
  File "/Users/ICD/Documents/source/mofang_UserActivity/UserActivityAnlyse/branches/tensorflow/server/vehicle_django/util/test.py", line 42, in verify_h5model
    inceptionV3_model.load_weights(weights_path, by_name=True)
  File "/Users/ICD/Documents/workspace/vm/tensorflow_imageRetrain/lib/python3.6/site-packages/keras/engine/network.py", line 1177, in load_weights
    reshape=reshape)
  File "/Users/ICD/Documents/workspace/vm/tensorflow_imageRetrain/lib/python3.6/site-packages/keras/engine/saving.py", line 1000, in load_weights_from_hdf5_group_by_name
    ' element(s).')
ValueError: Layer #4 (named "predictions") expects 2 weight(s), but the saved weights have 0 element(s).

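According to the message, Layer #4 (named "predictions") expects 2 weight(s), but the saved weights have 0 element(s): the layer wants two weights and the file supplies none. Two weights?! The new machine happens to have two GPUs, so I suspected the problem was GPU-related. (The 2 weights a Dense layer expects are in fact its kernel and bias, so the match with the GPU count is a coincidence, but the suspicion still pointed in the right direction.) We train with Keras. After a miserable day reading the Keras source, I found the following. A Keras model can have several output branches, that is, several losses. Each branch is usually given a name when the network is defined, and at compile time the layers of the different branches are looked up by those names. In single-GPU training the names stay exactly as defined; main_output and aux_output, for example, are layer names chosen by hand. Once keras.utils.training_utils.multi_gpu_model() is used, however, the names are replaced automatically with defaults such as concatenate_1, concatenate_2, and so on.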

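So when training is done and you try to use that weight file for prediction on a single-GPU or CPU machine, the paths inside the file are all of the form xx_1, xx_2. Keras cannot find the weight names it is looking for, and loading cannot proceed.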
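To make the failure concrete, here is a minimal, library-free sketch of name-based weight loading. It is not the Keras source, only a simplified model of the matching rule. The function name load_by_name and the layer layout of the saved file (input_1, lambda_1, lambda_2, model_1, predictions) are assumptions chosen for illustration, not something read from the actual .h5 file.

```python
# Simplified model of loading weights "by name". NOT the Keras implementation.

def load_by_name(saved_layers, model_layers):
    """Match saved weights to model layers by layer name.

    saved_layers: list of (layer_name, list_of_saved_weights), in file order
    model_layers: dict mapping layer_name -> number of weights the layer expects
    Returns a dict layer_name -> weights for every layer that was matched.
    """
    loaded = {}
    for index, (name, weights) in enumerate(saved_layers):
        if name not in model_layers:
            continue  # names the model does not know are silently skipped
        expected = model_layers[name]
        if len(weights) != expected:
            raise ValueError(
                'Layer #%d (named "%s") expects %d weight(s), '
                'but the saved weights have %d element(s).'
                % (index, name, expected, len(weights)))
        loaded[name] = weights
    return loaded


# ASSUMED layout of a file saved from the multi-GPU wrapper: the real weights
# sit inside one nested layer, and the layer called "predictions" is a merge
# layer that owns no weights of its own.
saved_from_multi_gpu = [
    ("input_1", []),
    ("lambda_1", []),
    ("lambda_2", []),
    ("model_1", ["kernel", "bias", "..."]),  # placeholders for the nested weights
    ("predictions", []),
]

# The single-device model: both Dense layers expect a kernel and a bias.
single_device_model = {"input_1": 0, "features": 2, "predictions": 2}

try:
    load_by_name(saved_from_multi_gpu, single_device_model)
except ValueError as err:
    print(err)
```

Run as-is, this prints the same message as the traceback above: a layer that carries no weights in the file is matched by name against a Dense layer that needs a kernel and a bias.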

======================================Solution==================================================

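If you read the Keras multi_gpu_model() method, you will see that it names those layers according to the gpus=n argument you pass and the number of GPUs the machine actually has. My workaround is therefore to load the weights on the dual-GPU training machine and save them again as a TensorFlow model file that does not depend on the number of GPUs, so that it can be used on other single-GPU machines. This is a roundabout approach; if you know a more direct fix, please let me know. The export code is as follows.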

import os
import sys

import tensorflow as tf
from keras import backend as K
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense
from keras.models import Model
from keras.utils import multi_gpu_model
from tensorflow.python.framework import graph_io
from tensorflow.python.framework.graph_util import convert_variables_to_constants


def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """Convert the session's variables to constants and return the frozen GraphDef."""
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            # Drop device placements so the graph is not tied to particular GPUs.
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                      output_names, freeze_var_names)
        return frozen_graph


input_fld = sys.path[0]
# Weight file saved by the multi-GPU training run.
weights_path = "/InceptionV3_best_vehicleModel_41.h5"
output_graph_name = 'vehicle_model_gpu_2.pb'

output_fld = input_fld + '/tensorflow_model/'

# Build the template model on the CPU, defined the same way as for training.
with tf.device("/cpu:0"):
    inception = InceptionV3(include_top=False, weights=None,
                            input_tensor=None, pooling='avg', input_shape=(299, 299, 3))
    output = inception.get_layer(index=-1).output  # shape=(None, 2048) because pooling='avg'
    output = Dense(1024, name='features')(output)
    # output = Dense(41, activation='softmax', name='predictions_request')(output)
    output = Dense(41, activation='softmax', name='predictions')(output)
    inception3_model = Model(outputs=output, inputs=inception.input)

# Wrap the model as in training so the layer names match those in the weight file.
inception3_model = multi_gpu_model(inception3_model, gpus=2)
inception3_model.load_weights(weights_path, by_name=True)

print('input is :', inception3_model.input.name)
print('output is:', inception3_model.output.name)

sess = K.get_session()

# Freeze the graph, keeping the model output as the graph output.
frozen_graph = freeze_session(sess, output_names=[inception3_model.output.op.name])

graph_io.write_graph(frozen_graph, output_fld, output_graph_name, as_text=False)

print('saved the constant graph (ready for inference) at: ', os.path.join(output_fld, output_graph_name))
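
As for a more direct route: the Keras documentation for multi_gpu_model recommends calling save() or save_weights() on the template model (the one passed into multi_gpu_model) rather than on the wrapper it returns, which yields a file with the original layer names. When only the wrapper is at hand, the template can be dug out of it. The helper below is a sketch under assumptions: find_template_model is a hypothetical name, and it assumes the wrapper holds exactly one nested model, recognised by having its own layers attribute. The stand-in objects exist only so the snippet runs without Keras or a GPU.

```python
from types import SimpleNamespace


def find_template_model(parallel_model):
    """Return the single nested model wrapped inside a multi-GPU model.

    ASSUMPTION: the wrapper exposes a `layers` list, and exactly one of those
    layers is itself a model, i.e. it has a `layers` attribute of its own.
    """
    nested = [layer for layer in parallel_model.layers if hasattr(layer, "layers")]
    if len(nested) != 1:
        raise ValueError("expected exactly one nested model, found %d" % len(nested))
    return nested[0]


# Stand-ins that mimic the shape of a wrapped model (hypothetical layer names).
fake_template = SimpleNamespace(name="model_1", layers=["conv", "dense"])
fake_parallel = SimpleNamespace(layers=[
    SimpleNamespace(name="input_1"),
    SimpleNamespace(name="lambda_1"),
    SimpleNamespace(name="lambda_2"),
    fake_template,
    SimpleNamespace(name="predictions"),
])

print(find_template_model(fake_parallel).name)  # model_1

# Intended use on the multi-GPU machine, after load_weights (hypothetical file name):
#   template = find_template_model(inception3_model)
#   template.save_weights("InceptionV3_single_device.h5")
```

A weight file saved this way should load into the plain single-device model definition, without the detour through a frozen graph. It is worth checking on a small model first.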
