Weights from a Keras model trained on multiple GPUs cannot be used on a CPU or single-GPU machine - Effective DeepLearning

Original

2020-06-24 17:32

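Our team recently got a multi-GPU machine. After duly admiring it, I wasted no time putting it to work training a classification model. Training got faster and saved a lot of time, but in the middle of celebrating I discovered that the weight file it produced fails to load on my local CPU machine, so it cannot be used for prediction. The error is: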

2018-07-19 18:07:21.957595: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "/Users/ICD/Documents/source/mofang_UserActivity/UserActivityAnlyse/branches/tensorflow/server/vehicle_django/util/test.py", line 105, in <module>
    verify_h5model(args)
  File "/Users/ICD/Documents/source/mofang_UserActivity/UserActivityAnlyse/branches/tensorflow/server/vehicle_django/util/test.py", line 42, in verify_h5model
    inceptionV3_model.load_weights(weights_path, by_name=True)
  File "/Users/ICD/Documents/workspace/vm/tensorflow_imageRetrain/lib/python3.6/site-packages/keras/engine/network.py", line 1177, in load_weights
    reshape=reshape)
  File "/Users/ICD/Documents/workspace/vm/tensorflow_imageRetrain/lib/python3.6/site-packages/keras/engine/saving.py", line 1000, in load_weights_from_hdf5_group_by_name
    ' element(s).')
ValueError: Layer #4 (named "predictions") expects 2 weight(s), but the saved weights have 0 element(s).

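According to the error, Layer #4 (named "predictions") expects 2 weight(s), but the saved weights have 0 element(s): the layer should get two weights, yet none can be found. Two weights?! The new machine happens to have two GPUs, so I suspected the problem was GPU-related. (The 2 expected weights are the kernel and bias of the predictions Dense layer rather than one per GPU, but the hunch turned out to be right.) We train with Keras. After a miserable day reading the Keras source, I found the following. A Keras model can have several output branches, that is, several losses. Each branch is usually given a name when the network is defined, and at compile time the layers of the different branches are looked up by name. With single-GPU training these names follow a fixed pattern; main_output and aux_output, for example, are layer names defined by hand. But once the model goes through keras.utils.training_utils.multi_gpu_model(), the names are replaced automatically with defaults such as concatenate_1, concatenate_2, and so on.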

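So when training is done and you run prediction on a single-GPU or CPU machine with that weight file, the entries in the file are all named xx_1, xx_2, and so on. Keras cannot find weights under the layer names it expects, and there is no way to go any further.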
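One way to check this, sketched under a few assumptions, is to list the layer names stored in the weight file and compare them with the layer names of the model you are loading into. Both helper names below are hypothetical, and the layer names in the usage lines are made up for illustration rather than read from a real file. The sketch assumes the file was written by save_weights() or save(), where Keras keeps the names in the layer_names attribute of the HDF5 file (inside a model_weights group for full-model files).

```python
def saved_layer_names(weights_path):
    """Read the layer names stored in a Keras HDF5 weight file."""
    import h5py  # imported lazily so the comparison helper works without it

    with h5py.File(weights_path, "r") as f:
        # Files written by model.save() keep the weights in a sub-group.
        group = f["model_weights"] if "model_weights" in f else f
        return [n.decode("utf8") if isinstance(n, bytes) else str(n)
                for n in group.attrs["layer_names"]]


def layers_without_saved_weights(model_layer_names, file_layer_names):
    """Return model layer names that have no entry of the same name in the file."""
    saved = set(file_layer_names)
    return [name for name in model_layer_names if name not in saved]


# Illustrative names only, not read from a real file.
model_names = ["input_1", "features", "predictions"]
file_names = ["input_1", "lambda_1", "lambda_2", "model_1", "concatenate_1"]
print(layers_without_saved_weights(model_names, file_names))
```

On a real file you would pass saved_layer_names(weights_path) and [layer.name for layer in model.layers] instead of the hand-written lists.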

======================================Solution==================================================

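If you read Keras's multi_gpu_model() you will see that it names those layers according to the gpus=n argument you pass and the number of GPUs the machine actually has. So the approach I came up with is to load the weights on the dual-GPU training machine and save them again as a TensorFlow model file that does not depend on the number of GPUs, which can then be used on single-GPU machines. This is a roundabout fix; if you know a more direct one, please let me know. The export code is as follows.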

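import os
import sys

import tensorflow as tf  # TensorFlow 1.x API
from keras import backend as K
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense
from keras.models import Model
from keras.utils import multi_gpu_model
from tensorflow.python.framework import graph_io
from tensorflow.python.framework.graph_util import convert_variables_to_constants


def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """Convert the session's variables to constants and return the frozen GraphDef."""
    graph = session.graph
    with graph.as_default():
        # Freeze every global variable except the ones the caller wants to keep.
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        # Copy the list so the caller's argument is not modified.
        output_names = list(output_names or [])
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            # Drop device placements so the graph does not depend on specific GPUs.
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = convert_variables_to_constants(session, input_graph_def,
                                                      output_names, freeze_var_names)
        return frozen_graph


input_fld = sys.path[0]
weights_path = "/InceptionV3_best_vehicleModel_41.h5"  # adjust to where the weight file lives
output_graph_name = 'vehicle_model_gpu_2.pb'

output_fld = input_fld + '/tensorflow_model/'

# Build the template model on the CPU, with the same layer names used in training.
with tf.device("/cpu:0"):
    inception = InceptionV3(include_top=False, weights=None,
                            input_tensor=None, pooling='avg', input_shape=(299, 299, 3))
    output = inception.get_layer(index=-1).output  # shape=(None, 2048) because pooling='avg'
    output = Dense(1024, name='features')(output)
    # output = Dense(41, activation='softmax', name='predictions_request')(output)
    output = Dense(41, activation='softmax', name='predictions')(output)
    inception3_model = Model(outputs=output, inputs=inception.input)

# Wrap the model the same way as in training so the saved layer names match.
inception3_model = multi_gpu_model(inception3_model, gpus=2)
inception3_model.load_weights(weights_path, by_name=True)

print('input is :', inception3_model.input.name)
print('output is:', inception3_model.output.name)

sess = K.get_session()

# Freeze the graph up to the model output and write it as a binary .pb file.
frozen_graph = freeze_session(sess, output_names=[inception3_model.output.op.name])

graph_io.write_graph(frozen_graph, output_fld, output_graph_name, as_text=False)

print('saved the constant graph (ready for inference) at: ', os.path.join(output_fld, output_graph_name))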
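As for a more direct route: the Keras documentation for multi_gpu_model advises saving with the template model (the one passed to multi_gpu_model) rather than with the parallel model it returns, since the two share the same weights. A minimal sketch of that idea follows. find_template_model is a hypothetical helper, not a Keras API; it assumes the wrapper contains exactly one nested model among its layers, and the filename in the usage comment is made up.

```python
def find_template_model(parallel_model):
    """Return the single nested model inside a multi_gpu_model wrapper.

    Hypothetical helper: a nested model is recognised by having its own
    `layers` attribute, which is assumed to be absent on plain layers.
    """
    nested = [layer for layer in parallel_model.layers if hasattr(layer, "layers")]
    if len(nested) != 1:
        raise ValueError("expected exactly one nested model, found %d" % len(nested))
    return nested[0]


# Usage on the training machine (assumed, not run here):
#   template = find_template_model(inception3_model)
#   template.save_weights("template_weights.h5")  # hypothetical filename
```

Weights saved this way should carry the original layer names such as features and predictions, so they could be loaded into the plain model on a CPU or single-GPU machine. When you still have the training script, keeping a reference to the template model and calling save_weights() on it directly avoids the lookup altogether.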
