How to fix TensorFlow training that keeps getting slower: reset/clear the computation graph

Original post

2020-06-14 12:40

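While training with TensorFlow, each group of data took longer than the one before. The run log shows the time per video rising from about 9 s to 165 s and then to 207 s. Each group contains 81 videos, so the time per group went from about 12 minutes to more than 3 hours and then more than 4 hours. Only a few groups were left to train, but this speed was simply unbearable.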

#·····································Run log 1····································#

100%|##################################################################################| 81/81 [12:56<00:00,  9.59s/it]
Predicting on frame of L:\collect_program\sign_language\sign-language-gesture-recognition-master\dataset\frames\train-frames\s10-Enemy

100%|##################################################################################| 81/81 [13:54<00:00, 10.30s/it]
Predicting on frame of L:\collect_program\sign_language\sign-language-gesture-recognition-master\dataset\frames\train-frames\s11-Son

100%|##################################################################################| 81/81 [16:42<00:00, 12.38s/it]
Predicting on frame of L:\collect_program\sign_language\sign-language-gesture-recognition-master\dataset\frames\train-frames\s12-Man


~~~~~~~ (omitted) ~~~~~~

100%|###############################################################################| 81/81 [3:29:29<00:00, 155.18s/it]
Predicting on frame of s3-Green

100%|###############################################################################| 81/81 [3:43:52<00:00, 165.83s/it]
Predicting on frame of s30-Birthday

100%|###############################################################################| 81/81 [3:55:48<00:00, 174.67s/it]
Predicting on frame of s31-Breakfast

100%|###############################################################################| 81/81 [4:08:02<00:00, 183.74s/it]
Predicting on frame of s32-Photo

100%|###############################################################################| 81/81 [4:21:15<00:00, 193.53s/it]
Predicting on frame of s33-Hungry

100%|###############################################################################| 81/81 [4:39:47<00:00, 207.26s/it]
Predicting on frame of s34-Map

#··················································································#

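Looking at the program again, I found that it does not use feed and fetch: data is not passed in through ops and placeholders. As a result, the computation graph keeps getting bigger, and the training time keeps climbing with it.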
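To make the cause concrete, here is a minimal sketch of how a graph grows when new ops are created inside a loop. It is a toy stand-in, not the author's program: the function name `count_ops_after_each_step` and the `tf.square` computation are made up for illustration, and the code assumes a TensorFlow build that provides `tensorflow.compat.v1`.

```python
import tensorflow.compat.v1 as tf


def count_ops_after_each_step(num_steps):
    """Create new ops on every step and record how large the graph gets."""
    graph = tf.Graph()
    sizes = []
    with graph.as_default(), tf.Session(graph=graph) as sess:
        for step in range(num_steps):
            # Anti-pattern: these calls add new nodes to the graph
            # on every iteration instead of reusing existing ones.
            result = tf.square(tf.constant(float(step)))
            sess.run(result)
            sizes.append(len(graph.get_operations()))
    return sizes


sizes = count_ops_after_each_step(5)
print(sizes)  # the op count rises with every step
```

The nodes from earlier iterations are never removed, so the graph that the session has to handle gets larger on every pass through the loop.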

Solution 1: follow the standard TensorFlow pattern strictly and pass data with feed and fetch (not tested)
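A minimal sketch of what Solution 1 could look like, assuming the TensorFlow 1.x-style API exposed by `tensorflow.compat.v1`. The graph is built once with a placeholder, and each value is then fed in through `feed_dict` and fetched with `sess.run`. The function `run_with_placeholder` and the `tf.square` computation are hypothetical stand-ins for the real model, which this post does not show.

```python
import tensorflow.compat.v1 as tf


def run_with_placeholder(values):
    """Build the graph once, then feed each value and fetch the result."""
    graph = tf.Graph()
    outputs, sizes = [], []
    with graph.as_default():
        # Build phase: runs once, outside the loop.
        x = tf.placeholder(tf.float32, shape=[], name="x")
        y = tf.square(x)
        with tf.Session(graph=graph) as sess:
            for v in values:
                # Run phase: feed data in, fetch the result out.
                # No new ops are created here.
                outputs.append(float(sess.run(y, feed_dict={x: v})))
                sizes.append(len(graph.get_operations()))
    return outputs, sizes


outputs, sizes = run_with_placeholder([1.0, 2.0, 3.0])
print(outputs, sizes)  # the op count stays the same on every step
```

Calling `graph.finalize()` after the build phase is one way to catch this kind of bug early, because any later attempt to add an op then raises an error.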

Solution 2: after each group of data finishes training, clear and reset the computation graph (tested, it works)

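import tensorflow as tf  # TensorFlow 1.x API (tf.GraphDef, tf.reset_default_graph)


def load_graph(model_file):
    """Load a serialized GraphDef from model_file into a fresh tf.Graph."""
    graph = tf.Graph()
    graph_def = tf.GraphDef()

    # Read the serialized graph definition from disk.
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    # Import the nodes into the new graph rather than the default graph.
    with graph.as_default():
        tf.import_graph_def(graph_def)

    return graph


#·····································#
# Run after each group of data: discard the old default graph and reload the model.
# model_file is defined elsewhere in the program.
tf.reset_default_graph()
graph = load_graph(model_file)
print("reset graph")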
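The effect of the reset can be checked without a model file. The sketch below is a toy, not the original training script: `process_group` and `run_groups` are hypothetical helpers, and each "video" just adds two small ops to the default graph, the way the slow program kept adding nodes. It assumes `tensorflow.compat.v1` is available and turns off eager execution so that ops land in the default graph.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # TF 1.x graph mode


def process_group(num_videos):
    """Stand-in for one group: adds new ops for every video."""
    with tf.Session() as sess:
        for i in range(num_videos):
            sess.run(tf.square(tf.constant(float(i))))
    # Size of the default graph once the group is done.
    return len(tf.get_default_graph().get_operations())


def run_groups(num_groups, num_videos, reset):
    """Process several groups, optionally resetting the graph before each."""
    sizes = []
    for _ in range(num_groups):
        if reset:
            # Called outside any session or graph context.
            tf.reset_default_graph()
        sizes.append(process_group(num_videos))
    return sizes


tf.reset_default_graph()
without_reset = run_groups(3, 4, reset=False)
with_reset = run_groups(3, 4, reset=True)
print(without_reset)  # grows from group to group
print(with_reset)     # same size for every group
```

Note that `tf.reset_default_graph()` is called only after the previous session has been closed; calling it while a session is still active is not supported.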

Run log after the change:

100%|##################################################################################| 81/81 [12:06<00:00,  8.97s/it]
Predicting on frame of s10-Enemy

reset graph
100%|##################################################################################| 81/81 [11:50<00:00,  8.77s/it]
Predicting on frame of s11-Son

reset graph
100%|##################################################################################| 81/81 [11:49<00:00,  8.76s/it]
Predicting on frame of s12-Man

reset graph
100%|##################################################################################| 81/81 [11:56<00:00,  8.85s/it]
Predicting on frame of s13-Away
