Initializing LSTM weight and bias parameters in TensorFlow


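Introduction:

The previous post showed how to visualize each layer of a neural network. The simple approach is to use the trained parameters as the network's initial parameters and run a forward pass. For LSTM, the official documentation shows that the only parameter that can be set through an initializer is the weight. For visualization we usually need to pass in both weight and bias, and loading model parameters to continue training also requires loading both w and b.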

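Initializing the LSTM weight

Initializing w is fairly simple: pass it in as a constant. The get_parameter function used here reads model parameters into numpy arrays; see the previous post. Because this is a two-layer LSTM and each layer has its own parameters, the layers are set up separately, each with a constant initializer for w.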

    # get_parameter (from the previous post) reads one checkpoint variable into a numpy array.
    # Order: [layer 0 kernel, layer 0 bias, layer 1 kernel, layer 1 bias]
    multi_rnn_cells = [get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/bias'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/bias')]
    # rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [rnn_size] * rnn_num_layers]
    # The initializer argument only sets the kernel (weight); the bias is assigned later.
    rnn_layers = [tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[0], tf.float32)),
                  tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[2], tf.float32))]
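A constant initializer only works if the loaded array has the shape of the variable it fills, so it can help to check the shapes before building the cells. The sketch below is plain Python and assumes an LSTMCell without projection, whose kernel has shape (input_depth + num_units, 4 * num_units) and whose bias has shape (4 * num_units,). The helper names and the input depth of 32 are hypothetical; only the two 64-unit layers come from the code above.

```python
def lstm_param_shapes(input_depth, num_units):
    """Expected (kernel, bias) shapes for one LSTM layer without projection."""
    # The kernel multiplies the concatenation [input, previous hidden state]
    # and produces the four gate pre-activations at once.
    kernel_shape = (input_depth + num_units, 4 * num_units)
    bias_shape = (4 * num_units,)
    return kernel_shape, bias_shape


def stacked_lstm_shapes(input_depth, layer_sizes):
    """Expected shapes for every layer of a stacked LSTM."""
    shapes = []
    depth = input_depth
    for num_units in layer_sizes:
        shapes.append(lstm_param_shapes(depth, num_units))
        # The next layer receives this layer's output as its input.
        depth = num_units
    return shapes


# Hypothetical input depth of 32 with the two 64-unit layers used in this post.
for kernel_shape, bias_shape in stacked_lstm_shapes(32, [64, 64]):
    print(kernel_shape, bias_shape)
```

Comparing these tuples with multi_rnn_cells[i].shape before constructing the cells would catch a mismatched checkpoint early.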

The following code checks whether the weight and bias values have been set:

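    init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    sess = tf.Session()
    sess.run(init)
    # Names of all trainable variables; sess.run accepts tensor names as fetches,
    # so fetching tv returns the current values of those variables.
    tv = [v.name for v in tf.trainable_variables()]
    outs = sess.run([tv, outputs])
    print("output: ", outs)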

The output shows that the weight has been assigned successfully. The bias has not: the LSTM cell initializes it to all zeros.

Initializing the LSTM bias

The bias can be assigned with tf.assign.

First, list the variable names of the LSTM weight and bias with the code below. The bias variable names have the form 'rnn/multi_rnn_cell/cell_0/lstm_cell/bias'.

    init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    sess = tf.Session()
    sess.run(init)
    tv = [v.name for v in tf.trainable_variables()]
    print("tv: ", tv)


Result:

    tv:  ['rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0',
          'rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0',
          'rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0',
          'rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0']
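The graph names listed above differ from the names passed to get_parameter in two ways: they carry a ':0' suffix and lack the 'generator_model/' prefix. A minimal sketch of that mapping in plain Python follows. The helper checkpoint_key is hypothetical, and the prefix is assumed from the get_parameter calls in this post; another model would use a different one.

```python
def checkpoint_key(graph_var_name, prefix="generator_model/"):
    """Map a graph variable name such as 'rnn/.../bias:0' to a checkpoint key."""
    # Drop the ':<output index>' suffix that tensor names carry.
    base = graph_var_name.rsplit(":", 1)[0]
    # Prepend the scope under which the model was originally saved (assumed).
    return prefix + base


tv = ['rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0',
      'rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0',
      'rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0',
      'rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0']

# Keep only the bias variables, which still need to be assigned by hand.
bias_keys = [checkpoint_key(name) for name in tv if name.endswith('/bias:0')]
for key in bias_keys:
    print(key)
```

Each printed key has the same form as the names passed to get_parameter for the bias arrays.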

Use get_variable to fetch the existing variables, then tf.assign to assign the bias values.

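    with tf.Session() as sess:
        init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
        sess.run(init)
        # Re-enter the 'rnn' scope with reuse enabled so that get_variable
        # returns the existing bias variables instead of creating new ones.
        with tf.variable_scope('rnn', reuse=tf.AUTO_REUSE):
            a = tf.get_variable('multi_rnn_cell/cell_0/lstm_cell/bias', shape=multi_rnn_cells[1].shape)
            b = tf.get_variable('multi_rnn_cell/cell_1/lstm_cell/bias', shape=multi_rnn_cells[3].shape)
            print(a, "variable")
            # tf.assign only builds the assignment ops; nothing is written yet.
            a = tf.assign(a, multi_rnn_cells[1])
            b = tf.assign(b, multi_rnn_cells[3])
        # Run the assignments after the initializer, otherwise the zeros remain.
        sess.run(a)
        sess.run(b)
        tvars = sess.run([tv, outputs])
    print(tv)
    print("output: ", tvars)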

The bias is now assigned as well. Note that the two assign ops a and b must be run first, before fetching the outputs.

Complete code

    import tensorflow as tf  # TensorFlow 1.x API

    # get_parameter, model_dir and reshape_1 (the input tensor) come from the previous post.
    # Order: [layer 0 kernel, layer 0 bias, layer 1 kernel, layer 1 bias]
    multi_rnn_cells = [get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/bias'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/bias')]
    print(multi_rnn_cells[0].shape, multi_rnn_cells[1].shape, multi_rnn_cells[2].shape, multi_rnn_cells[3].shape)
    # rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [rnn_size] * rnn_num_layers]
    # The initializer sets the kernel (weight) of each layer.
    rnn_layers = [tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[0], tf.float32)),
                  tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[2], tf.float32))]
    multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)
    outputs, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell, inputs=reshape_1, dtype=tf.float32)
    # Variable names, used below to fetch the current parameter values.
    tv = [v.name for v in tf.trainable_variables()]
    with tf.Session() as sess:
        init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
        sess.run(init)
        # Fetch the existing bias variables and build the assignment ops.
        with tf.variable_scope('rnn', reuse=tf.AUTO_REUSE):
            a = tf.get_variable('multi_rnn_cell/cell_0/lstm_cell/bias', shape=multi_rnn_cells[1].shape)
            b = tf.get_variable('multi_rnn_cell/cell_1/lstm_cell/bias', shape=multi_rnn_cells[3].shape)
            print(a, "variable")
            a = tf.assign(a, multi_rnn_cells[1])
            b = tf.assign(b, multi_rnn_cells[3])
        # Run the assignments before fetching the outputs.
        sess.run(a)
        sess.run(b)
        tvars = sess.run([tv, outputs])
    print(tv)
    print("output: ", tvars)

 

 
