Initializing LSTM Weight and Bias Parameters in TensorFlow


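Preface:

The previous post showed how to visualize each layer of a neural network. The simple approach is to take the trained parameters, use them as the network's initial values, and run a forward pass. For an LSTM, the official documentation shows that the only parameters the cell's initializer argument can set are the weights. For visualization, however, we usually need to pass in both the weights and the biases, and resuming training from a saved model likewise requires loading both w and b.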

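Initializing the LSTM weights

Initializing w is straightforward: pass the saved values in as a constant. The get_parameter function used here reads a model parameter into a NumPy array (see the previous post). Because this is a two-layer LSTM and each layer has its own parameters, the values are supplied separately for each layer, using tf.constant_initializer to set w.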

    # Saved parameters as NumPy arrays: [kernel_0, bias_0, kernel_1, bias_1]
    multi_rnn_cells = [get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/bias'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/bias')]
    # rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [rnn_size] * rnn_num_layers]
    # The initializer argument only sets the kernel (weights), not the bias.
    rnn_layers = [tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[0], tf.float32)),
                  tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[2], tf.float32))]
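A constant initializer only works if the saved array has exactly the shape the cell expects, so it can help to check shapes before building the graph. The sketch below assumes a TF 1.x LSTMCell without a projection layer, where the kernel has shape (input_dim + num_units, 4 * num_units) and the bias has shape (4 * num_units,). The helper name expected_lstm_shapes and the input size of 32 are made up for illustration.

```python
def expected_lstm_shapes(input_dim, num_units):
    """Return (kernel_shape, bias_shape) for a TF 1.x LSTMCell without projection."""
    # The kernel multiplies the concatenation [input, previous_output]
    # and produces the four gate pre-activations side by side.
    kernel_shape = (input_dim + num_units, 4 * num_units)
    bias_shape = (4 * num_units,)
    return kernel_shape, bias_shape


# Hypothetical example: 32 input features feeding two stacked 64-unit layers.
layer_0 = expected_lstm_shapes(input_dim=32, num_units=64)
# The second layer's input is the first layer's 64-dimensional output.
layer_1 = expected_lstm_shapes(input_dim=64, num_units=64)

print(layer_0)  # ((96, 256), (256,))
print(layer_1)  # ((128, 256), (256,))
```

Comparing these tuples against multi_rnn_cells[i].shape before constructing the cells should surface a mismatched checkpoint early.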

The following code shows whether the weight and bias values were set successfully:

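    init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    sess = tf.Session()
    sess.run(init)
    # tv holds variable names; sess.run resolves each name to its tensor value.
    tv = [v.name for v in tf.trainable_variables()]
    # outputs is the dynamic_rnn output tensor (see the complete code).
    outs = sess.run([tv, outputs])
    print("output: ", outs)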

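The output shows that the weights were assigned successfully. The biases, however, are initialized to all zeros in the LSTM and have not been assigned yet.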
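To see why a zero bias matters when reproducing a trained model's forward pass, here is a minimal NumPy sketch of one LSTM step. It assumes the TF 1.x LSTMCell layout as I understand it: a single kernel applied to [input, previous_output], gates split in the order i, j, f, o, and a forget_bias of 1.0 added to f. The sizes and random values are illustrative stand-ins, not the model from this post.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, c_prev, h_prev, kernel, bias, forget_bias=1.0):
    """One LSTM step, assuming the TF 1.x LSTMCell gate order (i, j, f, o)."""
    gates = np.concatenate([x, h_prev], axis=1) @ kernel + bias
    i, j, f, o = np.split(gates, 4, axis=1)
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    h = sigmoid(o) * np.tanh(c)
    return c, h


rng = np.random.default_rng(0)
batch, input_dim, num_units = 2, 3, 4
x = rng.normal(size=(batch, input_dim))
c0 = np.zeros((batch, num_units))
h0 = np.zeros((batch, num_units))
kernel = rng.normal(size=(input_dim + num_units, 4 * num_units))
trained_bias = rng.normal(size=(4 * num_units,))  # stands in for a trained bias
zero_bias = np.zeros_like(trained_bias)           # what the cell starts with

_, h_trained = lstm_step(x, c0, h0, kernel, trained_bias)
_, h_zero = lstm_step(x, c0, h0, kernel, zero_bias)

# The same kernel gives different outputs once the bias differs.
print(np.abs(h_trained - h_zero).max())
```

With identical weights, the two outputs still differ, which is why loading only the kernel is not enough for visualization or for resuming training.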

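Initializing the LSTM bias

The bias can be assigned with tf.assign.

First, list the names of the weight and bias variables in the current LSTM with the code below. The bias variables have names of the form 'rnn/multi_rnn_cell/cell_0/lstm_cell/bias'.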

    init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    sess = tf.Session()
    sess.run(init)
    tv = [v.name for v in tf.trainable_variables()]
    print("tv: ", tv)


Result:
    tv:  ['rnn/multi_rnn_cell/cell_0/lstm_cell/kernel:0',
          'rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0',
          'rnn/multi_rnn_cell/cell_1/lstm_cell/kernel:0',
          'rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0']
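The names printed above differ from the checkpoint names passed to get_parameter: the checkpoint names carry a 'generator_model/' prefix, and the graph names end in ':0'. A small sketch of that mapping follows; the helper checkpoint_to_graph_name is hypothetical, and the prefix is taken from the names used in this post.

```python
def checkpoint_to_graph_name(ckpt_name, prefix="generator_model/"):
    """Map a checkpoint variable name to the matching graph tensor name."""
    if ckpt_name.startswith(prefix):
        ckpt_name = ckpt_name[len(prefix):]
    # ':0' selects the first (and only) output of the variable op.
    return ckpt_name + ":0"


ckpt_names = [
    "generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel",
    "generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/bias",
    "generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel",
    "generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/bias",
]
graph_names = [checkpoint_to_graph_name(n) for n in ckpt_names]
bias_names = [n for n in graph_names if n.endswith("/bias:0")]

print(bias_names)
# ['rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0', 'rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0']
```

Filtering on the '/bias:0' suffix gives exactly the variables that still need to be assigned by hand.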

Retrieve the existing variables with get_variable, then assign the bias values with tf.assign.

    with tf.Session() as sess:
        init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
        sess.run(init)
        # reuse=tf.AUTO_REUSE returns the bias variables the LSTM already created.
        with tf.variable_scope('rnn', reuse=tf.AUTO_REUSE):
            a = tf.get_variable('multi_rnn_cell/cell_0/lstm_cell/bias', shape=multi_rnn_cells[1].shape)
            b = tf.get_variable('multi_rnn_cell/cell_1/lstm_cell/bias', shape=multi_rnn_cells[3].shape)
            print(a, "variable")
            # tf.assign only builds the op; nothing changes until it is run.
            a = tf.assign(a, multi_rnn_cells[1])
            b = tf.assign(b, multi_rnn_cells[3])
        sess.run(a)
        sess.run(b)
        # Fetch the variable values (by name) together with the RNN outputs.
        tvars = sess.run([tv, outputs])
    print(tv)
    print("output: ", tvars)

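The biases have now been assigned as well. Note that the two assign ops, a and b, must be run first; building them alone does not change the variables.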
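Rather than reading the printed arrays by eye, the fetched values can be compared against the saved ones. The sketch below is one possible check using NumPy only; report_mismatches is a hypothetical helper, and the small arrays stand in for the values that sess.run would return and the arrays that get_parameter loaded.

```python
import numpy as np


def report_mismatches(fetched, expected, atol=1e-6):
    """Return the names whose fetched value differs from the expected array."""
    mismatched = []
    for name, want in expected.items():
        got = fetched.get(name)
        same = (
            got is not None
            and np.shape(got) == np.shape(want)
            and np.allclose(got, want, atol=atol)
        )
        if not same:
            mismatched.append(name)
    return mismatched


# Stand-in data: the second bias is still all zeros, i.e. never assigned.
expected = {
    "rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0": np.full(8, 0.5),
    "rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0": np.full(8, -0.25),
}
fetched = {
    "rnn/multi_rnn_cell/cell_0/lstm_cell/bias:0": np.full(8, 0.5),
    "rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0": np.zeros(8),
}

print(report_mismatches(fetched, expected))
# ['rnn/multi_rnn_cell/cell_1/lstm_cell/bias:0']
```

An empty list would indicate that every listed variable matches its saved value within the tolerance.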

Complete code

    import tensorflow as tf  # TensorFlow 1.x API

    # get_parameter, model_dir and the input tensor reshape_1 are defined elsewhere (not shown here).
    # Saved parameters as NumPy arrays: [kernel_0, bias_0, kernel_1, bias_1]
    multi_rnn_cells = [get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_0/lstm_cell/bias'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/kernel'),
                       get_parameter(model_dir, 'generator_model/rnn/multi_rnn_cell/cell_1/lstm_cell/bias')]
    print(multi_rnn_cells[0].shape, multi_rnn_cells[1].shape, multi_rnn_cells[2].shape, multi_rnn_cells[3].shape)
    # rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [rnn_size] * rnn_num_layers]
    # The initializer argument sets only the kernel of each layer.
    rnn_layers = [tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[0], tf.float32)),
                  tf.nn.rnn_cell.LSTMCell(64, initializer=tf.constant_initializer(multi_rnn_cells[2], tf.float32))]
    multi_rnn_cell = tf.nn.rnn_cell.MultiRNNCell(rnn_layers)
    outputs, state = tf.nn.dynamic_rnn(cell=multi_rnn_cell, inputs=reshape_1, dtype=tf.float32)
    # Variable names; sess.run resolves each name to its tensor value.
    tv = [v.name for v in tf.trainable_variables()]
    with tf.Session() as sess:
        init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
        sess.run(init)
        # Retrieve the existing bias variables and build the assign ops.
        with tf.variable_scope('rnn', reuse=tf.AUTO_REUSE):
            a = tf.get_variable('multi_rnn_cell/cell_0/lstm_cell/bias', shape=multi_rnn_cells[1].shape)
            b = tf.get_variable('multi_rnn_cell/cell_1/lstm_cell/bias', shape=multi_rnn_cells[3].shape)
            print(a, "variable")
            a = tf.assign(a, multi_rnn_cells[1])
            b = tf.assign(b, multi_rnn_cells[3])
        # The assign ops must be run before the values take effect.
        sess.run(a)
        sess.run(b)
        tvars = sess.run([tv, outputs])
    print(tv)
    print("output: ", tvars)

 

 
