Bi-LSTM-CRF (2): TensorFlow source code walkthrough

Original

2018-10-06 09:27

CRF

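For sequence labeling, a CRF layer is usually placed on top of the LSTM output: a linear transformation maps the LSTM output to a tensor of shape [batch_size, max_seq_len, num_tags], and this tensor is then fed to the CRF layer as the unary potentials.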

# Concatenate the outputs of the forward and backward LSTMs
output_fw, output_bw = outputs
output = tf.concat([output_fw, output_bw], axis=-1)

# Projection matrix (trainable parameter)
W = tf.get_variable("W", [2 * num_units, num_tags])

# Linear transformation: flatten to 2-D, multiply, then restore the 3-D shape
matricized_output = tf.reshape(output, [-1, 2 * num_units])
matricized_unary_scores = tf.matmul(matricized_output, W)
unary_scores = tf.reshape(matricized_unary_scores, [batch_size, max_seq_len, num_tags])
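The reshape, multiply, reshape pattern can be easier to follow on a toy example. The sketch below uses plain Python lists instead of TensorFlow tensors; the function name `project_to_tag_scores` and the toy values are assumptions made for illustration only and are not part of the original code.

```python
def project_to_tag_scores(output, W):
    """Map [batch, time, hidden] activations to [batch, time, num_tags] scores."""
    batch_size = len(output)
    max_seq_len = len(output[0])
    num_tags = len(W[0])

    # Flatten batch and time into one axis: [batch * time, hidden].
    flat = [step for sequence in output for step in sequence]

    # Plain matrix product: [batch * time, hidden] x [hidden, num_tags].
    flat_scores = [
        [sum(h * W[k][t] for k, h in enumerate(step)) for t in range(num_tags)]
        for step in flat
    ]

    # Restore the batch and time axes: [batch, time, num_tags].
    return [
        flat_scores[b * max_seq_len:(b + 1) * max_seq_len]
        for b in range(batch_size)
    ]


# Toy input (hypothetical values): 2 sequences, 3 time steps, hidden size 2.
output = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]],
]
W = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]  # [hidden=2, num_tags=4]
unary_scores = project_to_tag_scores(output, W)
```

Every time step of every sequence is multiplied by the same W, which is why batch and time can be merged into a single axis before the product.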

2.1. Loss function

# Loss function: negative log-likelihood averaged over the batch
log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(unary_scores, tags, sequence_lengths)
loss = tf.reduce_mean(-log_likelihood)

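where
tags: a matrix of shape [batch_size, max_seq_len] holding the gold labels. Note that the labels are given as tag indices.
sequence_lengths: a vector of shape [batch_size] holding the length of each sequence.

log_likelihood: a vector of shape [batch_size]; each element is the log-likelihood of the corresponding sequence.
transition_params: the transition matrix, of shape [num_tags, num_tags]. Unlike the transition probability matrix of a traditional HMM, whose entries must be non-negative with each row summing to 1, the entries here can be any real number, positive or negative.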
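To make the quantities above concrete, here is a minimal pure-Python sketch of the log-likelihood of a single sequence under a linear-chain CRF: the score of the gold path minus the log partition function from the forward algorithm. It is an illustration under stated assumptions (no start or end transitions, `transitions[i][j]` scoring a move from tag i to tag j), not the library's implementation; all function names and values are hypothetical.

```python
import math


def log_sum_exp(values):
    # Numerically stable log(sum(exp(v) for v in values)).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(unary, transitions, tags):
    # Unary score at every step plus the transition score between neighbours.
    score = unary[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]] + unary[t][tags[t]]
    return score


def log_partition(unary, transitions):
    # Forward algorithm: alpha[j] is the log-sum of scores of all paths ending in tag j.
    num_tags = len(unary[0])
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            unary[t][j]
            + log_sum_exp([alpha[i] + transitions[i][j] for i in range(num_tags)])
            for j in range(num_tags)
        ]
    return log_sum_exp(alpha)


def crf_log_likelihood_single(unary, transitions, tags, length):
    # Drop padded steps before scoring, mirroring the role of sequence_lengths.
    unary, tags = unary[:length], tags[:length]
    return sequence_score(unary, transitions, tags) - log_partition(unary, transitions)


# Toy example (hypothetical values): 2 tags, true length 2, third step is padding.
unary = [[1.0, 0.5], [0.2, 1.5], [0.0, 0.0]]
transitions = [[0.5, -1.0], [-0.3, 0.8]]  # real-valued, rows need not sum to 1
gold = [0, 1, 0]
ll = crf_log_likelihood_single(unary, transitions, gold, length=2)
```

Because the partition function sums over every possible tag sequence, the log-likelihood is never positive, and the probabilities of all tag sequences of the same length sum to 1 even though the transition scores themselves are unnormalized.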

2.2. Decoding

decode_tags, best_score = tf.contrib.crf.crf_decode(unary_scores, transition_params, sequence_lengths)

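where
decode_tags: a matrix of shape [batch_size, max_seq_len] containing the highest-scoring tag sequence for each input.
best_score: a vector of shape [batch_size] containing the score of each of those sequences.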
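Finding the highest-scoring tag sequence is typically done with the Viterbi algorithm. The following pure-Python sketch decodes a single, unpadded sequence. It assumes the same scoring convention as the unary scores and transition matrix described earlier (`transitions[i][j]` scores a move from tag i to tag j) and is meant as an illustration rather than the library's code; the function name and toy values are hypothetical.

```python
def viterbi_decode(unary, transitions):
    """Return (best tag sequence, its score) for one unpadded sequence."""
    num_tags = len(unary[0])
    score = list(unary[0])  # best score of any path ending in each tag
    backpointers = []

    for t in range(1, len(unary)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for a path that ends in tag j at step t.
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j] + unary[t][j])
        score = new_score
        backpointers.append(pointers)

    # Pick the best final tag, then follow the backpointers to the start.
    last = max(range(num_tags), key=lambda j: score[j])
    best_score = score[last]
    tags = [last]
    for pointers in reversed(backpointers):
        tags.append(pointers[tags[-1]])
    tags.reverse()
    return tags, best_score


# Toy example (hypothetical values): 3 time steps, 2 tags.
unary = [[1.0, 0.5], [0.2, 1.5], [0.3, 0.1]]
transitions = [[0.5, -1.0], [-0.3, 0.8]]
decode_tags, best_score = viterbi_decode(unary, transitions)
```

The dynamic program keeps only the best path into each tag at every step, so the cost grows linearly with sequence length instead of exponentially as it would with brute-force enumeration.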
