Implementing a Conditional Random Field (CRF) with TensorFlow 1.4.0

Original

2018-08-22 04:43

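I searched online for a long time for a good way to implement a CRF in TensorFlow and found nothing that fit well. The approaches that get the most attention are hand-written CRF implementations, which are fairly complex. While reading through the TensorFlow documentation, I happened to notice that TensorFlow 1.4.0 already ships a CRF implementation (tf.contrib.crf), and I found the official example for it. It is simple to use, so I am sharing it here.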

import numpy as np
import tensorflow as tf

# Parameter settings
num_examples = 10
num_words = 20
num_features = 100
num_tags = 5

# Build random features
x = np.random.rand(num_examples, num_words, num_features).astype(np.float32)

# Build random tags
y = np.random.randint(
    num_tags, size=[num_examples, num_words]).astype(np.int32)

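# Sequence-length vector (each example may contain a different number of words).
# Here every length is set to num_words - 1; in practice, use each example's real length.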
sequence_lengths = np.full(num_examples, num_words - 1, dtype=np.int32)

# Train and evaluate the model
with tf.Graph().as_default():
    with tf.Session() as session:
        x_t = tf.constant(x)
        y_t = tf.constant(y)
        sequence_lengths_t = tf.constant(sequence_lengths)

        # Define a linear layer with no bias to compute the unary scores
        weights = tf.get_variable("weights", [num_features, num_tags])
        matricized_x_t = tf.reshape(x_t, [-1, num_features])
        matricized_unary_scores = tf.matmul(matricized_x_t, weights)
        unary_scores = tf.reshape(matricized_unary_scores,
                                  [num_examples, num_words, num_tags])

        # Compute the log-likelihood and obtain transition_params
        log_likelihood, transition_params = tf.contrib.crf.crf_log_likelihood(
            unary_scores, y_t, sequence_lengths_t)

        # Decode with the Viterbi algorithm to get the best tag sequence
        # (viterbi_sequence) and its score (viterbi_score)
        viterbi_sequence, viterbi_score = tf.contrib.crf.crf_decode(
            unary_scores, transition_params, sequence_lengths_t)

        loss = tf.reduce_mean(-log_likelihood)
        train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

        session.run(tf.global_variables_initializer())

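        # Build a boolean array of shape [num_examples, num_words] that is True only
        # for positions inside each sequence's length, so padded positions are cut off.
        # np.arange() creates the position indices; np.expand_dims() adds an axis
        # so the comparison broadcasts.
        mask = (np.expand_dims(np.arange(num_words), axis=0) <
                np.expand_dims(sequence_lengths, axis=1))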

        # Sum sequence_lengths over all examples to get the total number of labels
        total_labels = np.sum(sequence_lengths)

        # Train
        for i in range(1000):
            tf_viterbi_sequence, _ = session.run([viterbi_sequence, train_op])
            if i % 100 == 0:
                correct_labels = np.sum((y == tf_viterbi_sequence) * mask)
                accuracy = 100.0 * correct_labels / float(total_labels)
                print("Accuracy: %.2f%%" % accuracy)
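The masking step used in the accuracy calculation can be checked in isolation with NumPy alone. The sketch below uses small made-up arrays (`lengths`, `y_true`, and `y_pred` are assumptions for illustration, not values from the example above) to show how comparing a row of position indices against a column of sequence lengths broadcasts into the boolean mask, and how that mask restricts the count of correct labels to real tokens.

```python
import numpy as np

# Toy values, assumed for illustration only.
num_words = 4
lengths = np.array([4, 2, 3], dtype=np.int32)

# positions has shape [1, num_words]; the expanded lengths have shape
# [num_examples, 1]. Broadcasting the comparison gives a
# [num_examples, num_words] boolean mask.
positions = np.expand_dims(np.arange(num_words), axis=0)
mask = positions < np.expand_dims(lengths, axis=1)

y_true = np.array([[0, 1, 2, 3],
                   [1, 1, 0, 0],
                   [2, 0, 2, 1]])
y_pred = np.array([[0, 1, 0, 3],
                   [1, 0, 0, 0],
                   [2, 0, 2, 1]])

# Only positions inside each sequence count towards the accuracy;
# matches at padded positions are zeroed out by the mask.
correct = np.sum((y_true == y_pred) * mask)
total = np.sum(lengths)
accuracy = 100.0 * correct / float(total)
print("Accuracy: %.2f%%" % accuracy)
```

Without the mask, the matching padded positions in the second and third rows would be counted as correct while the denominator still only counts real tokens, which would inflate the reported accuracy.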

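To make the decoding step less of a black box, here is a minimal NumPy sketch of Viterbi decoding for a single sequence. It is not the TensorFlow implementation; the function name `viterbi_decode_np` and the toy scores are assumptions for illustration. It takes unary scores of shape `[seq_len, num_tags]` and a transition matrix of shape `[num_tags, num_tags]`, and it assumes entry `[i, j]` scores a move from tag `i` to tag `j`, which as far as I can tell is the same convention `transition_params` follows.

```python
import numpy as np


def viterbi_decode_np(unary, transitions):
    """Return the highest-scoring tag path and its score for one sequence."""
    seq_len, num_tags = unary.shape
    # trellis[t, j]: best score of any path that ends in tag j at position t.
    trellis = np.zeros((seq_len, num_tags))
    backpointers = np.zeros((seq_len, num_tags), dtype=np.int64)
    trellis[0] = unary[0]
    for t in range(1, seq_len):
        # candidates[i, j]: best score ending in tag i at t-1, then moving i -> j.
        candidates = trellis[t - 1][:, None] + transitions
        backpointers[t] = np.argmax(candidates, axis=0)
        trellis[t] = unary[t] + np.max(candidates, axis=0)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [int(np.argmax(trellis[-1]))]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(backpointers[t, path[-1]]))
    path.reverse()
    return path, float(np.max(trellis[-1]))


# Toy scores, assumed for illustration: 3 positions, 2 tags.
unary = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 0.0]])
# A strong penalty on switching tags between neighbouring positions.
transitions = np.array([[0.0, -5.0],
                        [-5.0, 0.0]])

path, score = viterbi_decode_np(unary, transitions)
print(path, score)
```

With the switching penalty, the decoder prefers to stay on tag 0 throughout even though position 1 scores higher for tag 1 on its own. Setting `transitions` to all zeros makes it fall back to the per-position argmax, which is the difference a CRF layer makes over independent per-token classification.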
