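Preface

The code can be downloaded from GitHub.
Hidden Markov models (HMMs) are widely used for tagging problems in natural language processing and other fields.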

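A hidden Markov model is determined by the initial state probability vector $π$, the state transition probability matrix $A$, and the observation probability matrix $B$. $π$ and $A$ determine the state sequence, and $B$ determines the observation sequence.
$λ = (A, B, π)$
$A = {[a_{i j}]}_{N \times N}$
where $a_{i j} = P (i_{t + 1} = q_{j} | i_{t} = q_{i})$ is the probability of moving from state $q_{i}$ at time t to state $q_{j}$ at time t+1.
$B = {[b_{j} (k)]}_{N \times M}$
where $b_{j} (k) = P (o_{t} = v_{k} | i_{t} = q_{j})$ is the probability that state $q_{j}$ emits observation $v_{k}$ at time t.
$π = (π_{i})$
where $π_{i} = P (i_{1} = q_{i})$ is the probability of being in state $q_{i}$ at time t=1.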
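To make the notation concrete, here is a minimal sketch of how such a model could be written down with NumPy. The numbers are illustrative values that I am assuming for the example (three hidden states, two observation symbols), and the helper name is_valid_hmm is hypothetical. PI is stored as a 1 x N row so that it can be indexed as PI[0][i], which is the layout the functions later in this post expect.

```python
import numpy as np

# Illustrative parameters, assumed for this sketch only.
Q = [1, 2, 3]                      # possible hidden states
V = ['red', 'white']               # possible observation symbols
A = np.array([[0.5, 0.2, 0.3],     # A[i][j] = P(state j at t+1 | state i at t)
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.5],          # B[j][k] = P(observation V[k] | state j)
              [0.4, 0.6],
              [0.7, 0.3]])
PI = np.array([[0.2, 0.4, 0.4]])   # PI[0][i] = P(state i at t=1)


def is_valid_hmm(A, B, PI):
    """Return True if every row of A, B and PI is a probability distribution."""
    for m in (np.asarray(A), np.asarray(B), np.asarray(PI)):
        if (m < 0).any() or not np.allclose(m.sum(axis=1), 1.0):
            return False
    return True


valid = is_valid_hmm(A, B, PI)
print(valid)
```

Checking that each row sums to 1 is a cheap way to catch a transposed or mistyped matrix before running any of the algorithms.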

A hidden Markov model rests on two basic assumptions:

(1) The homogeneous Markov assumption
(2) The observation independence assumption
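
Written out in the notation above, the two assumptions can be stated roughly as follows. This is my own restatement under the usual textbook convention, with T denoting the length of the sequence. The first says that the state at time t depends only on the state at time t-1, and the second says that the observation at time t depends only on the state at time t:
$P (i_{t} | i_{t - 1}, o_{t - 1}, \ldots, i_{1}, o_{1}) = P (i_{t} | i_{t - 1})$
$P (o_{t} | i_{T}, o_{T}, \ldots, i_{t + 1}, o_{t + 1}, i_{t}, i_{t - 1}, o_{t - 1}, \ldots, i_{1}, o_{1}) = P (o_{t} | i_{t})$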

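With these definitions in place, we can turn to the three basic problems of hidden Markov models: (1) probability computation, (2) learning, and (3) prediction.
This post implements the forward algorithm, the backward algorithm, and the Viterbi algorithm. It focuses on the forward algorithm: once that is understood, the backward and Viterbi algorithms follow easily.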

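Probability computation algorithms

The usual ways to compute the probability are (1) direct computation, (2) the forward algorithm, and (3) the backward algorithm.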
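
Direct computation is not implemented in this post, but a brute-force sketch shows what it means: sum the joint probability over every possible state sequence. The function name direct_probability, the integer encoding of observations (column indices of B), and the example numbers are all assumptions made for this illustration. The cost grows as N^T, which is why the forward and backward algorithms are preferred in practice.

```python
import itertools

import numpy as np


def direct_probability(A, B, pi, obs):
    """Compute P(O | lambda) by enumerating all N**T state sequences."""
    A, B, pi = np.asarray(A), np.asarray(B), np.asarray(pi)
    N, T = A.shape[0], len(obs)
    total = 0.0
    for states in itertools.product(range(N), repeat=T):
        # Probability of starting in states[0] and emitting obs[0].
        p = pi[states[0]] * B[states[0], obs[0]]
        # Multiply in each transition and emission along the sequence.
        for t in range(1, T):
            p *= A[states[t - 1], states[t]] * B[states[t], obs[t]]
        total += p
    return total


# Illustrative model: 3 states, 2 symbols (0 = red, 1 = white).
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
p_direct = direct_probability(A, B, pi, [0, 1, 0])
print(p_direct)
```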

Forward algorithm implementation

import numpy as np

def forward(self, Q, V, A, B, O, PI):  # forward algorithm
    N = len(Q)  # number of possible states
    M = len(O)  # length of the observation sequence
    alphas = np.zeros((N, M))  # alpha values, one column per time step
    T = M  # one time step per observation
    for t in range(T):  # compute the alpha values for each time step
        indexOfO = V.index(O[t])  # index of the current observation in V
        for i in range(N):
            if t == 0:  # initial values
                alphas[i][t] = PI[0][i] * B[i][indexOfO]  # P176 (10.15)
                print('alpha1(%d)=p%db%db(o1)=%f' % (i, i, i, alphas[i][t]))
            else:
                alphas[i][t] = np.dot([alpha[t - 1] for alpha in alphas], [a[i] for a in A]) * B[i][
                    indexOfO]  # P176 (10.16)
                print('alpha%d(%d)=[sigma alpha%d(i)ai%d]b%d(o%d)=%f' % (t, i, t - 1, i, i, t, alphas[i][t]))
                # print(alphas)
    P = np.sum([alpha[M - 1] for alpha in alphas])  # P176 (10.17)
    # alpha11 = pi[0][0] * B[0][0]    # stands for alpha_1(1)
    # alpha12 = pi[0][1] * B[1][0]    # stands for alpha_1(2)
    # alpha13 = pi[0][2] * B[2][0]    # stands for alpha_1(3)
    return P  # P(O|lambda)

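First we need a matrix to hold the alpha values: alphas = np.zeros((N, M)). This defines an $N \times M$ matrix in which each column holds the alpha values for one time step, so there are M columns (one for each of the T time steps). Note that M in the code is the length of the observation sequence, not the number of distinct observation symbols as in the definition of B.
Following Algorithm 10.2 on P175 of the book, we iterate over every time step with for t in range(T):, and at each time step we compute a value for each of the N states with for i in range(N):.
When t=0 (array indices start at 0 in most programming languages, so 0 here stands for time t=1), we compute the initial values, P176 (10.15):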

if t == 0:  # initial values
    alphas[i][t] = PI[0][i] * B[i][indexOfO]

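Then comes the recursion:
alphas[i][t] = np.dot([alpha[t-1] for alpha in alphas], [a[i] for a in A]) * B[i][indexOfO]  # P176 (10.16)
Once the loop finishes we have the full alpha matrix, which is the main result of the algorithm.
Some readers may find this line hard to map onto the formula. When you see $\sum_{j = 1}^{N} α_{t} (j) a_{j i}$, the first instinct is to write a for loop. Instead, I extract two vectors and take their inner product with np.dot(). This is also the approach Andrew Ng recommends, because vectorized operations are faster than an explicit for loop.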
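
To see that the inner product really is the same sum, here is a small sketch that computes one term both ways. The values in alpha_prev and A are illustrative numbers assumed for this example, and the variable names are hypothetical.

```python
import numpy as np

alpha_prev = np.array([0.1, 0.16, 0.28])   # assumed alpha_t(j) for j = 1..3
A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
i = 0  # target state (0-based)

# Explicit loop: sum over j of alpha_t(j) * a_ji.
loop_sum = 0.0
for j in range(len(alpha_prev)):
    loop_sum += alpha_prev[j] * A[j][i]

# Inner product of alpha_t with column i of A gives the same sum.
dot_sum = np.dot(alpha_prev, A[:, i])

print(loop_sum, dot_sum)
```

Taking column i of A (A[:, i] here, or [a[i] for a in A] when A is a plain list) is what lines the index j up with the entries of alpha.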
Finally, the book's formula gives the probability P:

P = np.sum([alpha[M - 1] for alpha in alphas])  # P176(10.17)

That completes the implementation of the forward algorithm.
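
For comparison, the same recursion can be vectorized one step further so that a whole column of alphas is filled at once. This is a sketch under a few assumptions of my own, not the code from the repository: observations are already integer indices into the columns of B, pi is a flat length-N vector, and the name forward_prob and the example numbers are made up for the illustration.

```python
import numpy as np


def forward_prob(A, B, pi, obs):
    """Return (P(O | lambda), alphas) using the forward recursion."""
    A, B, pi = (np.asarray(x, dtype=float) for x in (A, B, pi))
    N, T = A.shape[0], len(obs)
    alphas = np.zeros((N, T))
    # Initial values: alpha_1(i) = pi_i * b_i(o_1).
    alphas[:, 0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # (alphas[:, t-1] @ A)[i] = sum_j alpha_{t-1}(j) * a_ji
        alphas[:, t] = (alphas[:, t - 1] @ A) * B[:, obs[t]]
    # Termination: sum the last column.
    return alphas[:, -1].sum(), alphas


# Illustrative model: 3 states, 2 symbols (0 = red, 1 = white).
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
p_forward, alphas = forward_prob(A, B, pi, [0, 1, 0])
print(p_forward)
```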

Backward algorithm implementation

The backward algorithm follows the same pattern as the forward algorithm.

def backward(self, Q, V, A, B, O, PI):  # backward algorithm
    N = len(Q)  # number of possible states
    M = len(O)  # length of the observation sequence
    betas = np.ones((N, M))  # beta values; the last column stays 1 (initial values)
    for i in range(N):
        print('beta%d(%d)=1' % (M, i + 1))
    for t in range(M - 2, -1, -1):
        indexOfO = V.index(O[t + 1])  # index of the next observation in V
        for i in range(N):
            betas[i][t] = np.dot(np.multiply(A[i], [b[indexOfO] for b in B]), [beta[t + 1] for beta in betas])
            realT = t + 1
            realI = i + 1
            print('beta%d(%d)=[sigma a%djbj(o%d)]beta%d(j)=(' % (realT, realI, realI, realT + 1, realT + 1),
                  end='')
            for j in range(N):
                print("%.2f*%.2f*%.2f+" % (A[i][j], B[j][indexOfO], betas[j][t + 1]), end='')
            print("0)=%.3f" % betas[i][t])
    # print(betas)
    indexOfO = V.index(O[0])
    # Use the row PI[0] so that P is a scalar rather than a 1-element array
    P = np.dot(np.multiply(PI[0], [b[indexOfO] for b in B]), [beta[0] for beta in betas])
    print("P(O|lambda)=", end="")
    for i in range(N):
        print("%.1f*%.1f*%.5f+" % (PI[0][i], B[i][indexOfO], betas[i][0]), end="")
    print("0=%f" % P)
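
A useful sanity check is that the backward recursion yields the same probability as the forward one. The sketch below is a compact vectorized variant written for this check, not the repository code. It assumes integer-encoded observations and a flat pi vector, and backward_prob and the example numbers are hypothetical.

```python
import numpy as np


def backward_prob(A, B, pi, obs):
    """Return (P(O | lambda), betas) using the backward recursion."""
    A, B, pi = (np.asarray(x, dtype=float) for x in (A, B, pi))
    N, T = A.shape[0], len(obs)
    betas = np.ones((N, T))  # beta_T(i) = 1 for every state
    for t in range(T - 2, -1, -1):
        # beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        betas[:, t] = A @ (B[:, obs[t + 1]] * betas[:, t + 1])
    # Termination: sum_i pi_i * b_i(o_1) * beta_1(i).
    prob = np.sum(pi * B[:, obs[0]] * betas[:, 0])
    return prob, betas


# Illustrative model: 3 states, 2 symbols (0 = red, 1 = white).
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
p_backward, betas = backward_prob(A, B, pi, [0, 1, 0])
print(p_backward)
```

For these numbers the result should agree with the forward probability up to floating-point error.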

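Prediction algorithm

Likewise, once you can implement the forward algorithm, the Viterbi algorithm follows the same template. I will not go through it in detail and simply give the code, which predicts the most likely state sequence.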

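def viterbi(self, Q, V, A, B, O, PI):
    N = len(Q)  # number of possible states
    M = len(O)  # length of the observation sequence
    deltas = np.zeros((N, M))  # best path probability ending in state i at time t
    psis = np.zeros((N, M))  # back-pointers to the best previous state
    I = np.zeros((1, M))  # decoded state sequence (0-based)
    for t in range(M):
        realT = t+1
        indexOfO = V.index(O[t])  # index of the current observation in V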
        for i in range(N):
            realI = i+1
            if t == 0:
                deltas[i][t] = PI[0][i] * B[i][indexOfO]
                psis[i][t] = 0
                print('delta1(%d)=pi%d * b%d(o1)=%.2f * %.2f=%.2f'%(realI, realI, realI, PI[0][i], B[i][indexOfO], deltas[i][t]))
                print('psis1(%d)=0' % (realI))
            else:
                deltas[i][t] = np.max(np.multiply([delta[t-1] for delta in deltas], [a[i] for a in A])) * B[i][indexOfO]
                print('delta%d(%d)=max[delta%d(j)aj%d]b%d(o%d)=%.2f*%.2f=%.5f'%(realT, realI, realT-1, realI, realI, realT, np.max(np.multiply([delta[t-1] for delta in deltas], [a[i] for a in A])), B[i][indexOfO], deltas[i][t]))
                psis[i][t] = np.argmax(np.multiply([delta[t-1] for delta in deltas], [a[i] for a in A]))
                print('psis%d(%d)=argmax[delta%d(j)aj%d]=%d' % (realT, realI, realT-1, realI, psis[i][t]))
    print(deltas)
    print(psis)
    I[0][M-1] = np.argmax([delta[M-1] for delta in deltas])
    print('i%d=argmax[deltaT(i)]=%d' % (M, I[0][M-1]+1))
    for t in range(M-2, -1, -1):
        I[0][t] = psis[int(I[0][t+1])][t+1]
        print('i%d=psis%d(i%d)=%d' % (t+1, t+2, t+2, I[0][t]+1))
    print(I)
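
As a compact cross-check of the method above, here is a vectorized Viterbi sketch. It is written under my own assumptions (integer-encoded observations, a flat pi vector, 0-based state indices in the returned path), and viterbi_path and the example numbers are hypothetical rather than taken from the repository.

```python
import numpy as np


def viterbi_path(A, B, pi, obs):
    """Return (most likely 0-based state path, probability of that path)."""
    A, B, pi = (np.asarray(x, dtype=float) for x in (A, B, pi))
    N, T = A.shape[0], len(obs)
    deltas = np.zeros((N, T))
    psis = np.zeros((N, T), dtype=int)
    deltas[:, 0] = pi * B[:, obs[0]]
    for t in range(1, T):
        # scores[j, i] = delta_{t-1}(j) * a_ji
        scores = deltas[:, t - 1][:, None] * A
        psis[:, t] = scores.argmax(axis=0)
        deltas[:, t] = scores.max(axis=0) * B[:, obs[t]]
    # Backtrack from the best final state.
    path = np.zeros(T, dtype=int)
    path[-1] = deltas[:, -1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psis[path[t + 1], t + 1]
    return path, deltas[:, -1].max()


# Illustrative model: 3 states, 2 symbols (0 = red, 1 = white).
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
pi = [0.2, 0.4, 0.4]
best_path, best_prob = viterbi_path(A, B, pi, [0, 1, 0])
print(best_path + 1, best_prob)  # add 1 to show 1-based state numbers
```

Storing psis as an integer array avoids the int(...) cast that is needed when back-pointers are kept in a float matrix.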
