[Speech Recognition] DNN-based recognition in Julius (v4.4), a Japanese speech recognition system (May 8: updated with recognition results)

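There is very little material on Julius available in China, so here is a supplement. Julius was last updated in September 2016, when DNN-based recognition was added, but in actual use I found that many prerequisites are not stated on the homepage. Below is a translation of 00readme-DNN. The English written by the Japanese authors has many grammar problems, so the original text is included alongside.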

A. Julius and DNN-HMM
======================


From 4.4, Julius can perform DNN-HMM based recognition in two ways:


  1. standalone: directly compute DNN for HMM inside Julius (>= 4.4)     // 1. Standalone: compute the DNN for the HMM directly inside Julius (version >= 4.4) [this post only covers this part]


  2. network: receive state probabilities calculated by other process    
     via socket (<= 4.3.1)


Both are described below.


 A.1. Standalone mode
 =====================

From version 4.4, Julius is capable of performing DNN-HMM based recognition by itself.  It can read a DNN definition along with a HMM, and can compute the network against input (spliced) feature vectors and output the node scores of output layer for each frame, which will be used as output probabilities of corresponding HMM states in the HMM.  All computation will be done in a single process.

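// From version 4.4, Julius can perform DNN-HMM based recognition by itself. It reads a DNN definition together with an HMM, computes the network on the input (spliced) feature vectors, and outputs the output-layer node scores for each frame. These scores are used as the output probabilities of the corresponding HMM states. All computation is done in a single process.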



Note that the current implementation is very simple and limited.  Only basic functions are implemented for NN.  Any number of hidden layers can be defined, but the number of the nodes in the hidden layers should be the same.  No batch computation is performed: all frame-wise.  SIMD instruction (Intel AVX) is used to speed up the computation.  Only tested on Windows and Ubuntu on Intel PC.  See "libsent/src/phmm/calc_dnn.c" for the actual implementation.


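// Note that the current implementation is very simple and limited. Only basic NN functions are implemented. Any number of hidden layers can be defined, but all hidden layers should have the same number of nodes. There is no batch computation: everything is computed frame by frame. SIMD instructions (Intel AVX) are used to speed up the computation. It has only been tested on Windows and Ubuntu on Intel PCs. See "libsent/src/phmm/calc_dnn.c" for the actual implementation.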
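To make "frame-wise, no batch" concrete, here is a minimal pure-Python sketch of a feed-forward pass that processes one frame at a time. It is not the code in calc_dnn.c: the sigmoid hidden activation, the log-softmax output, and all function names are assumptions chosen for illustration only.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function (assumed hidden activation)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(weights, bias, x):
    """Compute W x + b, where weights holds one row per output node."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def forward_frame(layers, x):
    """Run one (spliced) feature vector through the network.

    layers is a list of (weights, bias) pairs; the last pair is the
    output layer. Returns one log-probability per output node.
    """
    for weights, bias in layers[:-1]:
        x = [sigmoid(v) for v in affine(weights, bias, x)]
    weights, bias = layers[-1]
    z = affine(weights, bias, x)
    # Log-softmax, shifted by the maximum for numerical stability.
    m = max(z)
    log_sum = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - log_sum for v in z]


def forward_utterance(layers, frames):
    """Frame-wise processing: each frame is computed independently."""
    return [forward_frame(layers, frame) for frame in frames]
```

Each returned list has one entry per output node, which is the per-frame score vector that the readme says is used as the HMM state output probabilities.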


To run, you need // you need the following


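 1) an HMM AM (GMM defs are ignored, only its structure is used)       // an HMM acoustic model (its GMM definitions are ignored; only its structure is used)
 2) a DNN definition that corresponds to 1) // a DNN definition that corresponds to 1)
 3) ".dnnconf" configuration file (text) // a ".dnnconf" configuration file (plain text)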


The .dnnconf file specifies the parameters, options, DNN definition files, and other parameters all relating to DNN computation. A sample file is located in the top directory of Julius archive as "Sample.dnnconf".

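// The ".dnnconf" file specifies the parameters, options, DNN definition files, and other settings related to DNN computation. A sample is provided as "Sample.dnnconf" in the top directory of the Julius archive.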


The matrix/vector definitions should be given in ".npy" format (i.e. python's "NumPy.save" format).  Only 32bit-float little endian datatype is acceptable.


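// The matrix and vector definitions should be given in ".npy" format (that is, the format written by python's "NumPy.save"). Only the 32-bit float, little-endian data type is accepted.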
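Since a wrong data type is easy to miss, it can help to inspect a ".npy" file before pointing Julius at it. The sketch below reads only the ".npy" header using the Python standard library and checks whether the stored dtype is little-endian 32-bit float ("<f4"). The function names are my own, and the check covers only the dtype, not any other requirement Julius may have.

```python
import ast
import struct

NPY_MAGIC = b"\x93NUMPY"


def read_npy_header(path):
    """Return the header dict (descr, fortran_order, shape) of a .npy file."""
    with open(path, "rb") as f:
        if f.read(6) != NPY_MAGIC:
            raise ValueError("not a .npy file: " + path)
        major, _minor = f.read(2)
        # Format version 1.x stores the header length as a 2-byte
        # little-endian integer; later versions use 4 bytes.
        if major == 1:
            (header_len,) = struct.unpack("<H", f.read(2))
        else:
            (header_len,) = struct.unpack("<I", f.read(4))
        header = f.read(header_len).decode("latin1")
    # The header is a Python dict literal padded with spaces.
    return ast.literal_eval(header.strip())


def is_float32_little_endian(path):
    """True if the array is stored as little-endian 32-bit float."""
    return read_npy_header(path)["descr"] == "<f4"


# With NumPy installed, a matrix is typically written in this form with:
#   numpy.save("W1.npy", array.astype("<f4"))
# ("W1.npy" is a hypothetical file name.)
```

If `is_float32_little_endian` returns False for a file (for example because the array was saved as float64), converting with `astype("<f4")` before saving again is the usual fix.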


To prepare a model for DNN-HMM, note that the orders are important.  The order of the output nodes in the DNN should be the order of HMM state definition id.  If not, Julius won't work properly.

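// The order matters. The order of the DNN output nodes must match the order of the HMM state definition ids. Otherwise Julius will not work properly.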
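If the training toolkit emits the output nodes in a different order, the output layer has to be permuted before it is saved. The following is a minimal sketch of that permutation, assuming the output weights are stored as one row per output node and that you already have both orderings as lists of state names. The function name and the data layout are hypothetical, not something Julius prescribes.

```python
def reorder_output_layer(weights, bias, dnn_order, hmm_order):
    """Permute output-layer rows so that row i belongs to hmm_order[i].

    weights:   list of rows, one row per output node (assumed layout)
    bias:      list with one value per output node
    dnn_order: state names in the order the DNN currently emits them
    hmm_order: state names in HMM state definition id order
    """
    if len(set(dnn_order)) != len(dnn_order):
        raise ValueError("duplicate state name in dnn_order")
    if sorted(dnn_order) != sorted(hmm_order):
        raise ValueError("DNN and HMM state sets differ")
    if not (len(weights) == len(bias) == len(dnn_order)):
        raise ValueError("weights, bias and dnn_order must have equal length")
    index_of = {name: i for i, name in enumerate(dnn_order)}
    perm = [index_of[name] for name in hmm_order]
    return [weights[i] for i in perm], [bias[i] for i in perm]
```

The explicit checks are deliberate: a silent mismatch between the two state sets would produce a model that loads but recognizes badly, which is much harder to debug than an exception at conversion time.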


Julius uses SIMD instruction for internal DNN computation. For Intel CPU, dispatch function for several Intel SIMD instruction sets (SSE, AVX and FMA) are implemented. You need gcc-4.7 or later to compile all the codes.  They are all compiled and built-in into Julius, and will be determined which one to use at run time.  Run "julius -setting" and see which code will be used on your cpu.  AVX can be run on Sandy Bridge, and FMA on Haswell, later one will run faster.  And for ARM architecture, you can enable NEON SIMD codes by adding "--enable-neon" to configure.

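// Julius uses SIMD instructions for its internal DNN computation. For Intel CPUs, dispatch functions for several SIMD instruction sets (SSE, AVX and FMA) are implemented. You need gcc-4.7 or later to compile all of the code. All variants are compiled and built into Julius, and the one to use is chosen at run time. Run "julius -setting" to see which code will be used on your CPU. AVX runs on Sandy Bridge and FMA on Haswell; the later one runs faster. For the ARM architecture, you can enable the NEON SIMD code by adding "--enable-neon" to configure.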


--------------------------------

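My own impression is that this update adds a fairly limited set of features. I hit an error when trying it and could not find the cause; only after reading these documents carefully did I discover how many restrictions there are. Please take note.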

--------------------------------

Update, May 8:

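[Important] When using the DNN-HMM acoustic model (.SID) with version 4.4, note that the old version (4.3) did not have the following file:

julius.dnnconf: feature conversion configuration file for the DNN (Julius standalone) version

With 4.4, make sure to use it; otherwise Julius will keep reporting that your input features are wrong.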


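The run on a 32-bit server has finished: roughly 20,000 utterances took about 35 hours. Compared with the results from version 4.3, the output does differ, and in the few utterances I picked out the recognition results look somewhat better. I will post the recognition rate once it has been computed.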
