kaldi語音識別

1.timit實例

TIMIT全稱The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus, 是由德州儀器(TI)、麻省理工學院(MIT)和坦福研究院(SRI)合作構建的聲學-音素連續語音語料庫。TIMIT數據集的語音採樣頻率爲16kHz,一共包含6300個句子,由來自美國八個主要方言地區的630個人每人說出給定的10個句子,所有的句子都在音素級別(phone level)上進行了手動分割,標記。
TIMIT語料庫多年來已經成爲語音識別社區的一個標準數據庫,在今天仍被廣爲使用。其原因主要有兩個方面:數據集中的每一個句子都在音素級別上進行了手動標記,同時提供了說話人的編號,性別,方言種類等多種信息;數據集相對來說比較小,可以在較短的時間內完成整個實驗;同時又足以展現系統的性能。

1.1 timit數據集下載

本例子的測試需要用到timit提供的原始數據集,但是在kaldi裏邊並未提供該測試數據,我們還需要自己手動下載,其下載連接如下:
Timit數據集下載網址:戳這裏!!!
下載完成之後要將其放到如下目錄
在這裏插入圖片描述
例如此處我的路徑爲

/home/king/kaldi-test/egs/timit/s5/data/TIMIT

如果文件無法找到,大家也可以下載我上傳到csdn的全部語音數據,文件大小爲500多M,由於文件大小限制,我分爲了三部分,這裏我將文件下載鏈接放到下方:

1.TIMIT語音資料庫Part1
2.TIMIT語音資料庫Part2–TRAIN第一部分
3.TIMIT語音資料庫Part3–TRAIN第二部分

1.2修改run.sh

進入egs/timit/s5文件夾路徑下,編輯run.sh

vi run.sh

將裏邊的timit指向的內容指定爲我們剛剛存入數據集的文件路徑,如下所示:
在這裏插入圖片描述
保存退出即可。

1.3 修改運行環境cmd.sh

修改timit/cmd.sh文件,將之前的內容註釋掉,換爲以下內容即可。

# run it locally...
export train_cmd=run.pl
export decode_cmd=run.pl
export cuda_cmd=run.pl
export mkgraph_cmd=run.pl

如下所示:
在這裏插入圖片描述
當然你也可以使用max-jobs-run 來限制最大運行限制數,如果你的機器CPU不是特別多,請一定加上這個條件,防止內存被耗盡。後邊的數目,根據自己電腦的配置修改。

export train_cmd="run.pl --max-jobs-run 10"
export decode_cmd="run.pl --max-jobs-run 10"
export cuda_cmd="run.pl --max-jobs-run 2"
export mkgraph_cmd="run.pl --max-jobs-run 10"

如果你是在 Oracle GridEngine 上運行,那麼請修改成類似的內容:

export train_cmd="queue.pl --mem 4G"
export decode_cmd="queue.pl --mem 4G"
export mkgraph_cmd="queue.pl --mem 8G"
export cuda_cmd="queue.pl --gpu 1"

1.4 運行run.sh(出現錯誤)

在運行./run.sh文件的時候,會提示報錯:提示沒有安裝install_irstlm.sh文件,錯誤如下:
在這裏插入圖片描述
按照提示我們轉到tools/extras文件夾下執行

./install_irstlm.sh

在這裏插入圖片描述
但是執行的時候提示說不在正確的路徑,仔細觀察才發現,原來我現在停留的路徑爲extras,而提示需要退到上一級路徑tools再來執行

./extras/install_irstlm.sh

執行成功之後有如下結果:
在這裏插入圖片描述
我們現在已經安裝成功:在這裏插入圖片描述

1.5 再次執行./run.sh

由於之前已經將所需要的各個組件安裝完畢,所以此次執行完之後沒有再出現錯誤,而是出現了正常的訓練結果:
在這裏插入圖片描述訓練完成後,結果如下所示:
在這裏插入圖片描述

2.訓練結束後生成的各部分文件介紹

2.1 流程介紹

處理流程 作用
local/timit_data_prep.sh 從訓練數據庫egs/timit/s5/data/TIMIT中抽取出訓練數據的目錄位置並寫到egs/timit/s5/data/local/data, 這裏使用的命令src/featbin/wav-to-duration
local/timit_prepare_dict.sh 生成字典數據並放至到/u01/kaldi/egs/timit/s5/data/local/dict,使用的命令/u01/kaldi/tools/irstlm/bin/compile-lm, /u01/kaldi/tools/irstlm/bin/build-lm.sh
utils/prepare_lang.sh 藉助字典數據生成語言模型並放至 /u01/kaldi/egs/timit/s5/data/lang,使用的命令utils/make_lexicon_fst.pl, utils/sym2int.pl, fstcompile, fstaddselfloops, fstarcsort
steps/make_mfcc.sh, steps/compute_cmvn_stats.sh 藉助local/timit_data_prep.sh生成的數據位置抽取出MFCC特徵,數據放到到 /u01/kaldi/egs/timit/s5/data/train,使用的命令compute-mfcc-feats, compute-cmvn-stats, copy-feats, copy-matrix
單因素訓練與解碼 作用
steps/train_mono.sh 藉助前兩步生成的mfcc和語言模型生成單音素,使用命令gmm-init-mono, compile-train-graphs , align-equal-compiled, gmm-acc-stats-ali, gmm-est, gmm-align-compiled
utils/mkgraph.s 生成decoding graph, 使用的命令fsttablecompose, fstminimizeencoded, fstisstochastic, fstcomposecontext, make-h-transducer, fstdeterminizestar, fstrmsymbols, fstrmepslocal, add-self-loops
steps/decode.sh 解碼數據,使用命令gmm-latgen-faster, gmm-decode-faster, compute-wer

2.2 生成結果預覽

訓練完成之後我們可以看到在S5目錄下出現了mfcc文件夾,這就是整個語音數據通過訓練產生的mfcc特徵值,以下爲其展開的目錄結構:
在這裏插入圖片描述我們可以看到這裏的文件都是一些ark文件和scp文件,這就是kaldi訓練生成的,我們可以使用kaldi自帶的 命令來進行解碼,先來看官方給出的查看語句及查看方法:

Copy features [and possibly change format]
Usage: copy-feats [options] <feature-rspecifier> <feature-wspecifier>
or:   copy-feats [options] <feats-rxfilename> <feats-wxfilename>
e.g.: copy-feats ark:- ark,scp:foo.ark,foo.scp
 or: copy-feats ark:foo.ark ark,t:txt.ark
See also: copy-matrix, copy-feats-to-htk, copy-feats-to-sphinx, select-feats,
extract-feature-segments, subset-feats, subsample-feats, splice-feats, paste-feats,
concat-feats

Options:
  --binary                    : Binary-mode output (not relevant if writing to archive) (bool, default = true)
  --compress                  : If true, write output in compressed form(only currently supported for wxfilename, i.e. archive/script,output) (bool, default = false)
  --compression-method        : Only relevant if --compress=true; the method (1 through 7) to compress the matrix.  Search for CompressionMethod in src/matrix/compressed-matrix.h. (int, default = 1)
  --htk-in                    : Read input as HTK features (bool, default = false)
  --sphinx-in                 : Read input as Sphinx features (bool, default = false)
  --write-num-frames          : Wspecifier to write length in frames of each utterance. e.g. 'ark,t:utt2num_frames'.  Only applicable if writing tables, not when this program is writing individual files.  See also feat-to-len. (string, default = "")

Standard options:
  --config                    : Configuration file to read (this option may be repeated) (string, default = "")
  --help                      : Print out usage message (bool, default = false)
  --print-args                : Print the command line arguments (to stderr) (bool, default = true)
  --verbose                   : Verbose level (higher->more logging) (int, default = 0)

根據上邊給出的提示語法,我們可以執行以下語句,將文件解碼至新文件txt.ark文件

/home/king/kaldi-test/src/featbin/copy-feats ark:raw_mfcc_test.8.ark ark,t:txt.ark
#其中前邊的 路徑可以根據自己的文件路徑自行修改爲自己kaldi所在路徑

產生之後如下所示:
在這裏插入圖片描述
對於成功生成的txt.ark文件,我們可以使用cat命令查看
在這裏插入圖片描述
這裏得到的就是我們需要得到的mfcc數據啦~

3.附上./run.sh整個的訓練過程

以下附上整個的訓練過程:

king@kingback19:~/kaldi-test/egs/timit/s5$ ./run.sh 
============================================================================
                Data & Lexicon & Language Preparation                     
============================================================================
wav-to-duration --read-entire-file=true scp:train_wav.scp ark,t:train_dur.ark 
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:92) Printed duration for 3696 audio files.
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:94) Mean duration was 3.06336, min and max durations were 0.91525, 7.78881
wav-to-duration --read-entire-file=true scp:dev_wav.scp ark,t:dev_dur.ark 
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:92) Printed duration for 400 audio files.
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:94) Mean duration was 3.08212, min and max durations were 1.09444, 7.43681
wav-to-duration --read-entire-file=true scp:test_wav.scp ark,t:test_dur.ark 
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:92) Printed duration for 192 audio files.
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:94) Mean duration was 3.03646, min and max durations were 1.30562, 6.21444
Data preparation succeeded
LOGFILE:/dev/null
$bin/ngt -i="$inpfile" -n=$order -gooout=y -o="$gzip -c > $tmpdir/ngram.${sdict}.gz" -fd="$tmpdir/$sdict" $dictionary $additional_parameters >> $logfile 2>&1
$bin/ngt -i="$inpfile" -n=$order -gooout=y -o="$gzip -c > $tmpdir/ngram.${sdict}.gz" -fd="$tmpdir/$sdict" $dictionary $additional_parameters >> $logfile 2>&1
$scr/build-sublm.pl $verbose $prune $prune_thr_str $smoothing "$additional_smoothing_parameters" --size $order --ngrams "$gunzip -c $tmpdir/ngram.${sdict}.gz" -sublm $tmpdir/lm.$sdict $additional_parameters >> $logfile 2>&1
inpfile: data/local/lm_tmp/lm_phone_bg.ilm.gz
outfile: /dev/stdout
loading up to the LM level 1000 (if any)
dub: 10000000
OOV code is 50
OOV code is 50
Saving in txt format to /dev/stdout
Dictionary & language model preparation succeeded
utils/prepare_lang.sh --sil-prob 0.0 --position-dependent-phones false --num-sil-states 3 data/local/dict sil data/local/lang_tmp data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK

Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/optional_silence.txt is OK

Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/lexicon.txt is OK

Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]

**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int 
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK

Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 47 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK

Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK

Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK

Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK

Checking data/lang/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK

Checking topo ...

Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK

--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
Preparing train, dev and test data
utils/validate_data_dir.sh: Successfully validated data-directory data/train
utils/validate_data_dir.sh: Successfully validated data-directory data/dev
utils/validate_data_dir.sh: Successfully validated data-directory data/test
Preparing language models for test
arpa2fst --disambig-symbol=#0 --read-symbol-table=data/lang_test_bg/words.txt - data/lang_test_bg/G.fst 
LOG (arpa2fst[5.5.613~1-56ee1]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.613~1-56ee1]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.613~1-56ee1]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
WARNING (arpa2fst[5.5.613~1-56ee1]:ConsumeNGram():arpa-lm-compiler.cc:313) line 60 [-3.26717	<s> <s>] skipped: n-gram has invalid BOS/EOS placement
LOG (arpa2fst[5.5.613~1-56ee1]:RemoveRedundantStates():arpa-lm-compiler.cc:359) Reduced num-states from 50 to 50
fstisstochastic data/lang_test_bg/G.fst 
0.000510126 -0.0763018
utils/validate_lang.pl data/lang_test_bg
Checking existence of separator file
separator file data/lang_test_bg/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang_test_bg/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang_test_bg/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang_test_bg/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang_test_bg/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.int corresponds to data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.csl corresponds to data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 47 entry/entries in data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.int corresponds to data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.csl corresponds to data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.int corresponds to data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.csl corresponds to data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.int corresponds to data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.csl corresponds to data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.int corresponds to data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.csl corresponds to data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang_test_bg/phones/roots.txt
--> data/lang_test_bg/phones/roots.int corresponds to data/lang_test_bg/phones/roots.txt
--> data/lang_test_bg/phones/roots.{txt, int} are OK

Checking data/lang_test_bg/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang_test_bg/phones/sets.txt
--> data/lang_test_bg/phones/sets.int corresponds to data/lang_test_bg/phones/sets.txt
--> data/lang_test_bg/phones/sets.{txt, int} are OK

Checking data/lang_test_bg/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang_test_bg/phones/extra_questions.txt
--> data/lang_test_bg/phones/extra_questions.int corresponds to data/lang_test_bg/phones/extra_questions.txt
--> data/lang_test_bg/phones/extra_questions.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang_test_bg/phones/disambig.txt has "#0" and "#1"
--> data/lang_test_bg/phones/disambig.txt is OK

Checking topo ...

Checking word-level disambiguation symbols...
--> data/lang_test_bg/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking data/lang_test_bg/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/oov.txt
--> data/lang_test_bg/oov.int corresponds to data/lang_test_bg/oov.txt
--> data/lang_test_bg/oov.{txt, int} are OK

--> data/lang_test_bg/L.fst is olabel sorted
--> data/lang_test_bg/L_disambig.fst is olabel sorted
--> data/lang_test_bg/G.fst is ilabel sorted
--> data/lang_test_bg/G.fst has 50 states
fstdeterminizestar data/lang_test_bg/G.fst /dev/null 
--> data/lang_test_bg/G.fst is determinizable
--> utils/lang/check_g_properties.pl successfully validated data/lang_test_bg/G.fst
--> utils/lang/check_g_properties.pl succeeded.
--> Testing determinizability of L_disambig . G
fstdeterminizestar 
fsttablecompose data/lang_test_bg/L_disambig.fst data/lang_test_bg/G.fst 
--> L_disambig . G is determinizable
--> SUCCESS [validating lang directory data/lang_test_bg]
Succeeded in formatting data.
============================================================================
         MFCC Feature Extration & CMVN for Training and Test set          
============================================================================
steps/make_mfcc.sh --cmd run.pl --nj 10 data/train exp/make_mfcc/train mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for train
steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train mfcc
Succeeded creating CMVN stats for train
steps/make_mfcc.sh --cmd run.pl --nj 10 data/dev exp/make_mfcc/dev mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/dev
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for dev
steps/compute_cmvn_stats.sh data/dev exp/make_mfcc/dev mfcc
Succeeded creating CMVN stats for dev
steps/make_mfcc.sh --cmd run.pl --nj 10 data/test exp/make_mfcc/test mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/test
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for test
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test
============================================================================
                     MonoPhone Training & Decoding                        
============================================================================
steps/train_mono.sh --nj 30 --cmd run.pl data/train data/lang exp/mono
steps/train_mono.sh: Initializing monophone system.
steps/train_mono.sh: Compiling training graphs
steps/train_mono.sh: Aligning data equally (pass 0)
steps/train_mono.sh: Pass 1
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 2
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 3
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 4
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 5
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 6
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 7
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 8
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 9
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 10
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 11
steps/train_mono.sh: Pass 12
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 13
steps/train_mono.sh: Pass 14
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 15
steps/train_mono.sh: Pass 16
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 17
steps/train_mono.sh: Pass 18
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 19
steps/train_mono.sh: Pass 20
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 21
steps/train_mono.sh: Pass 22
steps/train_mono.sh: Pass 23
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 24
steps/train_mono.sh: Pass 25
steps/train_mono.sh: Pass 26
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 27
steps/train_mono.sh: Pass 28
steps/train_mono.sh: Pass 29
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 30
steps/train_mono.sh: Pass 31
steps/train_mono.sh: Pass 32
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 33
steps/train_mono.sh: Pass 34
steps/train_mono.sh: Pass 35
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 36
steps/train_mono.sh: Pass 37
steps/train_mono.sh: Pass 38
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 39
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/mono
steps/diagnostic/analyze_alignments.sh: see stats in exp/mono/log/analyze_alignments.log
2 warnings in exp/mono/log/align.*.*.log
exp/mono: nj=30 align prob=-99.15 over 3.12h [retry=0.0%, fail=0.0%] states=144 gauss=987
steps/train_mono.sh: Done training monophone system in exp/mono
tree-info exp/mono/tree 
tree-info exp/mono/tree 
fsttablecompose data/lang_test_bg/L_disambig.fst data/lang_test_bg/G.fst 
fstdeterminizestar --use-log=true 
fstminimizeencoded 
fstpushspecial 
fstisstochastic data/lang_test_bg/tmp/LG.fst 
-0.00841336 -0.00928521
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test_bg/phones/disambig.int --write-disambig-syms=data/lang_test_bg/tmp/disambig_ilabels_1_0.int data/lang_test_bg/tmp/ilabels_1_0.6152 data/lang_test_bg/tmp/LG.fst 
fstisstochastic data/lang_test_bg/tmp/CLG_1_0.fst 
-0.00841336 -0.00928521
make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl 
fstdeterminizestar --use-log=true 
fstrmsymbols exp/mono/graph/disambig_tid.int 
fsttablecompose exp/mono/graph/Ha.fst data/lang_test_bg/tmp/CLG_1_0.fst 
fstrmepslocal 
fstminimizeencoded 
fstisstochastic exp/mono/graph/HCLGa.fst 
0.000381709 -0.00951555
add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl exp/mono/graph/HCLGa.fst 
steps/decode.sh --nj 5 --cmd run.pl exp/mono/graph data/dev exp/mono/decode_dev
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/mono/graph exp/mono/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(5,25,121) and mean=56.6
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode.sh --nj 5 --cmd run.pl exp/mono/graph data/test exp/mono/decode_test
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/mono/graph exp/mono/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(6,27,142) and mean=72.3
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
           tri1 : Deltas + Delta-Deltas Training & Decoding               
============================================================================
steps/align_si.sh --boost-silence 1.25 --nj 30 --cmd run.pl data/train data/lang exp/mono exp/mono_ali
steps/align_si.sh: feature type is delta
steps/align_si.sh: aligning data in data/train using model from exp/mono, putting alignments in exp/mono_ali
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/mono_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/mono_ali/log/analyze_alignments.log
steps/align_si.sh: done aligning data.
steps/train_deltas.sh --cmd run.pl 2500 15000 data/train data/lang exp/mono_ali exp/tri1
steps/train_deltas.sh: accumulating tree stats
steps/train_deltas.sh: getting questions for tree-building, via clustering
steps/train_deltas.sh: building the tree
steps/train_deltas.sh: converting alignments from exp/mono_ali to use current tree
steps/train_deltas.sh: compiling graphs of transcripts
steps/train_deltas.sh: training pass 1
steps/train_deltas.sh: training pass 2
steps/train_deltas.sh: training pass 3
steps/train_deltas.sh: training pass 4
steps/train_deltas.sh: training pass 5
steps/train_deltas.sh: training pass 6
steps/train_deltas.sh: training pass 7
steps/train_deltas.sh: training pass 8
steps/train_deltas.sh: training pass 9
steps/train_deltas.sh: training pass 10
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 11
steps/train_deltas.sh: training pass 12
steps/train_deltas.sh: training pass 13
steps/train_deltas.sh: training pass 14
steps/train_deltas.sh: training pass 15
steps/train_deltas.sh: training pass 16
steps/train_deltas.sh: training pass 17
steps/train_deltas.sh: training pass 18
steps/train_deltas.sh: training pass 19
steps/train_deltas.sh: training pass 20
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 21
steps/train_deltas.sh: training pass 22
steps/train_deltas.sh: training pass 23
steps/train_deltas.sh: training pass 24
steps/train_deltas.sh: training pass 25
steps/train_deltas.sh: training pass 26
steps/train_deltas.sh: training pass 27
steps/train_deltas.sh: training pass 28
steps/train_deltas.sh: training pass 29
steps/train_deltas.sh: training pass 30
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 31
steps/train_deltas.sh: training pass 32
steps/train_deltas.sh: training pass 33
steps/train_deltas.sh: training pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri1
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri1/log/analyze_alignments.log
1 warnings in exp/tri1/log/compile_questions.log
69 warnings in exp/tri1/log/init_model.log
44 warnings in exp/tri1/log/update.*.log
exp/tri1: nj=30 align prob=-95.28 over 3.12h [retry=0.0%, fail=0.0%] states=1888 gauss=15044 tree-impr=5.40
steps/train_deltas.sh: Done training system with delta+delta-delta features in exp/tri1
tree-info exp/tri1/tree 
tree-info exp/tri1/tree 
fstcomposecontext --context-size=3 --central-position=1 --read-disambig-syms=data/lang_test_bg/phones/disambig.int --write-disambig-syms=data/lang_test_bg/tmp/disambig_ilabels_3_1.int data/lang_test_bg/tmp/ilabels_3_1.3130 data/lang_test_bg/tmp/LG.fst 
fstisstochastic data/lang_test_bg/tmp/CLG_3_1.fst 
0 -0.00928518
make-h-transducer --disambig-syms-out=exp/tri1/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_3_1 exp/tri1/tree exp/tri1/final.mdl 
fstdeterminizestar --use-log=true 
fsttablecompose exp/tri1/graph/Ha.fst data/lang_test_bg/tmp/CLG_3_1.fst 
fstminimizeencoded 
fstrmepslocal 
fstrmsymbols exp/tri1/graph/disambig_tid.int 
fstisstochastic exp/tri1/graph/HCLGa.fst 
0.000449687 -0.0175772
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri1/final.mdl exp/tri1/graph/HCLGa.fst 
steps/decode.sh --nj 5 --cmd run.pl exp/tri1/graph data/dev exp/tri1/decode_dev
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri1/graph exp/tri1/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(3,11,42) and mean=19.0
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode.sh --nj 5 --cmd run.pl exp/tri1/graph data/test exp/tri1/decode_test
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri1/graph exp/tri1/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(3,12,48) and mean=21.6
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
                 tri2 : LDA + MLLT Training & Decoding                    
============================================================================
steps/align_si.sh --nj 30 --cmd run.pl data/train data/lang exp/tri1 exp/tri1_ali
steps/align_si.sh: feature type is delta
steps/align_si.sh: aligning data in data/train using model from exp/tri1, putting alignments in exp/tri1_ali
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri1_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri1_ali/log/analyze_alignments.log
steps/align_si.sh: done aligning data.
steps/train_lda_mllt.sh --cmd run.pl --splice-opts --left-context=3 --right-context=3 2500 15000 data/train data/lang exp/tri1_ali exp/tri2
steps/train_lda_mllt.sh: Accumulating LDA statistics.
steps/train_lda_mllt.sh: Accumulating tree stats
steps/train_lda_mllt.sh: Getting questions for tree clustering.
steps/train_lda_mllt.sh: Building the tree
steps/train_lda_mllt.sh: Initializing the model
steps/train_lda_mllt.sh: Converting alignments from exp/tri1_ali to use current tree
steps/train_lda_mllt.sh: Compiling graphs of transcripts
Training pass 1
Training pass 2
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 3
Training pass 4
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 5
Training pass 6
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 7
Training pass 8
Training pass 9
Training pass 10
Aligning data
Training pass 11
Training pass 12
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 13
Training pass 14
Training pass 15
Training pass 16
Training pass 17
Training pass 18
Training pass 19
Training pass 20
Aligning data
Training pass 21
Training pass 22
Training pass 23
Training pass 24
Training pass 25
Training pass 26
Training pass 27
Training pass 28
Training pass 29
Training pass 30
Aligning data
Training pass 31
Training pass 32
Training pass 33
Training pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri2
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri2/log/analyze_alignments.log
96 warnings in exp/tri2/log/init_model.log
1 warnings in exp/tri2/log/compile_questions.log
152 warnings in exp/tri2/log/update.*.log
exp/tri2: nj=30 align prob=-47.90 over 3.12h [retry=0.0%, fail=0.0%] states=2024 gauss=15024 tree-impr=5.58 lda-sum=28.50 mllt:impr,logdet=1.63,2.19
steps/train_lda_mllt.sh: Done training system with LDA+MLLT features in exp/tri2
tree-info exp/tri2/tree 
tree-info exp/tri2/tree 
make-h-transducer --disambig-syms-out=exp/tri2/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_3_1 exp/tri2/tree exp/tri2/final.mdl 
fstdeterminizestar --use-log=true 
fstrmepslocal 
fstminimizeencoded 
fstrmsymbols exp/tri2/graph/disambig_tid.int 
fsttablecompose exp/tri2/graph/Ha.fst data/lang_test_bg/tmp/CLG_3_1.fst 
fstisstochastic exp/tri2/graph/HCLGa.fst 
0.000462607 -0.0175772
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri2/final.mdl exp/tri2/graph/HCLGa.fst 
steps/decode.sh --nj 5 --cmd run.pl exp/tri2/graph data/dev exp/tri2/decode_dev
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri2/graph exp/tri2/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,8,28) and mean=13.2
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode.sh --nj 5 --cmd run.pl exp/tri2/graph data/test exp/tri2/decode_test
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri2/graph exp/tri2/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,9,32) and mean=14.3
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
              tri3 : LDA + MLLT + SAT Training & Decoding                 
============================================================================
steps/align_si.sh --nj 30 --cmd run.pl --use-graphs true data/train data/lang exp/tri2 exp/tri2_ali
steps/align_si.sh: feature type is lda
steps/align_si.sh: aligning data in data/train using model from exp/tri2, putting alignments in exp/tri2_ali
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri2_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri2_ali/log/analyze_alignments.log
steps/align_si.sh: done aligning data.
steps/train_sat.sh --cmd run.pl 2500 15000 data/train data/lang exp/tri2_ali exp/tri3
steps/train_sat.sh: feature type is lda
steps/train_sat.sh: obtaining initial fMLLR transforms since not present in exp/tri2_ali
steps/train_sat.sh: Accumulating tree stats
steps/train_sat.sh: Getting questions for tree clustering.
steps/train_sat.sh: Building the tree
steps/train_sat.sh: Initializing the model
steps/train_sat.sh: Converting alignments from exp/tri2_ali to use current tree
steps/train_sat.sh: Compiling graphs of transcripts
Pass 1
Pass 2
Estimating fMLLR transforms
Pass 3
Pass 4
Estimating fMLLR transforms
Pass 5
Pass 6
Estimating fMLLR transforms
Pass 7
Pass 8
Pass 9
Pass 10
Aligning data
Pass 11
Pass 12
Estimating fMLLR transforms
Pass 13
Pass 14
Pass 15
Pass 16
Pass 17
Pass 18
Pass 19
Pass 20
Aligning data
Pass 21
Pass 22
Pass 23
Pass 24
Pass 25
Pass 26
Pass 27
Pass 28
Pass 29
Pass 30
Aligning data
Pass 31
Pass 32
Pass 33
Pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri3
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri3/log/analyze_alignments.log
1 warnings in exp/tri3/log/est_alimdl.log
38 warnings in exp/tri3/log/init_model.log
1 warnings in exp/tri3/log/compile_questions.log
16 warnings in exp/tri3/log/update.*.log
steps/train_sat.sh: Likelihood evolution:
-50.1967 -49.3093 -49.107 -48.9006 -48.2007 -47.5136 -47.0927 -46.8195 -46.5561 -46.0203 -45.765 -45.4413 -45.2516 -45.1079 -44.9839 -44.8753 -44.7675 -44.6596 -44.5571 -44.3946 -44.2562 -44.168 -44.0857 -44.0061 -43.9282 -43.8526 -43.7788 -43.7062 -43.6342 -43.5389 -43.4641 -43.4386 -43.4224 -43.4104 
exp/tri3: nj=30 align prob=-47.05 over 3.12h [retry=0.0%, fail=0.0%] states=1928 gauss=15017 fmllr-impr=4.07 over 2.79h tree-impr=8.75
steps/train_sat.sh: done training SAT system in exp/tri3
tree-info exp/tri3/tree 
tree-info exp/tri3/tree 
make-h-transducer --disambig-syms-out=exp/tri3/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_3_1 exp/tri3/tree exp/tri3/final.mdl 
fsttablecompose exp/tri3/graph/Ha.fst data/lang_test_bg/tmp/CLG_3_1.fst 
fstdeterminizestar --use-log=true 
fstrmsymbols exp/tri3/graph/disambig_tid.int 
fstminimizeencoded 
fstrmepslocal 
fstisstochastic exp/tri3/graph/HCLGa.fst 
0.000461769 -0.0175772
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri3/final.mdl exp/tri3/graph/HCLGa.fst 
steps/decode_fmllr.sh --nj 5 --cmd run.pl exp/tri3/graph data/dev exp/tri3/decode_dev
steps/decode.sh --scoring-opts  --num-threads 1 --skip-scoring false --acwt 0.083333 --nj 5 --cmd run.pl --beam 10.0 --model exp/tri3/final.alimdl --max-active 2000 exp/tri3/graph data/dev exp/tri3/decode_dev.si
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_dev.si
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev.si/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,9,33) and mean=15.3
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev.si/log/analyze_lattice_depth_stats.log
steps/decode_fmllr.sh: feature type is lda
steps/decode_fmllr.sh: getting first-pass fMLLR transforms.
steps/decode_fmllr.sh: doing main lattice generation phase
steps/decode_fmllr.sh: estimating fMLLR transforms a second time.
steps/decode_fmllr.sh: doing a final pass of acoustic rescoring.
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,5,16) and mean=7.6
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode_fmllr.sh --nj 5 --cmd run.pl exp/tri3/graph data/test exp/tri3/decode_test
steps/decode.sh --scoring-opts  --num-threads 1 --skip-scoring false --acwt 0.083333 --nj 5 --cmd run.pl --beam 10.0 --model exp/tri3/final.alimdl --max-active 2000 exp/tri3/graph data/test exp/tri3/decode_test.si
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_test.si
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test.si/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,10,35) and mean=16.2
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test.si/log/analyze_lattice_depth_stats.log
steps/decode_fmllr.sh: feature type is lda
steps/decode_fmllr.sh: getting first-pass fMLLR transforms.
steps/decode_fmllr.sh: doing main lattice generation phase
steps/decode_fmllr.sh: estimating fMLLR transforms a second time.
steps/decode_fmllr.sh: doing a final pass of acoustic rescoring.
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,5,18) and mean=8.6
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
                        SGMM2 Training & Decoding                         
============================================================================
steps/align_fmllr.sh --nj 30 --cmd run.pl data/train data/lang exp/tri3 exp/tri3_ali
steps/align_fmllr.sh: feature type is lda
steps/align_fmllr.sh: compiling training graphs
steps/align_fmllr.sh: aligning data in data/train using exp/tri3/final.alimdl and speaker-independent features.
steps/align_fmllr.sh: computing fMLLR transforms
steps/align_fmllr.sh: doing final alignment.
steps/align_fmllr.sh: done aligning data.
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri3_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri3_ali/log/analyze_alignments.log

4.參考文檔

此篇博客的順利完成,首先要感謝以下幾篇博客對於我的幫助,沒有以下幾篇博客的幫助,也很難得以成功。再次感謝以下幾位大佬提供的幫助。
Kaldi在語音數據庫timit上的聲學和語音模型訓練–1
Ubuntu16.04用kaldi跑timit數據集

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章