kaldi语音识别

1.timit实例

TIMIT全称The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus, 是由德州仪器(TI)、麻省理工学院(MIT)和坦福研究院(SRI)合作构建的声学-音素连续语音语料库。TIMIT数据集的语音采样频率为16kHz,一共包含6300个句子,由来自美国八个主要方言地区的630个人每人说出给定的10个句子,所有的句子都在音素级别(phone level)上进行了手动分割,标记。
TIMIT语料库多年来已经成为语音识别社区的一个标准数据库,在今天仍被广为使用。其原因主要有两个方面:数据集中的每一个句子都在音素级别上进行了手动标记,同时提供了说话人的编号,性别,方言种类等多种信息;数据集相对来说比较小,可以在较短的时间内完成整个实验;同时又足以展现系统的性能。

1.1 timit数据集下载

本例子的测试需要用到timit提供的原始数据集,但是在kaldi里边并未提供该测试数据,我们还需要自己手动下载,其下载连接如下:
Timit数据集下载网址:戳这里!!!
下载完成之后要将其放到如下目录
在这里插入图片描述
例如此处我的路径为

/home/king/kaldi-test/egs/timit/s5/data/TIMIT

如果文件无法找到,大家也可以下载我上传到csdn的全部语音数据,文件大小为500多M,由于文件大小限制,我分为了三部分,这里我将文件下载链接放到下方:

1.TIMIT语音资料库Part1
2.TIMIT语音资料库Part2–TRAIN第一部分
3.TIMIT语音资料库Part3–TRAIN第二部分

1.2修改run.sh

进入egs/timit/s5文件夹路径下,编辑run.sh

vi run.sh

将里边的timit指向的内容指定为我们刚刚存入数据集的文件路径,如下所示:
在这里插入图片描述
保存退出即可。

1.3 修改运行环境cmd.sh

修改timit/cmd.sh文件,将之前的内容注释掉,换为以下内容即可。

# run it locally...
export train_cmd=run.pl
export decode_cmd=run.pl
export cuda_cmd=run.pl
export mkgraph_cmd=run.pl

如下所示:
在这里插入图片描述
当然你也可以使用max-jobs-run 来限制最大运行限制数,如果你的机器CPU不是特别多,请一定加上这个条件,防止内存被耗尽。后边的数目,根据自己电脑的配置修改。

export train_cmd="run.pl --max-jobs-run 10"
export decode_cmd="run.pl --max-jobs-run 10"
export cuda_cmd="run.pl --max-jobs-run 2"
export mkgraph_cmd="run.pl --max-jobs-run 10"

如果你是在 Oracle GridEngine 上运行,那么请修改成类似的内容:

export train_cmd="queue.pl --mem 4G"
export decode_cmd="queue.pl --mem 4G"
export mkgraph_cmd="queue.pl --mem 8G"
export cuda_cmd="queue.pl --gpu 1"

1.4 运行run.sh(出现错误)

在运行./run.sh文件的时候,会提示报错:提示没有安装install_irstlm.sh文件,错误如下:
在这里插入图片描述
按照提示我们转到tools/extras文件夹下执行

./install_irstlm.sh

在这里插入图片描述
但是执行的时候提示说不在正确的路径,仔细观察才发现,原来我现在停留的路径为extras,而提示需要退到上一级路径tools再来执行

./extras/install_irstlm.sh

执行成功之后有如下结果:
在这里插入图片描述
我们现在已经安装成功:在这里插入图片描述

1.5 再次执行./run.sh

由于之前已经将所需要的各个组件安装完毕,所以此次执行完之后没有再出现错误,而是出现了正常的训练结果:
在这里插入图片描述训练完成后,结果如下所示:
在这里插入图片描述

2.训练结束后生成的各部分文件介绍

2.1 流程介绍

处理流程 作用
local/timit_data_prep.sh 从训练数据库egs/timit/s5/data/TIMIT中抽取出训练数据的目录位置并写到egs/timit/s5/data/local/data, 这里使用的命令src/featbin/wav-to-duration
local/timit_prepare_dict.sh 生成字典数据并放至到/u01/kaldi/egs/timit/s5/data/local/dict,使用的命令/u01/kaldi/tools/irstlm/bin/compile-lm, /u01/kaldi/tools/irstlm/bin/build-lm.sh
utils/prepare_lang.sh 借助字典数据生成语言模型并放至 /u01/kaldi/egs/timit/s5/data/lang,使用的命令utils/make_lexicon_fst.pl, utils/sym2int.pl, fstcompile, fstaddselfloops, fstarcsort
steps/make_mfcc.sh, steps/compute_cmvn_stats.sh 借助local/timit_data_prep.sh生成的数据位置抽取出MFCC特征,数据放到到 /u01/kaldi/egs/timit/s5/data/train,使用的命令compute-mfcc-feats, compute-cmvn-stats, copy-feats, copy-matrix
单因素训练与解码 作用
steps/train_mono.sh 借助前两步生成的mfcc和语言模型生成单音素,使用命令gmm-init-mono, compile-train-graphs , align-equal-compiled, gmm-acc-stats-ali, gmm-est, gmm-align-compiled
utils/mkgraph.s 生成decoding graph, 使用的命令fsttablecompose, fstminimizeencoded, fstisstochastic, fstcomposecontext, make-h-transducer, fstdeterminizestar, fstrmsymbols, fstrmepslocal, add-self-loops
steps/decode.sh 解码数据,使用命令gmm-latgen-faster, gmm-decode-faster, compute-wer

2.2 生成结果预览

训练完成之后我们可以看到在S5目录下出现了mfcc文件夹,这就是整个语音数据通过训练产生的mfcc特征值,以下为其展开的目录结构:
在这里插入图片描述我们可以看到这里的文件都是一些ark文件和scp文件,这就是kaldi训练生成的,我们可以使用kaldi自带的 命令来进行解码,先来看官方给出的查看语句及查看方法:

Copy features [and possibly change format]
Usage: copy-feats [options] <feature-rspecifier> <feature-wspecifier>
or:   copy-feats [options] <feats-rxfilename> <feats-wxfilename>
e.g.: copy-feats ark:- ark,scp:foo.ark,foo.scp
 or: copy-feats ark:foo.ark ark,t:txt.ark
See also: copy-matrix, copy-feats-to-htk, copy-feats-to-sphinx, select-feats,
extract-feature-segments, subset-feats, subsample-feats, splice-feats, paste-feats,
concat-feats

Options:
  --binary                    : Binary-mode output (not relevant if writing to archive) (bool, default = true)
  --compress                  : If true, write output in compressed form(only currently supported for wxfilename, i.e. archive/script,output) (bool, default = false)
  --compression-method        : Only relevant if --compress=true; the method (1 through 7) to compress the matrix.  Search for CompressionMethod in src/matrix/compressed-matrix.h. (int, default = 1)
  --htk-in                    : Read input as HTK features (bool, default = false)
  --sphinx-in                 : Read input as Sphinx features (bool, default = false)
  --write-num-frames          : Wspecifier to write length in frames of each utterance. e.g. 'ark,t:utt2num_frames'.  Only applicable if writing tables, not when this program is writing individual files.  See also feat-to-len. (string, default = "")

Standard options:
  --config                    : Configuration file to read (this option may be repeated) (string, default = "")
  --help                      : Print out usage message (bool, default = false)
  --print-args                : Print the command line arguments (to stderr) (bool, default = true)
  --verbose                   : Verbose level (higher->more logging) (int, default = 0)

根据上边给出的提示语法,我们可以执行以下语句,将文件解码至新文件txt.ark文件

/home/king/kaldi-test/src/featbin/copy-feats ark:raw_mfcc_test.8.ark ark,t:txt.ark
#其中前边的 路径可以根据自己的文件路径自行修改为自己kaldi所在路径

产生之后如下所示:
在这里插入图片描述
对于成功生成的txt.ark文件,我们可以使用cat命令查看
在这里插入图片描述
这里得到的就是我们需要得到的mfcc数据啦~

3.附上./run.sh整个的训练过程

以下附上整个的训练过程:

king@kingback19:~/kaldi-test/egs/timit/s5$ ./run.sh 
============================================================================
                Data & Lexicon & Language Preparation                     
============================================================================
wav-to-duration --read-entire-file=true scp:train_wav.scp ark,t:train_dur.ark 
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:92) Printed duration for 3696 audio files.
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:94) Mean duration was 3.06336, min and max durations were 0.91525, 7.78881
wav-to-duration --read-entire-file=true scp:dev_wav.scp ark,t:dev_dur.ark 
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:92) Printed duration for 400 audio files.
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:94) Mean duration was 3.08212, min and max durations were 1.09444, 7.43681
wav-to-duration --read-entire-file=true scp:test_wav.scp ark,t:test_dur.ark 
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:92) Printed duration for 192 audio files.
LOG (wav-to-duration[5.5.613~1-56ee1]:main():wav-to-duration.cc:94) Mean duration was 3.03646, min and max durations were 1.30562, 6.21444
Data preparation succeeded
LOGFILE:/dev/null
$bin/ngt -i="$inpfile" -n=$order -gooout=y -o="$gzip -c > $tmpdir/ngram.${sdict}.gz" -fd="$tmpdir/$sdict" $dictionary $additional_parameters >> $logfile 2>&1
$bin/ngt -i="$inpfile" -n=$order -gooout=y -o="$gzip -c > $tmpdir/ngram.${sdict}.gz" -fd="$tmpdir/$sdict" $dictionary $additional_parameters >> $logfile 2>&1
$scr/build-sublm.pl $verbose $prune $prune_thr_str $smoothing "$additional_smoothing_parameters" --size $order --ngrams "$gunzip -c $tmpdir/ngram.${sdict}.gz" -sublm $tmpdir/lm.$sdict $additional_parameters >> $logfile 2>&1
inpfile: data/local/lm_tmp/lm_phone_bg.ilm.gz
outfile: /dev/stdout
loading up to the LM level 1000 (if any)
dub: 10000000
OOV code is 50
OOV code is 50
Saving in txt format to /dev/stdout
Dictionary & language model preparation succeeded
utils/prepare_lang.sh --sil-prob 0.0 --position-dependent-phones false --num-sil-states 3 data/local/dict sil data/local/lang_tmp data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK

Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/optional_silence.txt is OK

Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/lexicon.txt is OK

Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]

**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int 
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK

Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 47 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK

Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK

Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK

Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK

Checking data/lang/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK

Checking topo ...

Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK

--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
Preparing train, dev and test data
utils/validate_data_dir.sh: Successfully validated data-directory data/train
utils/validate_data_dir.sh: Successfully validated data-directory data/dev
utils/validate_data_dir.sh: Successfully validated data-directory data/test
Preparing language models for test
arpa2fst --disambig-symbol=#0 --read-symbol-table=data/lang_test_bg/words.txt - data/lang_test_bg/G.fst 
LOG (arpa2fst[5.5.613~1-56ee1]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.613~1-56ee1]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.613~1-56ee1]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
WARNING (arpa2fst[5.5.613~1-56ee1]:ConsumeNGram():arpa-lm-compiler.cc:313) line 60 [-3.26717	<s> <s>] skipped: n-gram has invalid BOS/EOS placement
LOG (arpa2fst[5.5.613~1-56ee1]:RemoveRedundantStates():arpa-lm-compiler.cc:359) Reduced num-states from 50 to 50
fstisstochastic data/lang_test_bg/G.fst 
0.000510126 -0.0763018
utils/validate_lang.pl data/lang_test_bg
Checking existence of separator file
separator file data/lang_test_bg/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang_test_bg/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang_test_bg/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang_test_bg/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang_test_bg/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.int corresponds to data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.csl corresponds to data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 47 entry/entries in data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.int corresponds to data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.csl corresponds to data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.int corresponds to data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.csl corresponds to data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.int corresponds to data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.csl corresponds to data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.int corresponds to data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.csl corresponds to data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.{txt, int, csl} are OK

Checking data/lang_test_bg/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang_test_bg/phones/roots.txt
--> data/lang_test_bg/phones/roots.int corresponds to data/lang_test_bg/phones/roots.txt
--> data/lang_test_bg/phones/roots.{txt, int} are OK

Checking data/lang_test_bg/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 48 entry/entries in data/lang_test_bg/phones/sets.txt
--> data/lang_test_bg/phones/sets.int corresponds to data/lang_test_bg/phones/sets.txt
--> data/lang_test_bg/phones/sets.{txt, int} are OK

Checking data/lang_test_bg/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 2 entry/entries in data/lang_test_bg/phones/extra_questions.txt
--> data/lang_test_bg/phones/extra_questions.int corresponds to data/lang_test_bg/phones/extra_questions.txt
--> data/lang_test_bg/phones/extra_questions.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang_test_bg/phones/disambig.txt has "#0" and "#1"
--> data/lang_test_bg/phones/disambig.txt is OK

Checking topo ...

Checking word-level disambiguation symbols...
--> data/lang_test_bg/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking data/lang_test_bg/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang_test_bg/oov.txt
--> data/lang_test_bg/oov.int corresponds to data/lang_test_bg/oov.txt
--> data/lang_test_bg/oov.{txt, int} are OK

--> data/lang_test_bg/L.fst is olabel sorted
--> data/lang_test_bg/L_disambig.fst is olabel sorted
--> data/lang_test_bg/G.fst is ilabel sorted
--> data/lang_test_bg/G.fst has 50 states
fstdeterminizestar data/lang_test_bg/G.fst /dev/null 
--> data/lang_test_bg/G.fst is determinizable
--> utils/lang/check_g_properties.pl successfully validated data/lang_test_bg/G.fst
--> utils/lang/check_g_properties.pl succeeded.
--> Testing determinizability of L_disambig . G
fstdeterminizestar 
fsttablecompose data/lang_test_bg/L_disambig.fst data/lang_test_bg/G.fst 
--> L_disambig . G is determinizable
--> SUCCESS [validating lang directory data/lang_test_bg]
Succeeded in formatting data.
============================================================================
         MFCC Feature Extration & CMVN for Training and Test set          
============================================================================
steps/make_mfcc.sh --cmd run.pl --nj 10 data/train exp/make_mfcc/train mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for train
steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train mfcc
Succeeded creating CMVN stats for train
steps/make_mfcc.sh --cmd run.pl --nj 10 data/dev exp/make_mfcc/dev mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/dev
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for dev
steps/compute_cmvn_stats.sh data/dev exp/make_mfcc/dev mfcc
Succeeded creating CMVN stats for dev
steps/make_mfcc.sh --cmd run.pl --nj 10 data/test exp/make_mfcc/test mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/test
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
steps/make_mfcc.sh: Succeeded creating MFCC features for test
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test
============================================================================
                     MonoPhone Training & Decoding                        
============================================================================
steps/train_mono.sh --nj 30 --cmd run.pl data/train data/lang exp/mono
steps/train_mono.sh: Initializing monophone system.
steps/train_mono.sh: Compiling training graphs
steps/train_mono.sh: Aligning data equally (pass 0)
steps/train_mono.sh: Pass 1
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 2
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 3
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 4
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 5
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 6
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 7
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 8
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 9
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 10
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 11
steps/train_mono.sh: Pass 12
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 13
steps/train_mono.sh: Pass 14
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 15
steps/train_mono.sh: Pass 16
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 17
steps/train_mono.sh: Pass 18
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 19
steps/train_mono.sh: Pass 20
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 21
steps/train_mono.sh: Pass 22
steps/train_mono.sh: Pass 23
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 24
steps/train_mono.sh: Pass 25
steps/train_mono.sh: Pass 26
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 27
steps/train_mono.sh: Pass 28
steps/train_mono.sh: Pass 29
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 30
steps/train_mono.sh: Pass 31
steps/train_mono.sh: Pass 32
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 33
steps/train_mono.sh: Pass 34
steps/train_mono.sh: Pass 35
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 36
steps/train_mono.sh: Pass 37
steps/train_mono.sh: Pass 38
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 39
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/mono
steps/diagnostic/analyze_alignments.sh: see stats in exp/mono/log/analyze_alignments.log
2 warnings in exp/mono/log/align.*.*.log
exp/mono: nj=30 align prob=-99.15 over 3.12h [retry=0.0%, fail=0.0%] states=144 gauss=987
steps/train_mono.sh: Done training monophone system in exp/mono
tree-info exp/mono/tree 
tree-info exp/mono/tree 
fsttablecompose data/lang_test_bg/L_disambig.fst data/lang_test_bg/G.fst 
fstdeterminizestar --use-log=true 
fstminimizeencoded 
fstpushspecial 
fstisstochastic data/lang_test_bg/tmp/LG.fst 
-0.00841336 -0.00928521
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test_bg/phones/disambig.int --write-disambig-syms=data/lang_test_bg/tmp/disambig_ilabels_1_0.int data/lang_test_bg/tmp/ilabels_1_0.6152 data/lang_test_bg/tmp/LG.fst 
fstisstochastic data/lang_test_bg/tmp/CLG_1_0.fst 
-0.00841336 -0.00928521
make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl 
fstdeterminizestar --use-log=true 
fstrmsymbols exp/mono/graph/disambig_tid.int 
fsttablecompose exp/mono/graph/Ha.fst data/lang_test_bg/tmp/CLG_1_0.fst 
fstrmepslocal 
fstminimizeencoded 
fstisstochastic exp/mono/graph/HCLGa.fst 
0.000381709 -0.00951555
add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl exp/mono/graph/HCLGa.fst 
steps/decode.sh --nj 5 --cmd run.pl exp/mono/graph data/dev exp/mono/decode_dev
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/mono/graph exp/mono/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(5,25,121) and mean=56.6
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode.sh --nj 5 --cmd run.pl exp/mono/graph data/test exp/mono/decode_test
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/mono/graph exp/mono/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(6,27,142) and mean=72.3
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
           tri1 : Deltas + Delta-Deltas Training & Decoding               
============================================================================
steps/align_si.sh --boost-silence 1.25 --nj 30 --cmd run.pl data/train data/lang exp/mono exp/mono_ali
steps/align_si.sh: feature type is delta
steps/align_si.sh: aligning data in data/train using model from exp/mono, putting alignments in exp/mono_ali
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/mono_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/mono_ali/log/analyze_alignments.log
steps/align_si.sh: done aligning data.
steps/train_deltas.sh --cmd run.pl 2500 15000 data/train data/lang exp/mono_ali exp/tri1
steps/train_deltas.sh: accumulating tree stats
steps/train_deltas.sh: getting questions for tree-building, via clustering
steps/train_deltas.sh: building the tree
steps/train_deltas.sh: converting alignments from exp/mono_ali to use current tree
steps/train_deltas.sh: compiling graphs of transcripts
steps/train_deltas.sh: training pass 1
steps/train_deltas.sh: training pass 2
steps/train_deltas.sh: training pass 3
steps/train_deltas.sh: training pass 4
steps/train_deltas.sh: training pass 5
steps/train_deltas.sh: training pass 6
steps/train_deltas.sh: training pass 7
steps/train_deltas.sh: training pass 8
steps/train_deltas.sh: training pass 9
steps/train_deltas.sh: training pass 10
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 11
steps/train_deltas.sh: training pass 12
steps/train_deltas.sh: training pass 13
steps/train_deltas.sh: training pass 14
steps/train_deltas.sh: training pass 15
steps/train_deltas.sh: training pass 16
steps/train_deltas.sh: training pass 17
steps/train_deltas.sh: training pass 18
steps/train_deltas.sh: training pass 19
steps/train_deltas.sh: training pass 20
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 21
steps/train_deltas.sh: training pass 22
steps/train_deltas.sh: training pass 23
steps/train_deltas.sh: training pass 24
steps/train_deltas.sh: training pass 25
steps/train_deltas.sh: training pass 26
steps/train_deltas.sh: training pass 27
steps/train_deltas.sh: training pass 28
steps/train_deltas.sh: training pass 29
steps/train_deltas.sh: training pass 30
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 31
steps/train_deltas.sh: training pass 32
steps/train_deltas.sh: training pass 33
steps/train_deltas.sh: training pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri1
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri1/log/analyze_alignments.log
1 warnings in exp/tri1/log/compile_questions.log
69 warnings in exp/tri1/log/init_model.log
44 warnings in exp/tri1/log/update.*.log
exp/tri1: nj=30 align prob=-95.28 over 3.12h [retry=0.0%, fail=0.0%] states=1888 gauss=15044 tree-impr=5.40
steps/train_deltas.sh: Done training system with delta+delta-delta features in exp/tri1
tree-info exp/tri1/tree 
tree-info exp/tri1/tree 
fstcomposecontext --context-size=3 --central-position=1 --read-disambig-syms=data/lang_test_bg/phones/disambig.int --write-disambig-syms=data/lang_test_bg/tmp/disambig_ilabels_3_1.int data/lang_test_bg/tmp/ilabels_3_1.3130 data/lang_test_bg/tmp/LG.fst 
fstisstochastic data/lang_test_bg/tmp/CLG_3_1.fst 
0 -0.00928518
make-h-transducer --disambig-syms-out=exp/tri1/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_3_1 exp/tri1/tree exp/tri1/final.mdl 
fstdeterminizestar --use-log=true 
fsttablecompose exp/tri1/graph/Ha.fst data/lang_test_bg/tmp/CLG_3_1.fst 
fstminimizeencoded 
fstrmepslocal 
fstrmsymbols exp/tri1/graph/disambig_tid.int 
fstisstochastic exp/tri1/graph/HCLGa.fst 
0.000449687 -0.0175772
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri1/final.mdl exp/tri1/graph/HCLGa.fst 
steps/decode.sh --nj 5 --cmd run.pl exp/tri1/graph data/dev exp/tri1/decode_dev
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri1/graph exp/tri1/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(3,11,42) and mean=19.0
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode.sh --nj 5 --cmd run.pl exp/tri1/graph data/test exp/tri1/decode_test
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri1/graph exp/tri1/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(3,12,48) and mean=21.6
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
                 tri2 : LDA + MLLT Training & Decoding                    
============================================================================
steps/align_si.sh --nj 30 --cmd run.pl data/train data/lang exp/tri1 exp/tri1_ali
steps/align_si.sh: feature type is delta
steps/align_si.sh: aligning data in data/train using model from exp/tri1, putting alignments in exp/tri1_ali
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri1_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri1_ali/log/analyze_alignments.log
steps/align_si.sh: done aligning data.
steps/train_lda_mllt.sh --cmd run.pl --splice-opts --left-context=3 --right-context=3 2500 15000 data/train data/lang exp/tri1_ali exp/tri2
steps/train_lda_mllt.sh: Accumulating LDA statistics.
steps/train_lda_mllt.sh: Accumulating tree stats
steps/train_lda_mllt.sh: Getting questions for tree clustering.
steps/train_lda_mllt.sh: Building the tree
steps/train_lda_mllt.sh: Initializing the model
steps/train_lda_mllt.sh: Converting alignments from exp/tri1_ali to use current tree
steps/train_lda_mllt.sh: Compiling graphs of transcripts
Training pass 1
Training pass 2
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 3
Training pass 4
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 5
Training pass 6
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 7
Training pass 8
Training pass 9
Training pass 10
Aligning data
Training pass 11
Training pass 12
steps/train_lda_mllt.sh: Estimating MLLT
Training pass 13
Training pass 14
Training pass 15
Training pass 16
Training pass 17
Training pass 18
Training pass 19
Training pass 20
Aligning data
Training pass 21
Training pass 22
Training pass 23
Training pass 24
Training pass 25
Training pass 26
Training pass 27
Training pass 28
Training pass 29
Training pass 30
Aligning data
Training pass 31
Training pass 32
Training pass 33
Training pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri2
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri2/log/analyze_alignments.log
96 warnings in exp/tri2/log/init_model.log
1 warnings in exp/tri2/log/compile_questions.log
152 warnings in exp/tri2/log/update.*.log
exp/tri2: nj=30 align prob=-47.90 over 3.12h [retry=0.0%, fail=0.0%] states=2024 gauss=15024 tree-impr=5.58 lda-sum=28.50 mllt:impr,logdet=1.63,2.19
steps/train_lda_mllt.sh: Done training system with LDA+MLLT features in exp/tri2
tree-info exp/tri2/tree 
tree-info exp/tri2/tree 
make-h-transducer --disambig-syms-out=exp/tri2/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_3_1 exp/tri2/tree exp/tri2/final.mdl 
fstdeterminizestar --use-log=true 
fstrmepslocal 
fstminimizeencoded 
fstrmsymbols exp/tri2/graph/disambig_tid.int 
fsttablecompose exp/tri2/graph/Ha.fst data/lang_test_bg/tmp/CLG_3_1.fst 
fstisstochastic exp/tri2/graph/HCLGa.fst 
0.000462607 -0.0175772
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri2/final.mdl exp/tri2/graph/HCLGa.fst 
steps/decode.sh --nj 5 --cmd run.pl exp/tri2/graph data/dev exp/tri2/decode_dev
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri2/graph exp/tri2/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,8,28) and mean=13.2
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode.sh --nj 5 --cmd run.pl exp/tri2/graph data/test exp/tri2/decode_test
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri2/graph exp/tri2/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,9,32) and mean=14.3
steps/diagnostic/analyze_lats.sh: see stats in exp/tri2/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
              tri3 : LDA + MLLT + SAT Training & Decoding                 
============================================================================
steps/align_si.sh --nj 30 --cmd run.pl --use-graphs true data/train data/lang exp/tri2 exp/tri2_ali
steps/align_si.sh: feature type is lda
steps/align_si.sh: aligning data in data/train using model from exp/tri2, putting alignments in exp/tri2_ali
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri2_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri2_ali/log/analyze_alignments.log
steps/align_si.sh: done aligning data.
steps/train_sat.sh --cmd run.pl 2500 15000 data/train data/lang exp/tri2_ali exp/tri3
steps/train_sat.sh: feature type is lda
steps/train_sat.sh: obtaining initial fMLLR transforms since not present in exp/tri2_ali
steps/train_sat.sh: Accumulating tree stats
steps/train_sat.sh: Getting questions for tree clustering.
steps/train_sat.sh: Building the tree
steps/train_sat.sh: Initializing the model
steps/train_sat.sh: Converting alignments from exp/tri2_ali to use current tree
steps/train_sat.sh: Compiling graphs of transcripts
Pass 1
Pass 2
Estimating fMLLR transforms
Pass 3
Pass 4
Estimating fMLLR transforms
Pass 5
Pass 6
Estimating fMLLR transforms
Pass 7
Pass 8
Pass 9
Pass 10
Aligning data
Pass 11
Pass 12
Estimating fMLLR transforms
Pass 13
Pass 14
Pass 15
Pass 16
Pass 17
Pass 18
Pass 19
Pass 20
Aligning data
Pass 21
Pass 22
Pass 23
Pass 24
Pass 25
Pass 26
Pass 27
Pass 28
Pass 29
Pass 30
Aligning data
Pass 31
Pass 32
Pass 33
Pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri3
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri3/log/analyze_alignments.log
1 warnings in exp/tri3/log/est_alimdl.log
38 warnings in exp/tri3/log/init_model.log
1 warnings in exp/tri3/log/compile_questions.log
16 warnings in exp/tri3/log/update.*.log
steps/train_sat.sh: Likelihood evolution:
-50.1967 -49.3093 -49.107 -48.9006 -48.2007 -47.5136 -47.0927 -46.8195 -46.5561 -46.0203 -45.765 -45.4413 -45.2516 -45.1079 -44.9839 -44.8753 -44.7675 -44.6596 -44.5571 -44.3946 -44.2562 -44.168 -44.0857 -44.0061 -43.9282 -43.8526 -43.7788 -43.7062 -43.6342 -43.5389 -43.4641 -43.4386 -43.4224 -43.4104 
exp/tri3: nj=30 align prob=-47.05 over 3.12h [retry=0.0%, fail=0.0%] states=1928 gauss=15017 fmllr-impr=4.07 over 2.79h tree-impr=8.75
steps/train_sat.sh: done training SAT system in exp/tri3
tree-info exp/tri3/tree 
tree-info exp/tri3/tree 
make-h-transducer --disambig-syms-out=exp/tri3/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_3_1 exp/tri3/tree exp/tri3/final.mdl 
fsttablecompose exp/tri3/graph/Ha.fst data/lang_test_bg/tmp/CLG_3_1.fst 
fstdeterminizestar --use-log=true 
fstrmsymbols exp/tri3/graph/disambig_tid.int 
fstminimizeencoded 
fstrmepslocal 
fstisstochastic exp/tri3/graph/HCLGa.fst 
0.000461769 -0.0175772
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri3/final.mdl exp/tri3/graph/HCLGa.fst 
steps/decode_fmllr.sh --nj 5 --cmd run.pl exp/tri3/graph data/dev exp/tri3/decode_dev
steps/decode.sh --scoring-opts  --num-threads 1 --skip-scoring false --acwt 0.083333 --nj 5 --cmd run.pl --beam 10.0 --model exp/tri3/final.alimdl --max-active 2000 exp/tri3/graph data/dev exp/tri3/decode_dev.si
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_dev.si
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev.si/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,9,33) and mean=15.3
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev.si/log/analyze_lattice_depth_stats.log
steps/decode_fmllr.sh: feature type is lda
steps/decode_fmllr.sh: getting first-pass fMLLR transforms.
steps/decode_fmllr.sh: doing main lattice generation phase
steps/decode_fmllr.sh: estimating fMLLR transforms a second time.
steps/decode_fmllr.sh: doing a final pass of acoustic rescoring.
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,5,16) and mean=7.6
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_dev/log/analyze_lattice_depth_stats.log
steps/decode_fmllr.sh --nj 5 --cmd run.pl exp/tri3/graph data/test exp/tri3/decode_test
steps/decode.sh --scoring-opts  --num-threads 1 --skip-scoring false --acwt 0.083333 --nj 5 --cmd run.pl --beam 10.0 --model exp/tri3/final.alimdl --max-active 2000 exp/tri3/graph data/test exp/tri3/decode_test.si
decode.sh: feature type is lda
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_test.si
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test.si/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(2,10,35) and mean=16.2
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test.si/log/analyze_lattice_depth_stats.log
steps/decode_fmllr.sh: feature type is lda
steps/decode_fmllr.sh: getting first-pass fMLLR transforms.
steps/decode_fmllr.sh: doing main lattice generation phase
steps/decode_fmllr.sh: estimating fMLLR transforms a second time.
steps/decode_fmllr.sh: doing a final pass of acoustic rescoring.
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri3/graph exp/tri3/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,5,18) and mean=8.6
steps/diagnostic/analyze_lats.sh: see stats in exp/tri3/decode_test/log/analyze_lattice_depth_stats.log
============================================================================
                        SGMM2 Training & Decoding                         
============================================================================
steps/align_fmllr.sh --nj 30 --cmd run.pl data/train data/lang exp/tri3 exp/tri3_ali
steps/align_fmllr.sh: feature type is lda
steps/align_fmllr.sh: compiling training graphs
steps/align_fmllr.sh: aligning data in data/train using exp/tri3/final.alimdl and speaker-independent features.
steps/align_fmllr.sh: computing fMLLR transforms
steps/align_fmllr.sh: doing final alignment.
steps/align_fmllr.sh: done aligning data.
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri3_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri3_ali/log/analyze_alignments.log

4.参考文档

此篇博客的顺利完成,首先要感谢以下几篇博客对于我的帮助,没有以下几篇博客的帮助,也很难得以成功。再次感谢以下几位大佬提供的帮助。
Kaldi在语音数据库timit上的声学和语音模型训练–1
Ubuntu16.04用kaldi跑timit数据集

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章