I needed MFCC features from the TIMIT corpus for a speech-recognition network, so I worked through the recipe up to the MFCC stage; I'll post the later results as I go.
I. The TIMIT dataset
This corpus is not open source, so I can't redistribute it; you will need to obtain a copy yourself.
My copy sits at: /path/to/TIMIT/
Its directory structure:
$ ls
DOC README.DOC TEST TRAIN
$ cd TRAIN/
$ ls
DR1 DR2 DR3 DR4 DR5 DR6 DR7 DR8
$ cd DR1
$ ls  # each folder below holds one speaker's .WAV recordings plus their label files
FCJF0 FETB0 FSAH0 FVFB0 MDPK0 MJWT0 MMRP0 MRAI0 MRWS0 MWAD0
FDAW0 FJSP0 FSJK1 FVMH0 MEDR0 MKLS0 MPGH0 MRCG0 MTJS0 MWAR0
FDML0 FKFB0 FSMA0 MCPM0 MGRL0 MKLW0 MPGR0 MRDD0 MTPF0
FECD0 FMEM0 FTBR0 MDAC0 MJEB1 MMGG0 MPSW0 MRSO0 MTRR0
e.g.:
TIMIT/TRAIN/DR1/FCJF0$ ls
SA1.PHN SA2.WAV SI1657.PHN SI648.WAV SX217.PHN SX307.WAV SX397.PHN
SA1.TXT SA2.WRD SI1657.TXT SI648.WRD SX217.TXT SX307.WRD SX397.TXT
SA1.WAV SI1027.PHN SI1657.WAV SX127.PHN SX217.WAV SX37.PHN SX397.WAV
SA1.WRD SI1027.TXT SI1657.WRD SX127.TXT SX217.WRD SX37.TXT SX397.WRD
SA2.PHN SI1027.WAV SI648.PHN SX127.WAV SX307.PHN SX37.WAV
SA2.TXT SI1027.WRD SI648.TXT SX127.WRD SX307.TXT SX37.WRD
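For reference, each .PHN file lists one phone per line as start-sample, end-sample, label, and TIMIT audio is sampled at 16 kHz. A minimal sketch of converting boundaries to seconds (the three lines below are fabricated, since the corpus itself can't be shared here):

```shell
# Toy .PHN content in TIMIT's format: <start_sample> <end_sample> <phone>
tmpdir=$(mktemp -d)
cat > "$tmpdir/SA1.PHN" <<'EOF'
0 3050 h#
3050 4559 sh
4559 5723 ix
EOF
# TIMIT audio is 16 kHz, so divide sample indices by 16000 to get seconds
awk '{printf "%.4f %.4f %s\n", $1/16000, $2/16000, $3}' "$tmpdir/SA1.PHN"
# first line printed: 0.0000 0.1906 h#
rm -r "$tmpdir"
```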
$ cd ../../DOC
$ ls
PHONCODE.DOC SPKRINFO.TXT TESTSET.DOC TIMITDIC.TXT
PROMPTS.TXT SPKRSENT.TXT TIMITDIC.DOC
II. Installing the Kaldi toolkit
See this blog post for installation instructions; they worked for me as written, so I won't repeat them here.
Once the small yesno example runs successfully, your installation is fine and you can move on.
III. Running the TIMIT recipe
1. Unpack the downloaded corpus and put the TIMIT folder under kaldi-trunk/egs/timit. I named the enclosing folder data, so the full path is kaldi-trunk/egs/timit/data/TIMIT.
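This step amounts to the commands below (simulated in temp dirs so the sketch is runnable; substitute your real paths). A symlink works just as well as a copy if you'd rather not duplicate the corpus:

```shell
# Simulated stand-ins for kaldi-trunk and the unpacked TIMIT corpus
root=$(mktemp -d)
corpus=$(mktemp -d); mkdir "$corpus/TRAIN" "$corpus/TEST"
# Create the recipe's data dir and link the corpus into it
mkdir -p "$root/egs/timit/data"
ln -s "$corpus" "$root/egs/timit/data/TIMIT"
ls "$root/egs/timit/data/TIMIT"   # lists TEST and TRAIN
rm -r "$root" "$corpus"
```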
2. Edit the path in run.sh (under timit/s5) so that it points at your own copy of the corpus:
echo ============================================================================
echo " Data & Lexicon & Language Preparation "
echo ============================================================================
#timit=/export/corpora5/LDC/LDC93S1/timit/TIMIT # @JHU
#timit=/mnt/matylda2/data/TIMIT/timit # @BUT
timit=/path/to/kaldi-trunk/egs/timit/data/TIMIT
3. Set the run environment in timit/s5/cmd.sh: uncomment option (c), "run locally", and change it to:
#c) run locally...
export train_cmd=run.pl
export decode_cmd=run.pl
export cuda_cmd=run.pl
export mkgraph_cmd=run.pl
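A quick sanity check that the edit took effect is to source cmd.sh and inspect the variables (simulated below with a throwaway copy of the file; in the real recipe you would source s5/cmd.sh):

```shell
# Write a stand-in for the edited cmd.sh, source it, and verify the exports
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
export train_cmd=run.pl
export decode_cmd=run.pl
export cuda_cmd=run.pl
export mkgraph_cmd=run.pl
EOF
. "$tmp"
echo "$train_cmd $decode_cmd $cuda_cmd $mkgraph_cmd"   # run.pl run.pl run.pl run.pl
rm "$tmp"
```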
4. Start the run from s5/. (From here on, if you hit an error like "xxx.sh: No such file or directory", just search for that script under the egs folder and copy it to the path where it is expected, usually s5/local.)
First, simply run ./run.sh.
If it finishes without errors, congratulations: you can skip everything below.
If it fails, don't worry; I debugged my way through it step by step too. (When I got to the MFCC stage I realized my earlier failures were probably because I hadn't fixed cmd.sh, which made me want to cry. I hope yours works on the first try.)
①
$ sudo local/timit_data_prep.sh /path/to/kaldi-trunk/egs/timit/data/TIMIT
wav-to-duration scp:train_wav.scp ark,t:train_dur.ark
LOG (wav-to-duration:main():wav-to-duration.cc:68) Printed duration for 3696 audio files.
LOG (wav-to-duration:main():wav-to-duration.cc:70) Mean duration was 3.06336, min and max durations were 0.91525, 7.78881
wav-to-duration scp:dev_wav.scp ark,t:dev_dur.ark
LOG (wav-to-duration:main():wav-to-duration.cc:68) Printed duration for 400 audio files.
LOG (wav-to-duration:main():wav-to-duration.cc:70) Mean duration was 3.08212, min and max durations were 1.09444, 7.43681
wav-to-duration scp:test_wav.scp ark,t:test_dur.ark
LOG (wav-to-duration:main():wav-to-duration.cc:68) Printed duration for 192 audio files.
LOG (wav-to-duration:main():wav-to-duration.cc:70) Mean duration was 3.03646, min and max durations were 1.30562, 6.21444
Data preparation succeeded
$ local/timit_train_lms.sh data/local
Transcript file data/local/train_trans.txt not found. Did you run local/timit_data_prep.sh
This is because my timit_data_prep.sh run saved the transcript under a different name and location than timit_train_lms.sh expects; a cp command sorts it out:
$ sudo cp train_trans.txt ../train_trans.txt
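More generally, the fix is just "copy the file to the name and place the next script looks for". A defensive version of the same idea (simulated in a temp dir; the real files live under s5/data/local):

```shell
# Simulated layout: the transcript exists one level down from where it's wanted
d=$(mktemp -d); mkdir "$d/data"
echo "faks0_sa1 sil sh ix" > "$d/data/train_trans.txt"   # fabricated content
# Copy up one level only if the expected file is missing, mirroring the cp above
[ -f "$d/train_trans.txt" ] || cp "$d/data/train_trans.txt" "$d/train_trans.txt"
ls "$d"   # now both data/ and train_trans.txt are present
rm -r "$d"
```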
②$ sudo local/timit_train_lms.sh data/local
local/timit_train_lms.sh: line 95: local/create_biphone_lm.sh: No such file or directory
As described above: search for the script under the egs folder, copy it into the required path, and rerun.
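The search-and-copy routine can itself be scripted with find (demonstrated on a fabricated mini egs tree; with a real install you would run the find from kaldi-trunk/egs):

```shell
# Fabricated mini tree standing in for kaldi-trunk/egs
egs=$(mktemp -d)
mkdir -p "$egs/other_recipe/s5/local" "$egs/timit/s5/local"
touch "$egs/other_recipe/s5/local/create_biphone_lm.sh"
# Locate the missing script anywhere under egs and copy it where timit expects it
src=$(find "$egs" -name create_biphone_lm.sh | head -n 1)
cp "$src" "$egs/timit/s5/local/"
ls "$egs/timit/s5/local"   # create_biphone_lm.sh
rm -r "$egs"
```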
$ sudo local/timit_train_lms.sh data/local
Not installing the kaldi_lm toolkit since it is already there.
Creating phones file, and monophone lexicon (mapping phones to itself).
local/timit_train_lms.sh: line 81: local/get_word_map.pl: No such file or directory
Creating biphone model
Training biphone language model in folder data/local/lm
Creating directory data/local/lm/biphone
Getting raw N-gram counts
get_raw_ngrams: get_raw_ngrams.cc:29: void process_line(char*, int): Assertion `!isspace(*line)' failed.
get_raw_ngrams: get_raw_ngrams.cc:29: void process_line(char*, int): Assertion `!isspace(*line)' failed.
Iteration 1/7 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.675000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.675000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.800000, tau=0.825000 phi=2.000000
Error: empty wordlist (from data/local/lm/wordlist.mapped)
Error: empty wordlist (from data/local/lm/wordlist.mapped)
Perplexity over 0.000000 words is -nan
Perplexity over 0.000000 words (excluding 0.000000 OOVs) is -nan
real 0m0.004s
user 0m0.006s
sys 0m0.000s
discount_ngrams: for n-gram order 1, D=0.400000, tau=0.900000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.900000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.800000, tau=1.100000 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.400000, tau=1.215000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=1.215000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.800000, tau=1.485000 phi=2.000000
Error: empty wordlist (from data/local/lm/wordlist.mapped)
Perplexity over 0.000000 words is -nan
Perplexity over 0.000000 words (excluding 0.000000 OOVs) is -nan
real 0m0.005s
user 0m0.005s
sys 0m0.001s
Perplexity over 0.000000 words is -nan
Perplexity over 0.000000 words (excluding 0.000000 OOVs) is -nan
real 0m0.003s
user 0m0.006s
sys 0m0.000s
Bad perplexities . at /home/chutz/kalid-trunk/egs/timit/s5/../../../tools/kaldi_lm/optimize_alpha.pl line 30.
To fix this, delete ngrams_disc.gz and ngrams.gz under s5/data/local/lm/biphone and rerun. (I'll omit the intermediate output; if all 7 iterations finish without errors, it worked!)
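The cleanup amounts to the following (simulated in a temp dir; the real directory is s5/data/local/lm/biphone, and the other files there are left alone):

```shell
# Simulated stale outputs from the failed run, plus an unrelated file
lm=$(mktemp -d)
touch "$lm/ngrams.gz" "$lm/ngrams_disc.gz" "$lm/other_file"
# Remove only the stale n-gram counts so timit_train_lms.sh regenerates them
rm -f "$lm/ngrams.gz" "$lm/ngrams_disc.gz"
ls "$lm"   # only other_file remains
rm -r "$lm"
```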
$ sudo local/timit_train_lms.sh data/local
Not installing the kaldi_lm toolkit since it is already there.
Creating phones file, and monophone lexicon (mapping phones to itself).
Creating biphone model
Training biphone language model in folder data/local/lm
Creating directory data/local/lm/biphone
Getting raw N-gram counts
Iteration 1/7 of optimizing discounting parameters
…………………………
Iteration 7/7 of optimizing discounting parameters
…………………………
Perplexity over 11412.000000 words (excluding 600.000000 OOVs) is 16.298151
15.283423
③
$ sudo local/timit_prepare_dict.sh
/home/chutz/kalid-trunk/egs/timit/s5/../../../tools/irstlm/bin//build-lm.sh
Temporary directory stat_4824 does not exist
creating stat_4824
Extracting dictionary from training corpus
Splitting dictionary into 3 lists
Extracting n-gram statistics for each word list
Important: dictionary must be ordered according to order of appearance of words in data
used to generate n-gram blocks, so that sub language model blocks results ordered too
dict.000
dict.001
dict.002
$bin/ngt -i="$inpfile" -n=$order -gooout=y -o="$gzip -c > $tmpdir/ngram.${sdict}.gz" -fd="$tmpdir/$sdict" $dictionary -iknstat="$tmpdir/ikn.stat.$sdict" >> $logfile 2>&1
$bin/ngt -i="$inpfile" -n=$order -gooout=y -o="$gzip -c > $tmpdir/ngram.${sdict}.gz" -fd="$tmpdir/$sdict" $dictionary -iknstat="$tmpdir/ikn.stat.$sdict" >> $logfile 2>&1
Estimating language models for each word list
dict.000
dict.001
dict.002
$scr/build-sublm.pl $verbose $prune $smoothing --size $order --ngrams "$gunzip -c $tmpdir/ngram.${sdict}.gz" -sublm $tmpdir/lm.$sdict >> $logfile 2>&1
Merging language models into data/local/lm_tmp/lm_phone_bg.ilm.gz
Cleaning temporary directory stat_4824
Removing temporary directory stat_4824
inpfile: data/local/lm_tmp/lm_phone_bg.ilm.gz
outfile: /dev/stdout
loading up to the LM level 1000 (if any)
dub: 10000000
Language Model Type of data/local/lm_tmp/lm_phone_bg.ilm.gz is 1
Language Model Type is 1
iARPA
loadtxt_ram()
1-grams: reading 51 entries
done level 1
2-grams: reading 1694 entries
done level 2
done
OOV code is 50
OOV code is 50
OOV code is 50
Saving in txt format to /dev/stdout
savetxt: /dev/stdout
save: 51 1-grams
save: 1694 2-grams
done
Dictionary & language model preparation succeeded
④
$ sudo utils/prepare_lang.sh --sil-prob 0.0 --position-dependent-phones false --num-sil-states 3 \
> data/local/dict "sil" data/local/lang_tmp data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> data/local/dict/silence_phones.txt is OK
Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> data/local/dict/optional_silence.txt is OK
Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> data/local/dict/nonsilence_phones.txt is OK
Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.
Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> data/local/dict/lexicon.txt is OK
Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]
**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops 'echo 49 |' 'echo 49 |'
prepare_lang.sh: validating output directory
Checking data/lang/phones.txt ...
--> data/lang/phones.txt is OK
Checking words.txt: #0 ...
--> data/lang/words.txt has "#0"
--> data/lang/words.txt is OK
Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK
Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> summation property is OK
Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> 1 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK
Checking data/lang/phones/disambig.{txt, int, csl} ...
--> 2 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK
Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> 47 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK
Checking data/lang/phones/silence.{txt, int, csl} ...
--> 1 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK
Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK
Checking data/lang/phones/roots.{txt, int} ...
--> 48 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK
Checking data/lang/phones/sets.{txt, int} ...
--> 48 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK
Checking data/lang/phones/extra_questions.{txt, int} ...
--> 2 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK
Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK
Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK
Checking topo ...
--> data/lang/topo's nonsilence section is OK
--> data/lang/topo's silence section is OK
--> data/lang/topo is OK
Checking data/lang/oov.{txt, int} ...
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK
--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
⑤$ sudo local/timit_format_data.sh
Preparing train, dev and test data
utils/validate_data_dir.sh: Successfully validated data-directory data/train
utils/validate_data_dir.sh: Successfully validated data-directory data/dev
utils/validate_data_dir.sh: Successfully validated data-directory data/test
Preparing language models for test
arpa2fst -
Processing 1-grams
Processing 2-grams
Connected 0 states without outgoing arcs.
fstisstochastic data/lang_test_bg/G.fst
0.0003667 -0.0763019
Checking data/lang_test_bg/phones.txt ...
--> data/lang_test_bg/phones.txt is OK
Checking words.txt: #0 ...
--> data/lang_test_bg/words.txt has "#0"
--> data/lang_test_bg/words.txt is OK
Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK
Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> summation property is OK
Checking data/lang_test_bg/phones/context_indep.{txt, int, csl} ...
--> 1 entry/entries in data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.int corresponds to data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.csl corresponds to data/lang_test_bg/phones/context_indep.txt
--> data/lang_test_bg/phones/context_indep.{txt, int, csl} are OK
Checking data/lang_test_bg/phones/disambig.{txt, int, csl} ...
--> 2 entry/entries in data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.int corresponds to data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.csl corresponds to data/lang_test_bg/phones/disambig.txt
--> data/lang_test_bg/phones/disambig.{txt, int, csl} are OK
Checking data/lang_test_bg/phones/nonsilence.{txt, int, csl} ...
--> 47 entry/entries in data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.int corresponds to data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.csl corresponds to data/lang_test_bg/phones/nonsilence.txt
--> data/lang_test_bg/phones/nonsilence.{txt, int, csl} are OK
Checking data/lang_test_bg/phones/silence.{txt, int, csl} ...
--> 1 entry/entries in data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.int corresponds to data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.csl corresponds to data/lang_test_bg/phones/silence.txt
--> data/lang_test_bg/phones/silence.{txt, int, csl} are OK
Checking data/lang_test_bg/phones/optional_silence.{txt, int, csl} ...
--> 1 entry/entries in data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.int corresponds to data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.csl corresponds to data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.{txt, int, csl} are OK
Checking data/lang_test_bg/phones/roots.{txt, int} ...
--> 48 entry/entries in data/lang_test_bg/phones/roots.txt
--> data/lang_test_bg/phones/roots.int corresponds to data/lang_test_bg/phones/roots.txt
--> data/lang_test_bg/phones/roots.{txt, int} are OK
Checking data/lang_test_bg/phones/sets.{txt, int} ...
--> 48 entry/entries in data/lang_test_bg/phones/sets.txt
--> data/lang_test_bg/phones/sets.int corresponds to data/lang_test_bg/phones/sets.txt
--> data/lang_test_bg/phones/sets.{txt, int} are OK
Checking data/lang_test_bg/phones/extra_questions.{txt, int} ...
--> 2 entry/entries in data/lang_test_bg/phones/extra_questions.txt
--> data/lang_test_bg/phones/extra_questions.int corresponds to data/lang_test_bg/phones/extra_questions.txt
--> data/lang_test_bg/phones/extra_questions.{txt, int} are OK
Checking optional_silence.txt ...
--> reading data/lang_test_bg/phones/optional_silence.txt
--> data/lang_test_bg/phones/optional_silence.txt is OK
Checking disambiguation symbols: #0 and #1
--> data/lang_test_bg/phones/disambig.txt has "#0" and "#1"
--> data/lang_test_bg/phones/disambig.txt is OK
Checking topo ...
--> data/lang_test_bg/topo's nonsilence section is OK
--> data/lang_test_bg/topo's silence section is OK
--> data/lang_test_bg/topo is OK
Checking data/lang_test_bg/oov.{txt, int} ...
--> 1 entry/entries in data/lang_test_bg/oov.txt
--> data/lang_test_bg/oov.int corresponds to data/lang_test_bg/oov.txt
--> data/lang_test_bg/oov.{txt, int} are OK
--> data/lang_test_bg/L.fst is olabel sorted
--> data/lang_test_bg/L_disambig.fst is olabel sorted
--> data/lang_test_bg/G.fst is ilabel sorted
--> data/lang_test_bg/G.fst has 50 states
fstdeterminizestar data/lang_test_bg/G.fst /dev/null
--> data/lang_test_bg/G.fst is determinizable
--> G.fst did not contain cycles with only disambig symbols or epsilon on the input, and did not contain
the forbidden symbols <s> or </s> (if present in vocab) on the input or output.
--> Testing determinizability of L_disambig . G
fstdeterminizestar
fsttablecompose data/lang_test_bg/L_disambig.fst data/lang_test_bg/G.fst
--> L_disambig . G is determinizable
--> SUCCESS [validating lang directory data/lang_test_bg]
Succeeded in formatting data.
Once all five steps above succeed, run ./run.sh again; this time it should go through.
My output:
============================================================================
MFCC Feature Extration & CMVN for Training and Test set
============================================================================
steps/make_mfcc.sh --cmd run.pl --nj 10 data/train exp/make_mfcc/train mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC features for train
steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train mfcc
Succeeded creating CMVN stats for train
steps/make_mfcc.sh --cmd run.pl --nj 10 data/dev exp/make_mfcc/dev mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/dev
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC features for dev
steps/compute_cmvn_stats.sh data/dev exp/make_mfcc/dev mfcc
Succeeded creating CMVN stats for dev
steps/make_mfcc.sh --cmd run.pl --nj 10 data/test exp/make_mfcc/test mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/test
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC features for test
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test
============================================================================
MonoPhone Training & Decoding
============================================================================
steps/train_mono.sh --nj 30 --cmd run.pl data/train data/lang exp/mono
steps/train_mono.sh: Initializing monophone system.
steps/train_mono.sh: Compiling training graphs
steps/train_mono.sh: Aligning data equally (pass 0)
steps/train_mono.sh: Pass 1
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 2
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 3
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 4
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 5
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 6
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 7
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 8
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 9
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 10
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 11
steps/train_mono.sh: Pass 12
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 13
steps/train_mono.sh: Pass 14
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 15
steps/train_mono.sh: Pass 16
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 17
steps/train_mono.sh: Pass 18
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 19
steps/train_mono.sh: Pass 20
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 21
steps/train_mono.sh: Pass 22
steps/train_mono.sh: Pass 23
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 24
steps/train_mono.sh: Pass 25
steps/train_mono.sh: Pass 26
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 27
steps/train_mono.sh: Pass 28
steps/train_mono.sh: Pass 29
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 30
steps/train_mono.sh: Pass 31
steps/train_mono.sh: Pass 32
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 33
steps/train_mono.sh: Pass 34
steps/train_mono.sh: Pass 35
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 36
steps/train_mono.sh: Pass 37
steps/train_mono.sh: Pass 38
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 39
2 warnings in exp/mono/log/align.*.*.log
Done
fstminimizeencoded
fsttablecompose data/lang_test_bg/L_disambig.fst data/lang_test_bg/G.fst
fstdeterminizestar --use-log=true
fstisstochastic data/lang_test_bg/tmp/LG.fst
0.000361025 -0.0763603
[info]: LG not stochastic.
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang_test_bg/phones/disambig.int --write-disambig-syms=data/lang_test_bg/tmp/disambig_ilabels_1_0.int data/lang_test_bg/tmp/ilabels_1_0
fstisstochastic data/lang_test_bg/tmp/CLG_1_0.fst
0.000360913 -0.0763603
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl
fstrmsymbols exp/mono/graph/disambig_tid.int
fsttablecompose exp/mono/graph/Ha.fst data/lang_test_bg/tmp/CLG_1_0.fst
fstdeterminizestar --use-log=true
fstrmepslocal
fstminimizeencoded
fstisstochastic exp/mono/graph/HCLGa.fst
0.00039086 -0.0758928
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl
steps/decode.sh --nj 5 --cmd run.pl exp/mono/graph data/dev exp/mono/decode_dev
decode.sh: feature type is delta
steps/decode.sh --nj 5 --cmd run.pl exp/mono/graph data/test exp/mono/decode_test
decode.sh: feature type is delta
============================================================================
tri1 : Deltas + Delta-Deltas Training & Decoding
============================================================================
steps/align_si.sh --boost-silence 1.25 --nj 30 --cmd run.pl data/train data/lang exp/mono exp/mono_ali
steps/align_si.sh: feature type is delta
steps/align_si.sh: aligning data in data/train using model from exp/mono, putting alignments in exp/mono_ali
steps/align_si.sh: done aligning data.
steps/train_deltas.sh --cmd run.pl 2500 15000 data/train data/lang exp/mono_ali exp/tri1
steps/train_deltas.sh: accumulating tree stats
steps/train_deltas.sh: getting questions for tree-building, via clustering
steps/train_deltas.sh: building the tree
steps/train_deltas.sh: converting alignments from exp/mono_ali to use current tree
steps/train_deltas.sh: compiling graphs of transcripts
steps/train_deltas.sh: training pass 1
steps/train_deltas.sh: training pass 2
steps/train_deltas.sh: training pass 3
steps/train_deltas.sh: training pass 4
steps/train_deltas.sh: training pass 5
steps/train_deltas.sh: training pass 6
steps/train_deltas.sh: training pass 7
steps/train_deltas.sh: training pass 8
steps/train_deltas.sh: training pass 9
steps/train_deltas.sh: training pass 10
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 11
steps/train_deltas.sh: training pass 12
steps/train_deltas.sh: training pass 13
steps/train_deltas.sh: training pass 14
steps/train_deltas.sh: training pass 15
steps/train_deltas.sh: training pass 16
steps/train_deltas.sh: training pass 17
steps/train_deltas.sh: training pass 18
steps/train_deltas.sh: training pass 19
steps/train_deltas.sh: training pass 20
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 21
steps/train_deltas.sh: training pass 22
steps/train_deltas.sh: training pass 23
steps/train_deltas.sh: training pass 24
steps/train_deltas.sh: training pass 25
steps/train_deltas.sh: training pass 26
steps/train_deltas.sh: training pass 27
steps/train_deltas.sh: training pass 28
steps/train_deltas.sh: training pass 29
steps/train_deltas.sh: training pass 30
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 31
steps/train_deltas.sh: training pass 32
steps/train_deltas.sh: training pass 33
steps/train_deltas.sh: training pass 34
1 warnings in exp/tri1/log/compile_questions.log
69 warnings in exp/tri1/log/init_model.log
43 warnings in exp/tri1/log/update.*.log
steps/train_deltas.sh: Done training system with delta+delta-delta features in exp/tri1
fstcomposecontext --context-size=3 --central-position=1 --read-disambig-syms=data/lang_test_bg/phones/disambig.int --write-disambig-syms=data/lang_test_bg/tmp/disambig_ilabels_3_1.int data/lang_test_bg/tmp/ilabels_3_1
fstisstochastic data/lang_test_bg/tmp/CLG_3_1.fst
0.000361405 -0.0763602
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/tri1/graph/disambig_tid.int --transition-scale=1.0 data/lang_test_bg/tmp/ilabels_3_1 exp/tri1/tree exp/tri1/final.mdl
fstrmsymbols exp/tri1/graph/disambig_tid.int
fsttablecompose exp/tri1/graph/Ha.fst data/lang_test_bg/tmp/CLG_3_1.fst
fstrmepslocal
fstdeterminizestar --use-log=true
fstminimizeencoded
fstisstochastic exp/tri1/graph/HCLGa.fst
0.000847995 -0.0761719
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri1/final.mdl
steps/decode.sh --nj 5 --cmd run.pl exp/tri1/graph data/dev exp/tri1/decode_dev
decode.sh: feature type is delta
steps/decode.sh --nj 5 --cmd run.pl exp/tri1/graph data/test exp/tri1/decode_test
decode.sh: feature type is delta
For an explanation of what each part of run.sh does, see this blog post.