Building an End-to-End ASR System

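I have validated an end-to-end speech recognition system, and its CER results so far are fairly good. The next question is how to put it to practical use:
1) Server side: use Kaldi's streaming processing approach (http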

1. kaldi-gstreamer-server

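The GitHub repository is https://github.com/alumae/kaldi-gstreamer-server, which includes detailed installation instructions. There are three main steps:
1) Build and install Kaldi, then install GStreamer and libjansson-dev.
2) Install and build gst-kaldi-nnet2-online: git clone https://github.com/alumae/gst-kaldi-nnet2-online.git
3) Install the worker, which is based on the kaldinnet2onlinedecoder decoder: git clone https://github.com/alumae/kaldi-gstreamer-server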

2. master_server.py

3. worker.py

4. client.py

RNN structure
Code: python3.6/site-packages/torch/nn/modules/rnn.py
The file defines eight classes:

Class                          Description
class RNNBase(Module)          Base class for RNN, LSTM, and GRU
class RNN(RNNBase)             Applies a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence; each layer computes h_t = tanh(w_{ih} x_t + b_{ih} + w_{hh} h_{(t-1)} + b_{hh})
class LSTM(RNNBase)            Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence
class GRU(RNNBase)             Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence
class RNNCellBase(Module)      Base class for RNNCell, LSTMCell, and GRUCell
class RNNCell(RNNCellBase)     An Elman RNN cell with tanh or ReLU non-linearity
class LSTMCell(RNNCellBase)    A long short-term memory (LSTM) cell
class GRUCell(RNNCellBase)     A gated recurrent unit (GRU) cell
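
To make the Elman update h_t = tanh(w_{ih} x_t + b_{ih} + w_{hh} h_{(t-1)} + b_{hh}) concrete, here is a minimal pure-Python sketch of a single step and of a loop over a sequence. It is not the PyTorch implementation: the function names elman_step and elman_rnn are hypothetical, weights are plain nested lists, and only a single layer with the tanh non-linearity is covered.

```python
import math


def elman_step(x_t, h_prev, w_ih, w_hh, b_ih, b_hh):
    """One Elman step: h_t = tanh(w_ih x_t + b_ih + w_hh h_prev + b_hh).

    x_t:    input vector, length input_size
    h_prev: previous hidden state, length hidden_size
    w_ih:   hidden_size x input_size matrix (list of rows)
    w_hh:   hidden_size x hidden_size matrix (list of rows)
    b_ih, b_hh: bias vectors, length hidden_size
    """
    h_t = []
    for i in range(len(b_ih)):
        # Input-to-hidden contribution plus its bias.
        pre = sum(w * x for w, x in zip(w_ih[i], x_t)) + b_ih[i]
        # Hidden-to-hidden contribution plus its bias.
        pre += sum(w * h for w, h in zip(w_hh[i], h_prev)) + b_hh[i]
        h_t.append(math.tanh(pre))
    return h_t


def elman_rnn(inputs, h0, w_ih, w_hh, b_ih, b_hh):
    """Apply elman_step over a sequence and return every hidden state."""
    outputs = []
    h = h0
    for x_t in inputs:
        h = elman_step(x_t, h, w_ih, w_hh, b_ih, b_hh)
        outputs.append(h)
    return outputs


if __name__ == "__main__":
    # Toy example: input_size=2, hidden_size=2, sequence length 3.
    w_ih = [[0.5, -0.5], [0.25, 0.75]]
    w_hh = [[0.1, 0.0], [0.0, 0.1]]
    b_ih = [0.0, 0.0]
    b_hh = [0.0, 0.0]
    sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for t, h in enumerate(elman_rnn(sequence, [0.0, 0.0], w_ih, w_hh, b_ih, b_hh)):
        print(t, h)
```

The single-step function plays the role of a cell class (one time step), while the loop plays the role of the sequence-level class; the four parameters mirror the w_{ih}, w_{hh}, b_{ih}, b_{hh} terms in the formula.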