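I validated an end-to-end speech recognition system, and its CER (character error rate) is reasonably good so far. The next question is how to use it in practice:
1) Server side: use Kaldi's streaming (online) processing approach (http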
1. kaldi-gstreamer-server
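The GitHub repository is https://github.com/alumae/kaldi-gstreamer-server, which includes detailed installation instructions. The installation breaks down into three major steps:
1) Build and install Kaldi, then install GStreamer and libjansson-dev
2) Install and build gst-kaldi-nnet2-online: git clone https://github.com/alumae/gst-kaldi-nnet2-online.git
3) Install the worker, which is based on the kaldinnet2onlinedecoder decoder: git clone https://github.com/alumae/kaldi-gstreamer-server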
2. master_server.py
3. worker.py
4. client.py
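As a rough illustration of what a client has to handle, the sketch below parses one JSON message of the kind the server sends back over the websocket. The field names (`status`, `result`, `hypotheses`, `transcript`, `final`) are taken from my reading of the project's README and should be checked against the version you deploy; the helper name `parse_result` and the sample message are hypothetical.

```python
import json


def parse_result(message):
    """Return (transcript, is_final) for one server message, or None.

    Assumes the response layout described in the kaldi-gstreamer-server
    README: status 0 means success, and result.hypotheses is a list whose
    first entry is the best hypothesis.
    """
    response = json.loads(message)
    if response.get("status") != 0:
        # Non-zero status: no usable recognition result in this message.
        return None
    result = response.get("result")
    if not result or not result.get("hypotheses"):
        return None
    best = result["hypotheses"][0]
    return best.get("transcript", ""), bool(result.get("final", False))


# Hypothetical message, shaped like the README's examples.
sample = (
    '{"status": 0, "result": '
    '{"hypotheses": [{"transcript": "hello world"}], "final": true}}'
)
print(parse_result(sample))
```

A streaming client would typically display non-final transcripts as they arrive and commit the text only once `final` is true.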
RNN structure
Code: python3.6/site-packages/torch/nn/modules/rnn.py
It defines eight classes:
Class | Description |
---|---|
class RNNBase(Module) | Base class shared by RNN, LSTM, and GRU |
class RNN(RNNBase) | Applies a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence; with tanh, each step computes $h_t = \tanh(w_{ih} x_t + b_{ih} + w_{hh} h_{(t-1)} + b_{hh})$ |
class LSTM(RNNBase) | Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence |
class GRU(RNNBase) | Applies a multi-layer gated recurrent unit (GRU) RNN to an input sequence |
class RNNCellBase(Module) | Base class shared by RNNCell, LSTMCell, and GRUCell |
class RNNCell(RNNCellBase) | An Elman RNN cell with tanh or ReLU non-linearity |
class LSTMCell(RNNCellBase) | A long short-term memory (LSTM) cell |
class GRUCell(RNNCellBase) | A gated recurrent unit (GRU) cell |
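
To make the Elman update from the table concrete, here is a minimal pure-Python sketch of h_t = tanh(w_ih x_t + b_ih + w_hh h_(t-1) + b_hh). It is not PyTorch's implementation: the function names `elman_step` and `elman_forward` are made up for illustration, and plain lists stand in for tensors. Roughly, `elman_step` plays the role of a single cell step (as in RNNCell), while `elman_forward` loops over the sequence the way a single-layer RNN does.

```python
import math


def elman_step(x_t, h_prev, w_ih, w_hh, b_ih, b_hh):
    """One time step: h_t = tanh(w_ih x_t + b_ih + w_hh h_prev + b_hh).

    w_ih has shape (hidden, input) and w_hh has shape (hidden, hidden),
    both stored as lists of rows.
    """
    h_t = []
    for i in range(len(h_prev)):
        pre = b_ih[i] + b_hh[i]
        pre += sum(w * x for w, x in zip(w_ih[i], x_t))
        pre += sum(w * h for w, h in zip(w_hh[i], h_prev))
        h_t.append(math.tanh(pre))
    return h_t


def elman_forward(inputs, h0, w_ih, w_hh, b_ih, b_hh):
    """Run the step over a sequence; return all hidden states and the last."""
    h = h0
    outputs = []
    for x_t in inputs:
        h = elman_step(x_t, h, w_ih, w_hh, b_ih, b_hh)
        outputs.append(h)
    return outputs, h


# Toy example: input size 1, hidden size 1, zero biases.
outputs, h_last = elman_forward(
    inputs=[[0.5], [0.25]],
    h0=[0.0],
    w_ih=[[1.0]],
    w_hh=[[1.0]],
    b_ih=[0.0],
    b_hh=[0.0],
)
print(outputs, h_last)
```

The second hidden state depends on the first through `w_hh`, which is the recurrence that separates these layers from a plain feed-forward layer.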