KWS Approaches by Category

Original: [email protected]
Date: 2020/05/19

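My earlier way of reading papers was wrong, and I was even pleased with myself about it. Take it as a warning: unless a paper is a classic or a must-read, do not read it cover to cover.
Paper reading guide: Is your paper-reading ability up to standard?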

1. LVCSR

  • LVCSR-based systems need to generate rich lattices and require substantial computational resources.

2. HMM-GMM

3. Speech-to-Text

3.1 DNN

  • 2014 | Small-footprint keyword spotting using deep neural networks
  • arXiv:1709.03665 | attention | DNN + CTC
    • No frame-level alignment needed; requires full-sized encoders
  • arXiv:1812.02802 | Streaming + End-to-End
    • Singular value decomposition (SVD) used to shrink the model
  • 2019 | Multitask Learning of Deep Neural Network Based Keyword Spotting for IoT Devices | DNN + HMM
  • 2019 | Time-Delayed Bottleneck Highway Networks Using a DFT Feature for Keyword Spotting
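
The SVD note above can be made concrete with a little arithmetic. This is a minimal sketch, assuming a generic rank-r factorization of one dense weight matrix (W ≈ A·B, with the singular values folded into A); it is not necessarily the exact layer structure used in the cited paper, and the function name and example sizes are hypothetical.

```python
def low_rank_param_counts(rows, cols, rank):
    """Parameter counts before and after a rank-`rank` factorization.

    Assumption for illustration: a dense rows x cols weight matrix W is
    replaced by A (rows x rank) times B (rank x cols), as obtained from a
    truncated SVD with the singular values folded into one factor.
    """
    full = rows * cols                # parameters in the original matrix
    factored = rank * (rows + cols)   # parameters in A plus B
    return full, factored


# Hypothetical layer size, chosen only to show the effect.
full, factored = low_rank_param_counts(512, 512, 64)
print(full, factored, full / factored)
```

The factorization only saves parameters when rank < rows * cols / (rows + cols), so the rank has to be chosen well below the layer width.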

3.2 CNN | exploits correlations across time and frequency

  • 2015 | Convolutional neural networks for small-footprint keyword spotting
  • arXiv:1907.01448 | Sub-band CNN
  • 2019 | A Small-Footprint End-to-End KWS System in Low Resources | CTC + End-to-End
  • arXiv:1811.07684 | Dilated Convolutions | end-to-end
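
As a rough illustration of why dilated convolutions are attractive for small models, the sketch below computes the receptive field of a stack of stride-1 1-D convolutions. The kernel size and dilation schedule are hypothetical examples, not the configuration of any paper listed here.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in frames) of stacked stride-1 1-D convolutions.

    Each layer with dilation d widens the receptive field by
    (kernel_size - 1) * d frames.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Four layers with kernel size 3: exponentially growing dilation vs. none.
print(receptive_field(3, [1, 2, 4, 8]))
print(receptive_field(3, [1, 1, 1, 1]))
```

With the same number of layers and weights, the dilated stack sees far more temporal context than the undilated one.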

3.3 CRNN

3.4 RNN

3.5 Other

4. Query-by-Example

  • LSTM

    • 2015 | Query-by-example keyword spotting using long short-term memory networks
  • arXiv:1811.10736 | DONUT | CTC | posterior probability

  • RNN-T with attention

    • arXiv:1710.09617 | Sequence-to-sequence | End-to-End | keyword/filler
      • No frame-level alignment needed; requires full-sized encoders
  • arXiv:1910.05171 | query enrollment and testing | user-specific queries

5. Other

5.1 Optimization model

5.2 Loss

  • CTC loss

    • 2006 | Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks
  • Max-pooling loss
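
The CTC entries in these notes stress that no frame-level alignment is needed. A minimal sketch of why, assuming per-frame label probabilities are already available: the CTC forward recursion sums the probability of every frame-level path that collapses to the target label sequence. The function name, the list-of-lists input format, and the use of index 0 as the blank are illustrative assumptions, not taken from any of the papers listed here.

```python
import math


def ctc_neg_log_likelihood(probs, target, blank=0):
    """CTC loss for one utterance via the forward algorithm.

    probs  : list of T frames, each a list of per-label probabilities
             (assumed to be softmax outputs; index `blank` is the blank).
    target : list of label indices, without blanks.
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    size = len(ext)

    # alpha[s] = total probability of all partial paths ending in ext[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood)


# Two frames, labels {0: blank, 1: "a"}, uniform outputs, target "a".
print(ctc_neg_log_likelihood([[0.5, 0.5], [0.5, 0.5]], [1]))
```

This plain-probability version underflows on long utterances and raises an error when no valid path exists; practical implementations work in log space.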

5.3 Dataset

5.3.1 Enhance

  • 2019 | Adversarial Examples for Improving End-to-end Attention-based Small-footprint Keyword Spotting (End-to-End)

5.3.2 Small dataset

  • 2018 | Fast ASR-free and almost zero-resource keyword spotting using DTW and CNNs for humanitarian monitoring
    • use DTW to augment the data
  • 2019 | Meta learning for few-shot keyword spotting
    • proposes a few-shot meta-learning approach
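
For readers unfamiliar with the DTW mentioned in the 2018 entry above, here is a minimal sketch of the classic dynamic time warping distance between two 1-D sequences. It assumes scalar features and an absolute-difference local cost purely for illustration; how the cited paper applies DTW to its data is not covered in these notes.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of numbers.

    Assumption for illustration: the local cost is |a[i] - b[j]|; real
    systems typically compare feature vectors (e.g. MFCC frames).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same pattern spoken at a different speed still matches exactly.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the alignment may stretch or compress either sequence, two utterances of the same word at different speaking rates can still score as close.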

5.4 Other

  • arXiv:2005.03633 | Far-field
  • 2019 | Improving keyword spotting and language identification via neural architecture search at scale
  • 2017 | Hey Siri: An On-device DNN-powered Voice Trigger for Apple’s Personal Assistant