【IJCAI2019】Learning to Select Knowledge for Response Generation in Dialog Systems

Original post

2020-06-13 01:41

p7 in 2019/12/13

Paper title: Learning to Select Knowledge for Response Generation in Dialog Systems
Authors: Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu
Venue: IJCAI2019
Download link: https://arxiv.org/abs/1902.04911?context=cs.CL
Source code: https://github.com/ifr2/PostKS (incomplete)
Reference notes: https://mp.weixin.qq.com/s/dqv95t-1o-1ewojIKP8jPg

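This paper focuses on the knowledge side of the problem: it uses the posterior distribution over knowledge to improve dialog generation.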

Abstract

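Previous methods: knowledge selected by the prior alone is used directly during training.
Their shortcoming: when the selected knowledge does not match the response, the generation model cannot learn to use knowledge correctly, and response quality drops.
This paper's method: knowledge selection is handled separately at training time and at inference time, using both a prior and a posterior distribution over knowledge. The posterior is inferred during training from the utterance and the response. The prior is trained to approximate the posterior, so that appropriate knowledge can still be selected at inference time, when no response is available.
Contributions: 1) a clear account of the discrepancy between the prior and posterior distributions over knowledge in knowledge-grounded dialog generation; 2) a knowledge selection mechanism, in the form of a neural model that separates the posterior distribution from the prior distribution.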

Model

Figure 1: Overview of the model

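The model has four components:

the utterance encoder: encodes the dialog context on its own into a vector.
the knowledge encoder: encodes each piece of knowledge on its own into a vector.
the knowledge manager: manages knowledge selection and contains two sub-modules, the prior knowledge module (PriorKS) and the posterior knowledge module (PostKS).
PriorKS, input: the context; output: attention over the knowledge.
PostKS, input: the context + response; output: attention over the knowledge.
the decoder: generates the response from the selected knowledge and an attention mechanism over the context.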

2.2 Encoder

The utterance encoder and the knowledge encoder use the same architecture (a BiGRU), but they do not share parameters.

2.3 Knowledge Manager

Figure 2: Knowledge manager and loss functions

PriorKS: the conditional distribution over knowledge given the utterance, p(k|x).

PostKS: the posterior distribution over knowledge given the utterance and the response, p(k|x,y).
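A minimal sketch of the two attention distributions, assuming dot-product scoring between each knowledge vector and a query: the utterance vector x for the prior, and a fully connected layer applied to the concatenation [x; y] for the posterior. The toy vectors, the layer weights and the function names are made up for illustration and are not taken from the authors' code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def linear(weights, bias, vec):
    """One fully connected layer; weights is a list of rows."""
    return [dot(row, vec) + b for row, b in zip(weights, bias)]

def prior_attention(knowledge, x):
    """p(k_i | x): softmax over the dot products k_i . x."""
    return softmax([dot(k, x) for k in knowledge])

def posterior_attention(knowledge, x, y, weights, bias):
    """p(k_i | x, y): same scoring, but the query is a layer over [x; y]."""
    query = linear(weights, bias, x + y)  # list concatenation plays the role of [x; y]
    return softmax([dot(k, query) for k in knowledge])

# Toy data (assumed): three knowledge vectors with 2-d encodings.
knowledge = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [0.5, 0.5]                # encoded utterance
y = [0.0, 2.0]                # encoded response, available only during training
W = [[1.0, 0.0, 1.0, 0.0],    # hypothetical weights mapping 4-d -> 2-d
     [0.0, 1.0, 0.0, 1.0]]
b = [0.0, 0.0]

prior = prior_attention(knowledge, x)
posterior = posterior_attention(knowledge, x, y, W, b)
print(prior, posterior)
```

In this toy case the prior cannot tell the first two knowledge vectors apart, while the posterior, which also sees the response, prefers the second over the first.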

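MLP(·) is a fully connected layer. Knowledge is sampled with the Gumbel-Softmax reparameterization, which keeps the sampling step differentiable.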
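A rough sketch of Gumbel-Softmax sampling over a knowledge distribution, written in plain Python. The temperature value, the function name and the example probabilities are assumptions for illustration; the post does not say which settings the authors used.

```python
import math
import random

def gumbel_softmax(probs, temperature=0.5, rng=random):
    """Draw a relaxed one-hot sample from a categorical distribution.

    probs must be strictly positive. Lower temperatures push the
    result closer to a one-hot vector.
    """
    noisy = []
    for p in probs:
        u = rng.random()
        u = min(max(u, 1e-10), 1.0 - 1e-10)   # keep both logs finite
        g = -math.log(-math.log(u))           # Gumbel(0, 1) noise
        noisy.append((math.log(p) + g) / temperature)
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

# Example: sample from an assumed distribution over three knowledge items.
rng = random.Random(0)
sample = gumbel_softmax([0.2, 0.5, 0.3], temperature=0.5, rng=rng)
print(sample)
```

The output is a soft weight vector rather than a hard index, so in a real framework gradients could flow back into the distribution that produced it.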
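KLDivLoss (Kullback-Leibler divergence loss): the KL divergence between the prior selection and the posterior selection. It pulls the prior distribution toward the posterior, so that the prior learns to select knowledge more accurately.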
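A small sketch of this loss, assuming the divergence is taken from the posterior to the prior, i.e. the sum over i of p(k_i|x,y) · log(p(k_i|x,y) / p(k_i|x)). The direction, the epsilon guard and the example numbers are my assumptions.

```python
import math

def kl_divergence(posterior, prior, eps=1e-12):
    """KL(posterior || prior) for two discrete distributions of equal length."""
    return sum(
        q * math.log((q + eps) / (p + eps))
        for q, p in zip(posterior, prior)
        if q > 0.0                      # terms with q == 0 contribute nothing
    )

posterior = [0.7, 0.2, 0.1]             # toy posterior over three knowledge items
uniform_prior = [1.0 / 3] * 3           # an untrained prior that prefers nothing
loss_far = kl_divergence(posterior, uniform_prior)
loss_same = kl_divergence(posterior, posterior)
print(loss_far, loss_same)
```

The loss is zero when the prior already matches the posterior and grows as the two drift apart, which is what drives the prior toward the posterior during training.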

2.4 Decoder

"Hard" decoder: a standard GRU.
"Soft" decoder: a GRU with a hierarchical gated fusion unit.
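The fusion step of the "soft" decoder can be sketched as follows. This assumes two candidate hidden states, one driven by the previous word (s_y) and one driven by the selected knowledge (s_k), combined by a learned sigmoid gate. The exact gate form, the weight names and the toy values are assumptions; the GRU steps that would produce s_y and s_k are omitted.

```python
import math

def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fuse_states(s_y, s_k, W_y, W_k, W_z):
    """Gate between a word-driven state s_y and a knowledge-driven state s_k."""
    h_y = [math.tanh(v) for v in matvec(W_y, s_y)]
    h_k = [math.tanh(v) for v in matvec(W_k, s_k)]
    # The gate sees both transformed states (list concatenation h_y + h_k).
    gate = [sigmoid(v) for v in matvec(W_z, h_y + h_k)]
    return [r * a + (1.0 - r) * b for r, a, b in zip(gate, s_y, s_k)]

# Toy example: with zero gate weights the gate is 0.5 everywhere,
# so the fused state is the plain average of the two candidates.
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_gate = [[0.0] * 4, [0.0] * 4]
fused = fuse_states([1.0, -1.0], [0.0, 2.0], identity, identity, zero_gate)
print(fused)
```

A "hard" decoder, by contrast, would feed the knowledge vector into a single GRU at every step, with no gate to tone it down.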

2.5 Loss Function

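NLL Loss: the standard loss for training generation networks.
BOW Loss: uses the selected knowledge to predict a bag-of-words representation of the response. This loss effectively speeds up training of the posterior selection, and borrows the idea CVAE uses to speed up training of its recognition network.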
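The two losses can be contrasted in a short sketch. It assumes both are averaged over the response length, and that the BOW logits come from a layer applied to the selected knowledge vector (not shown). The vocabulary size and the logits are toy values.

```python
import math

def log_softmax(scores):
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def nll_loss(step_logits, target_ids):
    """Average negative log-likelihood; one row of logits per decoding step."""
    total = 0.0
    for logits, target in zip(step_logits, target_ids):
        total -= log_softmax(logits)[target]
    return total / len(target_ids)

def bow_loss(bow_logits, target_ids):
    """One logit vector predicted from the knowledge alone; word order is ignored."""
    log_probs = log_softmax(bow_logits)
    return -sum(log_probs[t] for t in target_ids) / len(target_ids)

# Toy vocabulary of 4 words and a 2-word response with token ids [1, 2].
targets = [1, 2]
uniform = [0.0, 0.0, 0.0, 0.0]
nll = nll_loss([uniform, uniform], targets)
bow = bow_loss(uniform, targets)
print(nll, bow)   # both equal log(4) for uniform predictions
```

The difference is that NLL scores the decoder step by step, while the BOW loss scores a single prediction made from the selected knowledge, so its gradient reaches the knowledge selection directly.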

Experiments

Datasets: Persona-Chat and Wizard of Wikipedia
Evaluation metrics: BLEU, Distinct, Knowledge-F1
Baselines: Seq2Seq, MemNet, Lost In Conversation (LIC)
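For reference, a sketch of two of the metrics. Distinct-n is computed as the ratio of unique n-grams to all n-grams over the generated responses, and the knowledge score is approximated here as a unigram-overlap F1 between a response and the knowledge text. Tokenization, stop-word handling and the exact Knowledge-F1 definition used in the paper are not given in this post, so treat these as assumptions.

```python
from collections import Counter

def distinct_n(responses, n):
    """Unique n-grams divided by total n-grams across tokenized responses."""
    total = 0
    unique = set()
    for tokens in responses:
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0

def unigram_f1(response_tokens, knowledge_tokens):
    """F1 over the multiset overlap of unigrams."""
    overlap = sum((Counter(response_tokens) & Counter(knowledge_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(response_tokens)
    recall = overlap / len(knowledge_tokens)
    return 2 * precision * recall / (precision + recall)

responses = [["i", "like", "cats"], ["i", "like", "dogs"]]
d1 = distinct_n(responses, 1)   # 4 unique unigrams out of 6
d2 = distinct_n(responses, 2)   # 3 unique bigrams out of 4
f1 = unigram_f1(["i", "like", "cats"], ["cats", "like", "fish", "too"])
print(d1, d2, f1)
```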
Table 2: Human and automatic evaluation results

Table 3: Example cases

Table 4: Performance of PostKS on top of the LIC model

References

[Dinan et al., 2018] Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, and Jason Weston. Wizard of wikipedia: Knowledge-powered conversational agents. arXiv preprint arXiv:1811.01241, 2018.
[Dinan et al., 2019] Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, et al. The second conversational intelligence challenge. arXiv preprint arXiv:1902.00098, 2019.
[Ghazvininejad et al., 2018] Marjan Ghazvininejad, Chris Brockett, Ming-Wei Chang, Bill Dolan, Jianfeng Gao, Wen-tau Yih, and Michel Galley. A knowledge-grounded neural conversation model. In Thirty-Second AAAI Conference on Artiﬁcial Intelligence, 2018.
