深度學習在酒店售後智能問答場景實踐

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"1、項目背景"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"去哪兒網作爲全球領先的旅遊搜索引擎,每天有成千上萬的用戶在這裏買到了 低價的機票、酒店等產品,這其中有着龐大的客服團隊在背後支持着售後服務工作,用戶可以隨時隨地通過電話或者 chat 找到客服解決行中和行後的問題。隨着人工智能在各個領域的應用,客服領域也有了很多落地場景,比如售後智能問答、智能IVR、智能問題挖掘 、智能質檢等,提高了客服的效率,節約了人力成本,提升了用戶體驗,本文主要介紹酒店售後智能問答的應用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"酒店售後場景中,這裏指的都是chat渠道,我們將用戶常問的問題整理成了標準問題 FAQ(Frequently Asked Questions)的 形式,總共五百多標準問題,這些問題會對應很多不同的問題分類。通過分析用戶歷史來看,大部分的用戶問題都可以通過機器自助完成的,比如是否可以開發票、查看退款進度等問題,有一部分是需要客服通過和酒店溝通後才能解決,比如規則外退款、到酒店入住不了等問題。售後智能問答主要解決這些機器可以自助解決的問題,同時對不能自助解決的要能及時轉到人工服務,避免給用戶帶來不好的體驗,功能如圖1所示。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/30\/309b72a76fd1fa8f4457585ddc96b766.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下邊簡單介紹大概處理流程:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"首先對用戶輸入問題進行意圖識別,判斷是否是閒聊還是問題諮詢還是必須人工介入,然後轉接到不同的模塊處理,如圖2所示。我們需要準確理解用戶的問題,然後給出對應答案和操作,如果是複雜的問題或者用戶對答案不滿意可以喚起人工服務,目前平均對話3.6輪,24小時自助率大概在77%左右。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/3c\/3ccc6dba6aee07a5ef388baf76ddf12c.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"智能問答,涉及到自然語言理解、意圖識別、QA 算法、多輪會話管理等任務。其中,QA(Question Answering)任務是比較基礎和核心的模塊,本文主要圍繞QA 算法,詳細介紹基於 QA 我們在深度學習方法的一些嘗試。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"2、技術選型"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們先來回顧下 QA 任務的定義:給定一個用戶問題q,我們需要從知識庫中查詢出來 top k 個最相關的答案{a1,a2,…,ak},只要有一個回答 ak 在列表裏,我們就說回答正確,否則回答錯誤。從這個定義來看,我們很容易想到的方法是基於分類和檢索的方法。在本文中的知識庫特指 FAQ,即通過運營整理出來的有限個標準問題,不是指問題對應的答案。因此,基於分類的方法,我們可以把每個 FAQ 當做一個分類,可以基於 SVM、FastText 等做多分類預測,標記樣本時候,我們需要標記每個樣本屬於哪個分類。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是,基於分類算法有很多弊端,首先由於 FAQ 問題很多,都當做小的類別的話,會有很多類樣本分佈很不均衡,很多類別學習不充分,平均準確率上不去。解決的辦法也有很多,比如可以只挑選高頻即樣本多的類進行識別,當分類閾值很高時候才返回,其他識別不了的場景走檢索的方式返回等。其次,分類問題的類別必須事先確定的,如果知識庫的問題有增刪改時候,就得重新訓練,重新標記樣本。如果知識庫是通用型不怎麼變還行,酒店業務複雜多變,產品和運營經常會對知識庫進行增刪改,因此,分類方法在酒店業務知識問題場景不太適用。所以,我們考慮採用基於檢索的方式,比如基於 TF-IDF 或者 BM25的文檔相關性算法,或者基於其他深度學習的短文本匹配算法。由於深度學習的方法目前效果較好,我們在考慮選型時主要考慮了幾種基於深度學習的文本匹配算法。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"DSSM[1]可以說是深度學習在文本匹配的先驅了,網絡結構如圖3所示,最下邊兩層得到 embedding,接着過三層全連接層提取特徵,接着計算查詢Q和D的特徵餘弦相似度,計算 softmax 得到後驗概率,損失是似然損失,最大化點擊樣本的概率。DSSM還有很多變種,比如CNN-DSSM[2]、LSTM-DSSM[3]、MV-DSSM[4]等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/9d\/9d8694aeef453d991e1cbb470d2bc499.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"阿里的 ESIM[5],基於問題和答案動態編碼,給定個前提 p 推導出假設 h,損失是判斷 p 和 h 是否有關聯。論文提出了兩種結構,如圖4所示,左邊是 ESIM,右邊是基於語法結構的 HIM。最底下的Input Encoding部分,將p和h進行embedding 輸入到雙向 LSTM,得到 encoding,接着是核心的 Local Inference Modeling部分,將上一步得到的特徵基於attention機制計算出加權的encoding,然後做一些對應位相減、相乘等操作,和原始特徵拼到一起。最後是Inference Composition模塊,把剛纔的值再一次輸入雙向 LSTM,接着池化拼接、全連接,最後接 softmax 層。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"基於上述的語義匹配模型,正常使用方式是來一個查詢請求,要去和庫裏所有知識庫問題匹配計算一遍,計算開銷太大了,滿足不了線上需求。對於 DSSM,我們可以提前把知識庫標準問題過一遍網絡 inference,抽取最後一層輸出作爲句子特徵存儲起來,線上只要用同樣方式把用戶問題變成特徵向量,和庫裏的向量計算一遍餘弦相似度就能快速找到 top k 個最相似問題;對於 ESIM,由於需要基於查詢問題和庫裏問題組合動態編碼,不能進行提前編碼。可以先訓練個簡單的比如Siamese 網絡先提取 top n個候選問題,比如50個,再從這些候選問題基於 ESIM選出top k個最終結果。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/13\/13d4d4ecfff6697d0f40d8fd05a5868a.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此,我們希望有一種端到端的學習模型,不用做各種傳統 NLP 的處理工作,能提前計算好知識庫問題的特徵,線上只用計算輸入問題的特徵,並且對於知識庫新增或者修改,模型有比較好的魯棒性,不用重新訓練也能準確表徵新問題的句子特徵。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們參考了Minwei Feng [6]的方法,在 CNN 前接入了一層雙向LSTM,可以獲取句子前後的信息,此模型作爲我們的 baseline 方法。用戶輸入Q,A+是知識庫標記爲正確的樣本,A-是知識庫其他隨機取的一個樣本,先計算得到embedding,然後輸入到雙向 LSTM,得到句子進一步表示,接着輸入到 CNN 網絡,再過一層 pooling 層,得到向量表示,分別計算出 Q 和正負樣本的餘弦相似度,計算損失L= max{0,m-cos(VQ, VA+) + cos(VQ, VA-)},更新網絡參數。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c3\/c318223327b906320bf251a80190f04f.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"近幾年,預訓練(Pre-training)模型在自然語言處理領域得到了廣泛應用,其中,最重要的就是 Google 的BERT[7]預訓練模型,基於Transformer網絡結構再大規模無監督語料進行預訓練,在下游的不同NLP任務進行微調(Fine-tuning),在11項自然語言處理任務中得到了不錯的結果。預訓練和微調除了輸出層,其他網絡結構都一樣,微調就是基於預訓練的網絡參數進行初始化,如圖6所示。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b0\/b0da3feea48269abf35749cb24390a44.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"BERT的fine-tune主要支持以下四種任務:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1)基於句子對的分類"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2)基於單個句子的分類任務"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3)問答任務,類似於閱讀理解,是從paragraph裏面選擇一個最可能的回答"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(4)命名實體識別"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"前文已經討論過,我們QA任務的場景,FAQ的類型多,變動大,不適合當做固定類別的分類任務來做,所以傾向於去對句子對做分類。下面我們評測了下fine-tune後BERT的效果。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/d6\/d6c8d0fcef332fb173490849ac3cfaeb.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"3、評測效果"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了比較上述模型在酒店業務場景的效果,我們基於同一份標準語料按照一定比例劃分了訓練集、驗證集和測試集,訓練集對樣本較少的類別進行了上採樣處理,驗證集和測試集的類別分佈相對均勻,消除因爲某些類別佔比太多導致平均準確率太高或者太低的影響。返回結果分別評估了top1、top3、top5的準確率,比如P@3表示返回的top3個問題的準確率,top裏只要有一個正確,這次結果就算對的。比較結果如表1所示,base指上文介紹的baseline模型,base+BERT表示基於base選出來top10粗排,再用BERT精排,這裏BERT兩次微調,第一次是和其他模型一樣訓練集上基於公開中文base模型上微調,第二次是在驗證集上先用base跑出來top10,把分錯樣本收集起來,進一步微調。ESIM不知道是實現方式有點問題,還是負樣本得特殊構建,因爲基於pair動態編碼,庫裏其他好多問題沒有變成負樣本訓練過,基於概率值倒排結果和預想中不一致。對於新出現類別的魯棒性測試,我們新增了三百個知識庫FAQ,人工標記了一些樣本,不重新訓練模型,top3準確率baseline和BERT差不多,84%左右,DSSM低了4個百分點,這些模型對新問題都有比較好的魯棒性。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/60\/60fa9d83b08868b5020433156673353f.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"表1的準確率實驗數據可以看出,baseline + BERT效果相對是最好的,比baseline高出10個百分點,top1召回提升26%,top5召回提升11%,缺點是預測速度過慢,ALBERT理論上應該速度更快效果更好,還需要調優。DSSM效果整體和baseline差不多,基於DSSM粗排+BERT\/ALBERT精排效果待優化。考慮目前的業務場景,需要在baseline粗排得到top10之後進行精排,挨個通過BERT網絡計算query和候選集對。從表2的速度對比可以看到,inference時間在batch size=10時候,CPU上base 20 ms左右,BERT需要800多ms,GPU需要160多ms,響應時間過長線上沒法直接使用,我們需要進一步優化。比如並行計算batch size=1的預測,CPU平臺也需要200ms以上,而業務最多能忍受的時間在100ms之內。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ef\/efe667ad5ddd4b070ce851fe94ae2f0a.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"4、工程優化"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對於 GPU,可以選擇 TensorRT 或者 Faster Transformer 等辦法,但是由於我們資源比較少,還是希望在 CPU 上優化。這個方向也已經有很多研究了,比如知乎的cuBERT[8],微軟[9]的基於 oonx 運行時能提升17倍,但是我們嘗試後,好像沒有達到預想中的提升。找 cuBERT作者諮詢後嘗試了他們重寫 cuda 接口的效果,比原生系統下降了100ms,但是對於我們大batch size的使用方式還是不夠快。我們也嘗試過打算簡化網絡模型犧牲點精度,比如減少 BERT 層數、重寫 tf 的預測邏輯、蒸餾出一個小的student模型等方法[10],以及瘦身版的albert-tiny[11],損失的精度沒有預期那麼好。業界還有很多優化方法,比如 facebook 的模型參數壓縮Quant-Noise[12]方法,RoBERTa 模型從480M壓縮到了14M,精度沒怎麼降。詳細數據在表2中可以看到,最後這些方法均沒有采用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/6e\/6e6d52bffb930586adce8aafd1fe81b3.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"由於 BERT 沒有直接提供句子編碼的方式,如果直接提取 BERT 的輸出層取平均或者 CLS token 作爲句子的 fixed embedding 作爲特徵,然後計算句子相似度效果很差,論文[13]中也提到了這個問題,因爲優化的目標不一樣。基於 baseline 思路,我們準備在 BERT 後接一層損失變換一下來更新BERT參數,剛好 sentence-bert[13]實現了這個思想。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/81\/8195fae963f1a71607158e7372d18b28.webp","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"STS-BERT 網絡結構很簡單,如圖8所示,BERT 後接了 pooling 層提取句子特徵,基於 siamese 網絡供共享 BERT 參數,最後根據業務可以接入分類或者回歸目標函數,反傳更新參數,這樣特徵層提取出來就有意義,可以用來結算餘弦相似度,標準 FAQ 可以預計算提取特徵,線上只用 batch size=1計算 query 的特徵,然後和 FAQ 的用簡單的方法比較相似度。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們基於 Triplet 損失,embedding 取了768維,在同樣的訓練集上訓練,top 的準確率都分別比 base 提升了5個百分點,平均時間也只需要49ms 左右,但是比最好的 base+bert 還差好幾個點,未來還需要進一步優化。同時,我們也和論文一樣嘗試過直接使用 bert embedding的方式做比較,相同場景下用bert embedding+svm 平均準確率只有77%,下降了十幾個點。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"5、總結"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對於非常大體量的互聯網公司來說,加機器和加人可能是一個萬能藥,但是對於大多數其他公司來說,在生產上直接應用目前越來越深的模型需要付出巨大的資源消耗和優化的研發成本。所以咱們這些公司做深度學習的落地應用,感覺回到了在SOC上開發軟件的時代,帶着鎖鏈跳舞,凡事需要權衡實現方案和代價的tradeoff,對於現在習慣於動輒橫向擴展加機器的技術人員來說也是一種有趣的復古考驗。本文沒有特別深入的技術改造和算法優化,只是在實際工作中的一些體驗和經驗,希望能給大家帶來一點收穫。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Po-Sen Huang, Xiaodong He, Jianfeng Gao, et al. Learning deep structured semantic models for web search using clickthrough data. CIKM. 2013."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[2] Yelong Shen, Xiaodong He, Jianfeng Gao, et al. A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval. CIKM. 2014."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[3] H. Palangi, L. Deng, Y. Shen, et al. Semantic Modelling with Long-Short-Term Memory for Information Retrieval. arXiv preprint arXiv:1412.6629, 2014."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[4] A. M. Elkahky, Y. Song, and X. He. A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems. www. 2015."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[5] Qian Chen, Xiao-Dan Zhu, Zhen-Hua Ling, et al. Enhanced lstm for natural language inference. ACL. 2017."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[6] Minwei Feng, Bing Xiang, Michael R. Glass, et al. Applying Deep Learning to Answer Selection: A Study and An Open Task. ASRU. 2015."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[7] Jacob Devlin, Ming-Wei Chang, Kenton Lee, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[8] Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL. https:\/\/github.com\/zhihu\/cuBERT."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[9] Microsoft open sources breakthrough optimizations for transformer inference on GPU and CPU. https:\/\/cloudblogs.microsoft.com\/opensource\/2020\/01\/21\/microsoft-onnx-open-source-optimizations-transformer-inference-gpu-cpu."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[10] Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, and Qun Liu. TinyBERT: Distilling BERT for natural language understanding. arXiv preprint arXiv:1909.10351, 2019. 34"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[11] Zhenzhong Lan, Mingda Chen, Sebastian Goodman, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arXiv preprint arXiv:1909.11942, 2019."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[12] Angela Fan, Pierre Stock, Benjamin Graham, et al. Training with Quantization Noise for Extreme Model Compression. arXiv preprint arXiv:2004.07320, 2020."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[13] Nils Reimers, Iryna Gurevych. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. EMNLP. 2019."}]},{"type":"horizontalrule"},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"頭圖:Unsplash"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"作者:李兆海 胡智"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"原文:https:\/\/mp.weixin.qq.com\/s\/PPUh13SVvk3R0Tn0DgxWlQ"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"原文:深度學習在酒店售後智能問答場景實踐"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"來源:Qunar技術沙龍 - 微信公衆號 [ID:QunarTL]"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"轉載:著作權歸作者所有。商業轉載請聯繫作者獲得授權,非商業轉載請註明出處。"}]}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章