Xiaomi's Exploration and Optimization of Pre-trained Models

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"導讀:預訓練模型在NLP大放異彩,並開啓了預訓練-微調的NLP範式時代。由於工業領域相關業務的複雜性,以及工業應用對推理性能的要求,大規模預訓練模型往往不能簡單直接地被應用於NLP業務中。本文將爲大家帶來小米在預訓練模型的探索與優化。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"預訓練簡介"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/eb\/53\/eb241bcf1afcf2a58156b5c9e0482f53.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"預訓練與詞向量的方法一脈相承。詞向量是從任務無關和大量的無監督語料中學習到詞的分佈式表達,即文檔中詞的向量化表達。在得到詞向量之後,一般會輸入到下游任務中,進行後續的計算,從而得到任務相關的模型。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"但是,詞向量的學習方法存在一個問題:不能對文檔中的上下文進行建模,對於上面的例子“蘋果”在兩個句子中的表達意思是不一樣的,而詞向量的表達卻是同一個,所以在表達能力的多樣性上會有侷限,這是一種靜態的Word Embedding。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在後面的發展中,有了根據上下文建模的Word Embedding,比如,可以在學習上嘗試使用雙向LSTM模型,在非監督語料學習詞向量,這比靜態的詞向量網絡會複雜一些,最後可以通過隱層得到動態的詞向量輸入到下游任務中。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"1. 序列建模方法"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/8e\/0f\/8e3359495cd5cb9bfd27e857188b1a0f.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在NLP中,一般使用序列建模的方法。之前比較常用的序列建模是LSTM遞歸神經網絡,其問題是建模時,句子中兩個遠距離詞之間的交互是間接的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"17年Transformer發佈之後,在NLP任務中取得了很大的提升。這裏面Self-Attention可以對任意詞語間進行直接的交互,Multi-head Attention可以表達在不同類型的進行語義交互。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"2. 預訓練模型"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/41\/44\/41894539yydfc9bfae644f5f35f6e444.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}