A Chinese GPT-3? BAAI (Beijing Academy of Artificial Intelligence) Releases Qingyuan CPM, a Large-Scale Pretrained Model with Chinese at Its Core

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"語言模型是指對自然語言文本進行概率建模的模型,它不僅可以估計任意一個給定文本序列的概率,也可以用來預測文本序列中某個位置上詞的出現概率,是自然語言處理中最基本的問題。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/27\/89\/278c5679782209cea2ec966998ac5089.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2018年以來,預訓練語言模型 (Pretrained Langauge Model, PLM) 的研究風起雲湧。與此前有監督學習範式不同的是,預訓練語言模型能夠充分利用大規模的無標註數據學習通用的語言模型,然後再使用下游任務的少量有標註數據進行模型微調。與直接訓練具體任務模型相比,在預訓練語言模型基礎上微調得到的模型在自然語言處理各大任務上均取得了顯著的性能提升。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/49\/f9\/49250cc1ee7249fd88f260beac50eaf9.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 GPU 多機多卡並行算力和海量無標註文本數據的雙重支持下,預訓練模型實現了參數規模與性能齊飛的局面,取得了人工智能和深度學習領域的革命性突破。國際著名互聯網企業和研究機構互相競爭,將模型規模和性能不斷推向新的高度。BERT之後,短短兩年時間,最新發布的 GPT-3 已經達到1750億參數規模、上萬塊 GPU 的驚人訓練規模。在人工智能與深度學習領域圍繞超大規模預訓練模型展開的“軍備競賽”日益白熱化,成爲對海量數據、並行計算、模型學習能力的全方位考驗。"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/cb\/b9\/cb5ac16d1290846bd9e82824e543f2b9.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"預訓練模型規模以平均每年10倍的速度增長 (最後一列計算時間爲使用單塊NVIDIA V100 GPU訓練的估計時間。M-百萬,B-十億)"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/bb\/fa\/bb47721bbde0506961d9fe6bdc0d59fa.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"預訓練模型研究發展圖"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
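The pretrain-then-fine-tune paradigm described above can also be illustrated in code. The sketch below is an assumption-laden illustration, not CPM's actual training code: it uses the Hugging Face `transformers` library with the public `bert-base-chinese` checkpoint as a stand-in pretrained model, and a two-example toy sentiment task as the "downstream labeled data".

```python
# A minimal sketch of the pretrain-then-fine-tune paradigm, assuming the
# Hugging Face `transformers` library. `bert-base-chinese` is a public
# Chinese checkpoint used as a stand-in pretrained model; this illustrates
# the paradigm only and is not CPM's code or checkpoint.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
# Reuse the pretrained encoder weights; only the small classification
# head on top is newly initialized for the downstream task.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=2
)

# A toy labeled downstream dataset (sentiment: 1 = positive, 0 = negative);
# in practice this is the downstream task's small labeled set.
texts = ["这部电影很好看", "服务太差了"]
labels = torch.tensor([1, 0])
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few gradient steps stand in for full fine-tuning
    optimizer.zero_grad()
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
```

The design point the paragraph makes is visible here: the expensive part (pretraining on unlabeled text) is done once and reused, while adaptation to a new task touches the same weights with only a small labeled set and a short optimization run.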