The World's First Knowledge-Enhanced 100-Billion-Scale Model Is Here: 260 Billion Parameters, Code to Be Open-Sourced Soon

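{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With 260 billion parameters and the best results on more than 60 tasks: a look at the technical details behind the world's first knowledge-enhanced 100-billion-scale model."}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Large pre-trained models are a current research focus in AI."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A large AI model works like a “power plant”: it takes data as “fuel”, converts it into intelligent capabilities, and powers all AI applications. For this reason, large models are regarded as the next-generation foundational platform for AI."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The future may well be the era of large AI models. In recent years, many companies and academic institutions in China and abroad have raced to release their own large models, and work on Chinese-developed large models has advanced especially quickly."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On December 8, InfoQ learned that Baidu and the Peng Cheng Natural Language Processing Joint Laboratory had released Peng Cheng-Baidu Wenxin (鵬城-百度·文心; model version: ERNIE 3.0 Titan), adding another member to the camp of Chinese-developed large models."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/09\/18\/09b8c7e8fbcc4bcb8709a10931d48918.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Peng Cheng-Baidu Wenxin is the world's first knowledge-enhanced AI model at the 100-billion-parameter scale."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"260 Billion Parameters: Technical Details of the World's First Chinese Monolithic Model"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the launch event, Gao Wen, academician of the Chinese Academy of Engineering and director of Peng Cheng Laboratory, said: “Pre-trained models are a very important tool for the development of science, of society, and of innovation as a whole. With this tool, AI can empower many areas rather than being confined to a single field, which is good news for the development of artificial intelligence.”"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Wang Haifeng, Baidu's chief technology officer and director of the National Engineering Laboratory for Deep Learning Technology and Application, explained that Baidu's knowledge-enhanced large models learn jointly from large-scale knowledge and massive data, which makes them more efficient and more effective and gives them good interpretability. From the release of Wenxin ERNIE 1.0 in March 2019 to the latest Wenxin landscape of industrial-grade knowledge-enhanced large models, the family now includes general-purpose foundation models, models targeting key domains and key tasks, and a rich set of tools and platforms, all of which help promote technological innovation and industrial development."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/fe\/b9\/fee6c74b65034a3b7c7f8c2ce43835b9.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Built on the combination of Peng Cheng Laboratory's computing system “Peng Cheng Cloud Brain II” and Baidu's PaddlePaddle deep learning platform, the Peng Cheng-Baidu Wenxin model reaches 260 billion parameters, surpassing GPT-3."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The model aims to address practical problems of traditional AI models, such as poor generalization, heavy reliance on expensive manually labeled data, and high deployment costs, and to lower the barrier to AI development and application."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"About 50% More Parameters Than GPT-3"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Peng Cheng-Baidu Wenxin is a full upgrade of Baidu's knowledge-enhanced large model ERNIE 3.0. Its parameter count reaches 260 billion, roughly 50% more than GPT-3's 175 billion."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In terms of algorithms, the model inherits ERNIE 3.0's parallel pre-training algorithm over massive unsupervised text and a large-scale knowledge graph. In terms of architecture, it uses a unified pre-training framework that handles both language understanding and language generation."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To improve the model's language understanding and generation capabilities, the research team further designed controllable and credible learning algorithms."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For training, the team combined PaddlePaddle's adaptive large-scale distributed training technology with the “Peng Cheng Cloud Brain II” computing system to solve several widely recognized technical problems in training very large models. For application, the team pioneered an online distillation technique for large models, which substantially reduces the cost of deploying them."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/21\/f4\/219eca22fe7494127a939ca92d7d88f4.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Architecture of the Peng Cheng-Baidu Wenxin model"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Controllable and Credible Learning Algorithms"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the algorithm side, to further improve the model's language understanding and its literary writing abilities, such as composing novels, lyrics, poems, and couplets, the research team proposed controllable learning and credible learning algorithms."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For controllable learning, text attributes predicted by the model are concatenated with the original text to construct pre-training data that maps specified attributes to the corresponding text. By learning from this data, the model acquires zero-shot generation capabilities for different types of text."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Users can freely combine attributes such as genre, sentiment, length, topic, and keywords to generate different types of text without labeling any samples."}]},
{"type":"paragraph","attrs":{"indent"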
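:0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For credible learning, which targets the factual consistency between generated text and the real world, Peng Cheng-Baidu Wenxin uses self-supervised adversarial training to teach the model to distinguish real data from data fabricated by the model. This gives the model the ability to judge the authenticity of its own outputs, so it can select the most reliable result from multiple candidates, which improves the credibility of the generated text."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/fb\/8a\/fbf5da28f3fyy9ffef3bdbbee81b2e8a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Highly credible controllable generation pre-training"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"The AI Computing Power Behind a 100-Billion-Scale Model"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Peng Cheng-Baidu Wenxin was initialized on Baidu's Baige (百舸) cluster and trained on the “Peng Cheng Cloud Brain II” high-performance cluster."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“Peng Cheng Cloud Brain II” is a computing cluster developed by Peng Cheng Laboratory together with leading domestic research institutions, and it is China's first domestically built exascale (E-level) AI computing platform."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“Peng Cheng Cloud Brain II” runs a distributed AI operating system developed in-house by Peng Cheng Laboratory. It is connected to the Guangzhou supercomputing center and the brain-inspired computing platform at the University of Science and Technology of China in Hefei, and already supports cross-site resource sharing and unified services. The system is built on an Atlas 900 cluster equipped with Kunpeng and Ascend processors, which supplies ample computing power, while the key technologies of the Cloud Brain platform were developed by Peng Cheng Laboratory. What is a “cloud brain”? Put simply, it is an ultra-high-speed computer that both computes fast and supports AI workloads."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the end of last year, the basic version of Peng Cheng Cloud Brain II was released, marking the formal start of the 1000P-class Cloud Brain. At launch its computing power was 100P OPS (10^17 operations per second). By the end of this year, the basic version is reportedly expected to reach 1000P OPS (10^18 operations per second)."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What does 1000P of computing power mean? It is equivalent to the combined computing power of 520,000 home computers. The most powerful supercomputer in the world today delivers about 235P, so Peng Cheng Cloud Brain II would by then exceed the most powerful supercomputer currently in existence."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For companies and research institutions working on AI research and development, Peng Cheng Cloud Brain II will be a powerful tool."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Academician Gao Wen has previously explained that, on Peng Cheng Cloud Brain, researchers can on the one hand study next-generation AI fundamentals, core algorithms, high-end chips, key equipment, and operating systems; on the other hand, the computing power it provides can enable AI applications in urban transportation, healthcare, financial risk control, intelligent manufacturing, and other fields, building a next-generation open-source, open innovation platform for AI."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Over the past two years, “Peng Cheng Cloud Brain” has repeatedly achieved strong results in authoritative international competitions in related fields. In May of this year, for example, “Peng Cheng Cloud Brain II” ranked first in model performance for natural language processing and second for image processing in the “MLPerf training V1.0” benchmark. Supported by the Cloud Brain's intelligent computing performance and its hardware-software coordination, Peng Cheng-Baidu Wenxin achieves strong training performance."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Distributed Training and Inference with the PaddlePaddle Deep Learning Framework"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Training and running inference on very large models puts a deep learning framework to a serious test. Distributed computation on a large cluster is needed to finish training or inference within an acceptable time, and the framework must also cope with parameters that cannot be loaded on a single machine, heavy inter-machine communication, and low parallel efficiency."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In April of this year, Baidu's deep learning framework PaddlePaddle released its 4D hybrid parallelism technology, which supports efficient distributed training of models with hundreds of billions of parameters."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, training Peng Cheng-Baidu Wenxin posed new challenges for PaddlePaddle:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, the model's architecture introduces many small-shape tensor computations, which leads to large differences in computation between layers and an unbalanced pipeline load."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, the proprietary software stack of “Peng Cheng Cloud Brain II” requires deep and efficient adaptation by the framework before the cluster's leading computing power can be fully exploited."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address these challenges, and taking into account the characteristics and trends of current mainstream hardware and models, PaddlePaddle designed and developed an end-to-end adaptive large-scale distributed training architecture with stronger scalability (paper: "},{"type":"link","attrs":{"href":"https:\/\/arxiv.org\/abs\/2112.02752","title":"","type":null},"content":[{"type":"text","text":"https:\/\/arxiv.org\/abs\/2112.02752"}]},{"type":"text","text":")."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For different models and hardware, the architecture abstracts them into a unified distributed computation view and a unified resource view. Using hardware-aware fine-grained partitioning and mapping, it searches for the best combination of model partitioning and hardware placement, then distributes model parameters, gradients, and optimizer states across devices according to that strategy, which saves memory, balances load, and improves training performance."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In testing, PaddlePaddle's adaptive large-scale distributed training architecture delivered 2.1 times the training performance of traditional distributed training methods for Peng Cheng-Baidu Wenxin, with parallel efficiency as high as 90%."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Instability when training very large models has long been a problem for the industry. To address it, PaddlePaddle added a fault-tolerance feature that automatically replaces failed machines without interrupting training, making training more robust and more stable."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For inference, PaddlePaddle builds on the service deployment framework Paddle Serving and applies a series of optimizations, including multi-machine, multi-device tensor model parallelism and pipeline parallelism, to find the best parallel configuration and the highest throughput."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Through unified memory addressing (Unified Memory), operator fusion, model I/O optimization, and quantization-based acceleration, the inference speed of Peng Cheng-Baidu Wenxin continues to improve."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/ab\/21\/ab06663b6dae5d78ace4c89a8b1b3b21.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"PaddlePaddle training and inference for very large models"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Model Code to Be Open-Sourced Soon"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Since its debut in 2019, Wenxin ERNIE has achieved a number of technical breakthroughs in language understanding, text generation, cross-modal semantic understanding, and other areas, and has won first place in multiple authoritative public semantic evaluations."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In keeping with the principle of being “open source and open”, the model's code will soon be open-sourced in the OpenI (啓智) community and made available externally through Peng Cheng Cloud Brain II, working with partners across industry, academia, research institutes, and associations to fully tap the enabling potential of large AI models."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At present, Baidu Wenxin is being progressively open-sourced and opened up through Baidu's PaddlePaddle platform."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Targeting the Challenges of Deploying AI at Scale"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Breakthroughs in Few-Shot and Zero-Shot Learning to Reduce Reliance on Labeled Data"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After the boom and the hype, the AI industry is gradually returning to rationality. How to overcome deployment challenges and become profitable at scale is now the question AI companies care about most, and it is also a matter of broad outside concern."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But commercializing AI is not an easy road. Deployment faces considerable challenges both in technology and on the industrial application side."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the technical side, machine learning depends on large amounts of labeled data. The deployment of AI across all kinds of scenarios and the arrival of the big-data era have produced massive, exponentially growing volumes of data, and acquiring data has become relatively easy. Obtaining large amounts of labeled data, however, remains difficult and often requires substantial labor, material, and financial cost."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Few-shot learning is regarded as the key to this problem and as a “quick-acting remedy” for AI's deployment difficulties. In recent years many AI companies have invested in few-shot learning, which can reduce the data-labeling workload and lower the cost and duration of model training, thereby reducing the dependence on large amounts of labeled data when AI projects are deployed."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Peng Cheng-Baidu Wenxin model has also made a number of technical advances in few-shot learning. The model reportedly achieved the best results on more than 30 few-shot and zero-shot tasks, can improve results across many kinds of AI application scenarios, and is intended to help solve the problem of applying AI at industrial scale."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So far, the model has achieved the best results on more than 60 tasks, including machine reading comprehension, text classification, and semantic similarity computation."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/20\/35\/203f04aee7b3cd9cdee5fe4af93d0f35.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content"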
:[{"type":"text","text":"Few-shot learning results of Peng Cheng-Baidu Wenxin"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/20\/35\/203f04aee7b3cd9cdee5fe4af93d0f35.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Zero-shot learning results of Peng Cheng-Baidu Wenxin"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Lowering Application Costs with a First-of-Its-Kind Online Distillation Technique for Large Models"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another technical challenge is that training and inference for large models are extremely expensive and resource-intensive."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although Paddle Serving already provides a high-speed inference solution for very large models, the research team proposed an online distillation technique to make deployment greener and to further reduce the cost of applying large models."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/e6\/3a\/e67758793035200c98fcb3a603ded23a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Online distillation technique"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Specifically, while Peng Cheng-Baidu Wenxin is learning, the technique periodically passes knowledge signals to several student models that are trained at the same time, so that student models of multiple sizes are produced in a single distillation stage."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compared with traditional distillation, this greatly reduces the compute spent on extra distillation passes through the large model and on repeating knowledge transfer for each student."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This approach takes advantage of the scale of Peng Cheng-Baidu Wenxin. After distillation it yields student models that retain quality and come in a range of sizes, which suits application scenarios with different performance requirements."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The research team also found that because Peng Cheng-Baidu Wenxin is more than a thousand times larger than the student models, distillation becomes extremely difficult and can even fail."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address this, the team introduced teaching-assistant models into distillation. The teaching assistant serves as a bridge for knowledge transfer, narrowing the gap between the representation spaces of the student models and Peng Cheng-Baidu Wenxin and thereby improving distillation efficiency."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/19\/de\/1998c7f497be78b266f607dd6591bede.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Results of the compressed Peng Cheng-Baidu Wenxin models"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The data show that the online distillation scheme is highly effective, with a parameter compression rate of up to 99.98%. The compressed model retains only 0.02% of the parameters yet performs comparably to the original model."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Compared with a directly trained BERT Base model, which has twice as many parameters, the compressed model improves accuracy on five tasks by 2.5 percentage points in absolute terms; compared with a RoBERTa Base model of the same size, the absolute improvement is 3.4 percentage points. These results confirm the effectiveness of the online distillation scheme."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Already Applied in Finance, Industry, and Other Sectors"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Baidu Wenxin is already used at scale in internet products such as Baidu Search, news feeds, and smart speakers, and through Baidu AI Cloud it also serves sectors including industry, energy, finance, telecommunications, media, and education."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In finance, for example, intelligent contract parsing built on Baidu Wenxin can parse and recognize the relevant contract clauses within one minute, dozens of times faster than before, which effectively improves productivity. Wenxin has also helped Baidu AI Cloud improve the accuracy of its intelligent customer service, which is used by telecom operators, banks, and other enterprises."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Baidu said that, going forward, Peng Cheng-Baidu Wenxin will further tackle key problems such as the lack of domain-specific and scenario-specific data when applying AI, lower the barrier to entry, and accelerate the large-scale industrial application of AI."}]}]}