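Bypassing the Hardware Bottleneck to Multiply Chip Compute: Is Squeezing More Performance Out of Chips in Software Feasible?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you tallied the industry and social buzzwords of the past two years, “chip” would certainly make the list."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As AI is put into practice across industries, applications demand ever more general and more complex deep learning models, and deep learning's demand for chip compute grows accordingly. In the information age chips are needed everywhere, yet they remain a scarce resource."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The industry has two solutions: build custom chips, or modify the model, using a smaller or compressed model to reduce the demand for compute. Each has trade-offs. Custom chips deliver strong performance but carry high cost, long lead times, and significant risk; small or compressed models are cheaper and faster to produce but lose accuracy, making it hard to balance high accuracy against high performance."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Excessive redundant computation in today's AI workloads, together with the limited capability of runtime engines, constrains how much performance can be extracted from a chip. With chip supply and demand out of balance, the mainstream approach today is to tackle the production capacity problem."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some technical teams are taking a different path. An AI company called CoCoPIE has announced that, through compression-compilation co-design, it can unlock the compute of existing chips at the software level, potentially multiplying their performance. We therefore spoke with Li Xiaofeng, who heads CoCoPIE. According to him, CoCoPIE has already built products such as CoCo-Gen and CoCo-Tune, which let existing processors handle AI applications in real time without adding dedicated AI hardware."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"He told InfoQ: “CoCoPIE's unique AI 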
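software technology stack solves the bottleneck that has held back the development and adoption of on-device AI, and it is currently unique in the industry. Both test data and customer feedback show a clear advantage over other solutions, so it has a good chance of winning out in the wave of edge-device intelligence.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Bypassing hardware bottlenecks and multiplying chip compute: is improving chip performance at the software level feasible? To learn more about the technology CoCoPIE uses and to answer this question, InfoQ recently interviewed Li Xiaofeng."}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Squeezing Chip Compute Out at the Software Level"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: You address performance through compression-compilation co-design. What is the concrete technical implementation, and which academic papers support it?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: CoCoPIE's technical core is the three professors on the founding team. All of them are highly talented and exceptionally hardworking, and each is a leader in his field. Professor Wang Yanzhi focuses on AI model algorithms, Professor Ren Bin on AI model compilation, and Professor Shen Xipeng on AI system engines. These research areas complement one another well, forming an iron triangle of AI compute optimization in which each part is indispensable. Together they build the company's core competitiveness; you could call it a perfect match."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let me first introduce the basics of optimized AI model execution. Running an AI task on a device is essentially the process of mapping an AI model onto a sequence of chip instructions. Compression and compilation are the two key steps. First, the model is structurally compressed through weight pruning and weight quantization, which reduces its complexity. Then the compressed model is compiled with optimizations to generate executable code. As a result, the AI task runs more efficiently and the chip's capabilities are put to fuller use."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, nobody in the industry currently does both steps particularly well. Existing technologies either only compress or only compile, or they do both but keep the two isolated in design without real co-design, so it is hard to preserve inference accuracy and runtime efficiency at the same time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The core of CoCoPIE's technology is the “co-design” of compression and compilation: when designing compression, the preferences of the compiler and hardware guide the choice of compression scheme, and when designing the compiler, the characteristics of the compressed model shape the compiler optimizations. Corresponding to these two steps, we designed two components for the CoCoPIE framework: CoCo-Gen and CoCo-Tune. CoCo-Gen generates efficient executable code by coordinating pattern-based neural network pruning with pattern-based code generation, while CoCo-Tune significantly shortens DNN model compression and training."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CoCoPIE's technology is general-purpose and applies broadly to CPUs, GPUs, DSPs, and dedicated AI chips such as NPUs, APUs, and TPUs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CoCoPIE has published a large number of papers at top international conferences, spanning upper-layer AI application optimization, AI model design, compiler optimization, and low-level hardware-specific optimization. In particular, an article introducing CoCoPIE's technology appeared in the June issue of Communications of the ACM this year. It is the flagship publication of the Association for Computing Machinery, and the article ran alongside this year's Turing Award coverage, which shows the academic community's high regard for CoCoPIE's work."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Can the current core products, CoCo-Gen and CoCo-Tune, be used independently?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: These two products provide the key technologies for our AI model optimization. CoCo-Gen 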
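generates efficient executable code by coordinating pattern-based neural network pruning with pattern-based code generation, while CoCo-Tune significantly shortens DNN model compression and training."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"CoCo-Gen and CoCo-Tune can each be used on their own. They form the core of the CoCoPIE toolchain, which is why they were released first. As a bridge between upper-layer AI tasks and the underlying hardware, the CoCoPIE product family will keep adding new members."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Are there similar software technologies in the industry for addressing the chip shortage at the software level?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: Among today's on-device AI technology stacks, only CoCoPIE's optimization technology can match or exceed the performance of dedicated AI chips on mainstream chips, a conclusion verified through extensive real-world measurement. The technologies we know of focus either on compression or on compilation; we have not seen any that co-design the two. This is CoCoPIE's patented technology."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The reason is that although today's mainstream chips already have great potential, realizing it requires compression-compilation co-design and carefully crafted algorithms that convert AI tasks into suitable vector computations while keeping the total amount of computation well under control. This is precisely the key to CoCoPIE's technology."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Every technology has its strengths and weaknesses. What are the current advantages and limitations of this one?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: CoCoPIE's advantages are twofold. On the one hand, many previously unrunnable on-device AI 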
tasks can now run; on the other hand, AI tasks that previously required a dedicated AI chip on the device can now run on mainstream chips."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"AI task execution is always constrained by chip compute, so CoCoPIE's technology has its limits, and the AI compute it unlocks is not unlimited. In addition, CoCoPIE currently focuses on AI inference; accelerating dedicated AI training workloads is not our priority."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: CoCoPIE's technology is said to raise chip compute by 3-4x and chip efficiency by up to 5-10x. What is the yardstick? Can this level be achieved on different chips?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: These figures come from real measurements, have passed peer review, and have been confirmed by customers. In other words, the technology has theoretical backing and has been delivered in real products."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, comparing a general-purpose chip with Google's TPU-V2: using CoCoPIE, VGG-16 on a Samsung Galaxy S10 mobile device achieved nearly 18x better efficiency than on TPU-V2, and ResNet-50 achieved a 4.7x efficiency gain."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the same Samsung Galaxy S10 platform, running the two action recognition tasks C3D and S3D, CoCoPIE was 17x and 22x faster than PyTorch Mobile, respectively. Running MobileNetV3, CoCoPIE was nearly 3x faster than TensorFlow Lite and nearly 4x faster than PyTorch Mobile."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, power measurements (using the Qualcomm Trepn power profiler) show that, compared with TVM, CoCoPIE cut execution time by more than 9x while drawing less than 10% more power. In our work on AQFP superconducting DNN inference acceleration, verified through cryogenic testing, our research also achieved the highest energy efficiency of any hardware device to date."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Efficiency gains do not come from nowhere. What does this software technology require of the hardware environment?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: That's right. CoCoPIE's hardware requirements are modest and mainstream chips can meet them. Specifically, the chip needs vector computation capability, such as ARM's NEON instruction set, Intel's SSE and AVX instruction sets, or the RISC-V vector extension, all of which are common in today's CPUs, not to mention GPUs and APU\/NPU parts. Without vector capability CoCoPIE's technology can still help, but it will be considerably more limited."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: What were the main challenges in putting the technology into practice?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: The main challenge in practice is that our product lineup is not yet complete, while customer needs are highly varied and the specific forms of service differ widely. So we have not yet launched large-scale commercial promotion. Instead we focus on key sectors and key customers, such as representative mainstream chip vendors, device makers, and software service providers, and serve them selectively in line with our product strategy. Through this process we will work through diverse customer needs and keep exploring the best product and service model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Does this technology have any real-world deployments yet?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: We currently have more than a dozen partner customers across multiple sectors, including Tencent, Didi, a well-known chip platform provider, a well-known phone maker, the U.S. Department of Transportation, and the global service provider Cognizant."}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"CoCoPIE Is Not a Stopgap Product"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Mainstream processors are the better solution for real-time AI. Do you agree with this view?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: Yes. For edge devices, mainstream processors are the better solution for real-time AI."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. Functionally, edge devices are resource-constrained and their application scenarios vary endlessly, whereas dedicated AI processors are relatively fixed in function and struggle to meet the highly flexible demands of the edge. If mainstream processors can already handle AI workloads through software, there is no need to add complications."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. Technically, chips dedicated to AI essentially all work by boosting vector processing capability and improving memory access efficiency; some call these tensor compute units. As mentioned earlier, today's mainstream chips already have vector compute units. They may be weaker than dedicated tensor processors, but they are generally sufficient for today's AI tasks, provided there are excellent model compression and compilation tools that can, through careful design, convert AI 
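tasks into suitable vector computations while keeping the total amount of computation well under control."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"3. In terms of cost, mainstream processors are mass-produced, so they are much cheaper than dedicated chips and offer more supply channels. Beyond the purchase cost of the AI chip itself, there are hidden costs: an extra chip means redesigning the PCB and thermal solution, and packaging adds further cost. Many devices, such as smart earbuds and miniature medical devices, are very sensitive to these factors."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"4. It should be stressed that CoCoPIE's technology does not exclude dedicated AI chips. As a full-stack AI software optimization technology, CoCoPIE also supports AI processors and helps them deliver greater efficiency. We are therefore glad to see AI processors play an important role in specific application areas."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Is CoCoPIE a product of a technological transition period?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: Quite the opposite. The ubiquitous adoption of on-device AI has only just begun, and with CoCoPIE leading in advanced technology for this field, we believe it has enormous room to grow. Internally we have a complete product development strategy; future product forms will differ from today's, but the core technology will carry through."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Moreover, even if dedicated AI chips become widespread one day, that will not hurt CoCoPIE's prospects. First, dedicated AI chips will never be more widespread than general-purpose chips, and work that general-purpose chips already do well will probably still be done more efficiently and flexibly on them. Second, however far AI chips advance, they will still depend on compiler optimization, and our technology will further raise their capability. The same holds for general-purpose chips: CPUs and GPUs, however cheap or powerful, still need high-performance compiler support such as LLVM or NVCC."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the long run, the evolution of the AI technology stack will only increase the demand for software technology like CoCoPIE's. Mobile SoCs are a good analogy: over time they have become more powerful, not simpler, with 8-core phones now common, and the demands on software have risen with them. In fact, AI's demand for compute is growing far faster than AI hardware capability. According to a research report from MIT, the compute demand of AI has recently grown 700-fold every two years, a pace that hardware improvements alone cannot possibly meet; breakthroughs in software are essential."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: What is your view of today's popular large-model technology?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: Given enough training data, large models tend to achieve stronger AI capabilities. This is a necessary stage in humanity's exploration of the unknown, much as high-energy physicists keep building higher-energy particle colliders in order to make new discoveries. But large models need to be viewed from both sides. If we simply chase ever-larger models, the required training data, training time, compute, and energy all keep rising while the marginal returns shrink, a trend that is clearly unsustainable. In the future, scaling up models may continue only for a few grand-challenge problems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here is a figure on the returns of larger models. ResNet is a well-known computer vision model released in 2015. An improved version, ResNeXt, came out in 2017. Compared with ResNet, ResNeXt needs 35% more computing resources (measured in total floating-point operations) but improves accuracy by only 0.5%."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"And here is a figure on carbon emissions. According to a Forbes article last year, since deep learning took off in 2012, the computing resources needed to produce a state-of-the-art AI model have doubled every 3.4 months on average; this means that the energy required to train AI models increased 300,000-fold from 2012 to 2018."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"By comparison, deep learning still falls far short of even an infant in many respects, let alone an adult brain. Yet an adult brain runs on only about 20 watts, just enough to power a single light bulb."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We clearly cannot improve machine intelligence merely by enlarging models, and academia keeps exploring new methods."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q: Could you talk about the basic software industry, which is now gaining prominence, and briefly share your assessment of the chip industry?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng: Basic software is becoming increasingly important, for two reasons. First, technology has advanced rapidly in recent years, creating real demand for basic software and making it necessary. Second, a large pool of high-quality engineers has been trained over the years, making it possible."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The chip industry will continue to thrive. The trend in technology is for the digital world to keep permeating every aspect of the physical world, and the fundamental expression of digitalization is the continual embedding of chips in all kinds of devices. In the previous wave of device intelligence the core approach was to put a chip in the device so it could run applications; in this wave it is to enable the device to run deep neural networks. This is an unstoppable trend, and it is also CoCoPIE's fundamental opportunity."}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Closing Thoughts"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Li Xiaofeng told InfoQ that CoCoPIE's technical lead will last at least several years, enough to secure it a place in the computing field. CoCoPIE's technology was not created to solve the chip shortage but to make AI tasks widely accessible; running into the chip shortage was simply a coincidence, and relieving it can be seen as a by-product of the CoCoPIE 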
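technology's capabilities."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Li Xiaofeng's view, given today's complex and diverse scenarios and devices, current technology cannot fully exploit the capabilities of mainstream chips, which is what gives CoCoPIE room to grow. What is certain is that CoCoPIE's progress offers a new approach to making full use of what chips can do."}]}]}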