AI chips in the real world: Interoperability, constraints, cost, energy efficiency, and models

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"This article was originally published on ZDNet and is translated and shared by InfoQ China with the permission of the original author, George Anadiotis."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The answer to the question of how to make the best use of AI hardware may not lie solely, or even primarily, in hardware."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If we were to put a price tag on this question, it would be a multi-billion dollar market. That figure is the combined estimated value of the different markets involved. As AI applications explode, so does the specialized hardware that supports them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For us, the interest in so-called AI chips is an offshoot of our interest in AI, and we have been trying to keep up with developments in the field. For Evan Sparks, CEO and founder of Determined AI, it goes deeper. This article is based on an interview with him about the interplay between hardware and models in AI."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Interoperability layers for different hardware stacks"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before founding Determined AI, Sparks was a researcher at "},{"type":"link","attrs":{"href":"https:\/\/amplab.cs.berkeley.edu\/","title":"","type":null},"content":[{"type":"text","text":"UC Berkeley's AmpLab"}]},{"type":"text","text":". He focused on distributed systems for large-scale machine learning, and it was there that he got the opportunity to work with computer science pioneer and current vice chair of the "},{"type":"link","attrs":{"href":"https:\/\/riscv.org\/","title":"","type":null},"content":[{"type":"text","text":"RISC-V Foundation"}]},{"type":"text","text":" board, [Dave 
Patterson](https:\/\/en.wikipedia.org\/wiki\/David_Patterson_(computer_scientist)), and others."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As Sparks tells it, Patterson was early to preach that Moore's Law is dead and that custom silicon is the only hope for continued growth in the field. Sparks was influenced by this, and what he and Determined AI do is build software that helps data scientists and machine learning engineers."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The goal of that software is to help them accelerate workloads and workflows and build AI applications faster. To that end, Determined AI provides a software infrastructure layer that sits beneath frameworks such as TensorFlow or "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/opinionated-openness-facebook-ai-research-strategy-ecosystem-and-target-audience-for-deep-learning\/","title":"","type":null},"content":[{"type":"text","text":"PyTorch"}]},{"type":"text","text":" and above the various chips and accelerators."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Given that position, Sparks is less interested in dissecting vendor strategies and more interested in the perspective of people who develop and deploy machine learning models. So "},{"type":"link","attrs":{"href":"https:\/\/onnx.ai\/","title":"","type":null},"content":[{"type":"text","text":"ONNX"}]},{"type":"text","text":" is a natural starting point."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/2e\/f2\/2e4ff9e3c72f59cae5058c5428c9dcf2.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"ONNX 
is an interoperability layer that lets machine learning models trained in different frameworks be deployed on a range of AI chips."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ONNX is an interoperability layer: models trained in different frameworks can be deployed on any of a range of AI chips that support ONNX. We have already seen vendors such as "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/open-source-ai-chips-making-green-waves-bringing-energy-efficiency-to-iot-architecture\/","title":"","type":null},"content":[{"type":"text","text":"GreenWaves"}]},{"type":"text","text":" or "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/trailblaizing-end-to-end-ai-application-development-for-the-edge-blaize-releases-ai-studio\/","title":"","type":null},"content":[{"type":"text","text":"Blaize"}]},{"type":"text","text":" add ONNX support."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ONNX originated at Facebook. Sparks noted that it was developed because Facebook had entirely separate systems for training and for inference in its machine learning applications."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Internally, Facebook developed in PyTorch, while most of the deep learning models running in production were computer vision models backed by Caffe. Facebook's mandate was that research could be done in any language, but production deployment had to happen in Caffe."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That created the need for an intermediate layer to translate between the model architectures output by PyTorch and those consumed by Caffe. People soon realized that the idea had much broader applicability. In fact, it is not so different from what we have seen before in programming language compilers."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"ONNX and TVM: 
Two ways to solve similar problems"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The idea is to use an intermediate representation that sits between multiple high-level source languages and multiple target frameworks, so that each can be plugged in on either side. That sounds a lot like a compiler, and it is a good idea. But ONNX is not the be-all and end-all of AI chip interoperability."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/tvm.apache.org\/","title":"","type":null},"content":[{"type":"text","text":"TVM"}]},{"type":"text","text":" is the new kid on the block. It started as a research project at the University of Washington, "},{"type":"link","attrs":{"href":"https:\/\/thenewstack.io\/apache-tvm-portable-machine-learning-across-backends\/","title":"","type":null},"content":[{"type":"text","text":"recently became a top-level Apache open source project"}]},{"type":"text","text":", and also has a commercial effort behind it in "},{"type":"link","attrs":{"href":"https:\/\/octoml.ai\/","title":"","type":null},"content":[{"type":"text","text":"OctoML"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TVM's goal is similar to ONNX's: compile deep learning models into what its developers call minimum deployable modules, and then automatically optimize those models for different target hardware."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sparks noted that TVM is a relatively new project but has a fairly strong open source community behind it. He added that many people would like to see TVM become a standard: “Hardware vendors not named Nvidia are likely to want more openness and an easier way into the market. And they are looking for a narrow interface to implement.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are nuances in how exactly ONNX and TVM are positioned relative to each other, and here we defer to Sparks. In short, he said, TVM sits at a somewhat lower level than ONNX, and that comes with some trade-offs. He thinks TVM 
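has the potential to be more general."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, Sparks noted that both ONNX and TVM are at an early stage and will learn from each other over time. To Sparks, they are not direct competitors but two approaches to solving similar problems."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"AI constraints, cost, and energy efficiency"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether it is ONNX or TVM, though, dealing with this interoperability layer should not be the job of data scientists and machine learning engineers. Sparks advocates a separation of concerns across the stages of model development, which is very much in line with "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/2021-technology-trend-review-part-two-artificial-intelligence-knowledge-graphs-and-the-covid-19-effect\/","title":"","type":null},"content":[{"type":"text","text":"the MLOps theme"}]},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are many systems for preparing data for training, making it high-performance, putting it into compact data structures, and so on. That is a different stage of the process, with a different workflow from the experimental workflow of model training and development."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When you are developing models, as long as the data you get is in the right format, you can abstract away the upstream data systems. Likewise, as long as you develop in these high-level languages, it should not matter what training hardware you run on, whether it is a GPU, a 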
CPU, or an exotic accelerator."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/49\/eb\/4980d7794f04872497915c38dcbba3eb.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"Determined AI's stack is designed to abstract away the different underlying hardware architectures"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As Sparks put it, what matters is how the hardware satisfies the constraints of the application. Imagine a medical device company with legacy hardware out in the field. It is not going to upgrade just to run a slightly more accurate model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Instead, the question runs the other way: how to get the most accurate model that can run on that specific hardware. So such a company might start with a huge model and use techniques such as quantization and distillation to make it fit the hardware."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That concerns deployment, or inference, but the same logic applies to training. The cost of training AI models, both financial and environmental, is hard to ignore. Sparks "},{"type":"link","attrs":{"href":"https:\/\/openai.com\/blog\/ai-and-compute\/","title":"","type":null},"content":[{"type":"text","text":"referred to work by OpenAI showing that training costs rose 
300,000-fold over the past few years"}]},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That was two years ago, though. As recent "},{"type":"link","attrs":{"href":"https:\/\/www.technologyreview.com\/2020\/12\/04\/1013294\/google-ai-ethics-research-paper-forced-out-timnit-gebru\/","title":"","type":null},"content":[{"type":"text","text":"work by the former co-lead of Google's Ethical AI team shows"}]},{"type":"text","text":", the trend has not slowed at all. Training GPT-3, OpenAI's latest language model, is estimated to cost between $7 million and $12 million."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sparks pointed out the obvious: that is an insane amount of compute, energy, and money, which most mere mortals do not have. Sparks is busy building tools, and argues that we need ones that help reason about cost and assign quotas."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Infusing knowledge in models"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Determined AI's technology offers a way to specify a budget, the number of models to train to convergence, and the space of models to explore. By stopping training before convergence, users can explore models without breaking the bank. This is an approach based on active learning, but there are others, such as distillation, fine-tuning, or transfer learning:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The giants of the world, Facebook and Google, run large-scale training on massive amounts of data with billions of parameters, spending hundreds of GPU-years on a single problem. So rather than starting from scratch, you can take those models, or use them to form "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/pinecone-a-serverless-vector-database-for-machine-learning-leaves-stealth-with-10m-funding\/","title":"","type":null},"content":[{"type":"text","text":"embeddings that you use for downstream tasks"}]},{"type":"text","text":"."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sparks 
mentioned natural language processing and image recognition, with "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/google-searchs-ability-to-understand-you-just-made-its-biggest-leap-in-5-years\/","title":"","type":null},"content":[{"type":"text","text":"BERT"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/googles-distributed-computing-for-dummies-trains-restnet-50-in-half-an-hour\/","title":"","type":null},"content":[{"type":"text","text":"ResNet-50"}]},{"type":"text","text":" as good examples. Still, he offered a caveat: this does not always work. Things get tricky when the data modality people are training on is completely different from what is already available."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But maybe there is another way. Whether we call it "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/the-next-decade-in-ai-gary-marcus-four-steps-towards-robust-artificial-intelligence\/","title":"","type":null},"content":[{"type":"text","text":"robust AI"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/www.zdnet.com\/article\/hybrid-ai-through-data-space-time-and-industrial-applications-beyond-limits-scores-113m-series-c-to-scale-up\/","title":"","type":null},"content":[{"type":"text","text":"hybrid AI"}]},{"type":"text","text":", "},{"type":"link","attrs":{"href":"https:\/\/knowablemagazine.org\/article\/technology\/2020\/what-is-neurosymbolic-ai","title":"","type":null},"content":[{"type":"text","text":"neuro-symbolic AI"}]},{"type":"text","text":", or something else, is infusing knowledge into machine learning models useful? Sparks 
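answers with a definite yes:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are “commodity” use cases, such as natural language processing or vision, where there are benchmarks and standard datasets that everyone agrees on. Everybody knows what the problem is: image classification, object detection, language translation. But as you get more specialized, some of the biggest lifts we have seen come when you find a domain expert and infuse their knowledge."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sparks used physical phenomena as an example. Say you build a feed-forward neural network with 100 parameters and ask it to predict where a flying object will be one second from now. Given enough examples, the system will converge to a reasonably good approximation of the function of interest and predict with a high degree of accuracy:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But if you infuse the application with a bit more knowledge of the physical world, the amount of data required goes way down and the accuracy goes way up, and we will see something like the gravitational constant start to emerge, perhaps as a feature of the network or as some combination of features."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Neural networks are great. They are really good function approximators. But if I tell the computer a little about what that function is, I can hopefully save everybody a few million dollars in compute and get a model that represents the world more accurately. It would be irresponsible to abandon that line of thinking."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"George Anadiotis is a contributing writer with a background in technology, data, and media. Now an analyst at Gigaom, he has provided consulting services to Fortune 500 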
companies, startups, and NGOs, and has built and managed projects, products, and teams of all sizes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"size","attrs":{"size":10}},{"type":"strong"}],"text":"Original article:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"size","attrs":{"size":10}}],"text":"https:\/\/www.zdnet.com\/article\/ai-chips-in-the-real-world-interoperability-constraints-cost-energy-efficiency-and-models\/#ftag=RSSbaffb68"}]}]}