OpenAI Releases Triton, an Open-Source Python-Like GPU Programming Language for Neural Networks

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"近期,OpenAI發佈了他們的最新語言Triton。這種開源編程語言讓研究人員能夠爲AI負載編寫高效的GPU代碼。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/01\/01\/0129c2d4a69ab79476bb7d175016b401.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"它與Python兼容,並且用戶只需編寫最少25行代碼,就能實現專家級的效果。OpenAI聲稱這款語言讓開發人員無需太多努力即可挖掘硬件的最大潛能,從而比以往更輕鬆地創建更復雜的工作流程。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/fb\/57\/fb37dda60ac7f38f44d885edd47c5057.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"link","attrs":{"href":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf","title":"","type":null},"content":[{"type":"text","text":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"深度學習領域的研究人員通常依賴於原生框架操作符。然而這可能會帶來一些問題,因爲它需要許多臨時張量才能工作,這可能會影響大規模神經網絡的性能發揮。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"編寫專門的GPU內核是一種更便利的解決方案,但由於對GPU編程的複雜性,這種方案實踐起來會有意想不到的困難。找到一種既能提供所需的靈活性和速度,又能讓開發人員輕鬆理解的系統是一項挑戰。這促使OpenAI的研究人員改進了Triton,Triton最初是由他們的一位隊友創建的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"現代GPU的架構可以分解爲三大組件——DRAM、SRAM和ALU。在優化CUDA代碼時必須考慮每一個組件;開發人員不能忽視GPU編程帶來的諸多挑戰,包括:來自DRAM的內存傳輸應該充分合並,以利用當今內存接口上更大的總線寬度。數據在再次使用之前需要手動存儲在SRAM中,以免在檢索時與其他共享內存塊發生衝突。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/c6\/a8\/c6f6762d41698de36c366ec3999eb3a8.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/openai.com\/blog\/triton\/","title":"","type":null},"content":[{"type":"text","text":"https:\/\/openai.com\/blog\/triton\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"t
ext","text":"Triton簡化了專用內核的開發過程,這些內核比通用庫中的內核要快得多。編譯器會自動對其進行優化和並行化,將其轉換爲在最新的Nvidia GPU上執行的代碼。Triton起源於2019年提交給機器學習和編程語言國際研討會的一篇"},{"type":"link","attrs":{"href":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf","title":"","type":null},"content":[{"type":"text","text":"論文"}]},{"type":"text","text":",其創建者現在是OpenAI團隊的一員。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"論文:"},{"type":"link","attrs":{"href":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf","title":"","type":null},"content":[{"type":"text","text":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Github:"},{"type":"link","attrs":{"href":"https:\/\/github.com\/openai\/triton","title":"","type":null},"content":[{"type":"text","text":"https:\/\/github.com\/openai\/triton"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"來源:"},{"type":"link","attrs":{"href":"https:\/\/openai.com\/blog\/triton\/","title":"","type":null},"content":[{"type":"text","text":"https:\/\/openai.com\/blog\/triton\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"原文鏈接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.marktechpost.com\/2021\/07\/28\/openai-releases-triton-an-open-source-python-like-gpu-programming-language-for-neural-networks","title":"","type":null},"content":[{"type":"text","text":"https:\/\/www.marktechpost.com\/2021\/07\/28\/openai-releases-triton-an-open-source-python-like-gpu-programming-language-for-neural-networks"}]}]}]}