OpenAI Releases Triton, an Open-Source Python-like GPU Programming Language for Neural Networks

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"近期,OpenAI发布了他们的最新语言Triton。这种开源编程语言让研究人员能够为AI负载编写高效的GPU代码。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/01\/01\/0129c2d4a69ab79476bb7d175016b401.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"它与Python兼容,并且用户只需编写最少25行代码,就能实现专家级的效果。OpenAI声称这款语言让开发人员无需太多努力即可挖掘硬件的最大潜能,从而比以往更轻松地创建更复杂的工作流程。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/fb\/57\/fb37dda60ac7f38f44d885edd47c5057.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"link","attrs":{"href":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf","title":"","type":null},"content":[{"type":"text","text":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"深度学习领域的研究人员通常依赖于原生框架操作符。然而这可能会带来一些问题,因为它需要许多临时张量才能工作,这可能会影响大规模神经网络的性能发挥。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"编写专门的GPU内核是一种更便利的解决方案,但由于对GPU编程的复杂性,这种方案实践起来会有意想不到的困难。找到一种既能提供所需的灵活性和速度,又能让开发人员轻松理解的系统是一项挑战。这促使OpenAI的研究人员改进了Triton,Triton最初是由他们的一位队友创建的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"现代GPU的架构可以分解为三大组件——DRAM、SRAM和ALU。在优化CUDA代码时必须考虑每一个组件;开发人员不能忽视GPU编程带来的诸多挑战,包括:来自DRAM的内存传输应该充分合并,以利用当今内存接口上更大的总线宽度。数据在再次使用之前需要手动存储在SRAM中,以免在检索时与其他共享内存块发生冲突。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/resource\/image\/c6\/a8\/c6f6762d41698de36c366ec3999eb3a8.jpg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/openai.com\/blog\/triton\/","title":"","type":null},"content":[{"type":"text","text":"https:\/\/openai.com\/blog\/triton\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"t
ext","text":"Triton简化了专用内核的开发过程,这些内核比通用库中的内核要快得多。编译器会自动对其进行优化和并行化,将其转换为在最新的Nvidia GPU上执行的代码。Triton起源于2019年提交给机器学习和编程语言国际研讨会的一篇"},{"type":"link","attrs":{"href":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf","title":"","type":null},"content":[{"type":"text","text":"论文"}]},{"type":"text","text":",其创建者现在是OpenAI团队的一员。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"论文:"},{"type":"link","attrs":{"href":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf","title":"","type":null},"content":[{"type":"text","text":"http:\/\/www.eecs.harvard.edu\/~htk\/publication\/2019-mapl-tillet-kung-cox.pdf"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Github:"},{"type":"link","attrs":{"href":"https:\/\/github.com\/openai\/triton","title":"","type":null},"content":[{"type":"text","text":"https:\/\/github.com\/openai\/triton"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"来源:"},{"type":"link","attrs":{"href":"https:\/\/openai.com\/blog\/triton\/","title":"","type":null},"content":[{"type":"text","text":"https:\/\/openai.com\/blog\/triton\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"原文链接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.marktechpost.com\/2021\/07\/28\/openai-releases-triton-an-open-source-python-like-gpu-programming-language-for-neural-networks","title":"","type":null},"content":[{"type":"text","text":"https:\/\/www.marktechpost.com\/2021\/07\/28\/openai-releases-triton-an-open-source-python-like-gpu-programming-language-for-neural-networks"}]}]}]}