New Research from Rice University and Intel: CPUs Can Train Deep Neural Networks Faster Than GPUs

Computer scientists at Rice University have demonstrated artificial intelligence (AI) software that runs on commodity processors and trains deep neural networks 15 times faster than platforms based on graphics processors.

"The cost of training is the real bottleneck in AI," said Anshumali Shrivastava, an assistant professor of computer science at Rice's Brown School of Engineering. "Companies are spending millions of dollars a week just to train and fine-tune their AI workloads."

Shrivastava and collaborators from Rice and Intel presented research addressing that bottleneck on April 8 at the machine learning systems conference MLSys (https://mlsys.org/).

Deep neural networks (DNNs) are a powerful form of AI that can outperform humans at some tasks. DNN training is typically a series of matrix multiplication operations, an ideal workload for graphics processing units (GPUs), which cost about three times as much as general-purpose central processing units (CPUs).

"The whole industry is fixated on one kind of improvement: faster matrix multiplications," Shrivastava said. "Everyone is looking at specialized hardware and architectures to push matrix multiplication. People are now even talking about having specialized hardware-software stacks for specific kinds of deep learning. Instead of taking an expensive algorithm and throwing the whole world of system optimization at it, I'm saying, 'Let's revisit the algorithm.'"

Shrivastava's lab did exactly that in 2019, recasting DNN training as a search problem that could be solved with hash tables. Their "sub-linear deep learning engine" (SLIDE) is designed specifically to run on commodity CPUs. When Shrivastava and collaborators from Intel unveiled it at MLSys 2020 (https://techxplore.com/news/2020-03-deep-rethink-major-obstacle-ai.html), they showed that it could outperform GPU-based training.

The study they recently presented at MLSys 2021 (https://proceedings.mlsys.org/paper/2021/file/3636638817772e42b59d74cff571fbb3-Paper.pdf) explored whether vectorization and memory optimization accelerators in modern CPUs could improve SLIDE's performance.

"Hash table-based acceleration already outperforms GPU, but CPUs are also evolving," said study co-author Shabnam Daghaghi, a Rice graduate student. "We leveraged those innovations to take SLIDE even further, showing that if you aren't fixated on matrix multiplications, you can leverage the power in modern CPUs and train AI models four to 15 times faster than the best specialized hardware alternative."

Study co-author Nicholas Meisburger, a Rice undergraduate, said, "CPUs are still the most prevalent hardware in computing. The benefits of making them more attractive for AI workloads should not be underestimated."

Original article:

https://techxplore.com/news/2021-04-rice-intel-optimize-ai-commodity.html