Providing Large-Scale GPU Compute for AI Applications with K8s in Practice | QCon

QCon Global Software Development Conference

2020-10-02 00:03

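Huawei Cloud's CCI service manages hundreds of GPU cards and provides an AI computing platform for Huawei Cloud EI services and external customers. Along the way, the team has accumulated substantial experience in optimizing for AI workloads. The keys to accelerating AI computation are GPU management, K8s resource scheduling optimization, and Job/Task scheduling tailored to AI computing frameworks and models. With these optimizations, the linear speedup ratio across 128 GPU cards reaches 0.8 or higher. This talk describes how to build an AI computing platform with the open-source projects K8s and Kata Containers so as to maximize the utilization of GPU and AI chip compute, and presents test results. It closes with an outlook on future technical improvements.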
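To make the "linear speedup ratio of 0.8 or higher on 128 GPU cards" figure concrete, here is a minimal sketch. It assumes the ratio means measured speedup divided by the number of GPUs, which the abstract does not define. The function name `scaling_efficiency` and the throughput numbers are hypothetical and are not results from the talk.

```python
def scaling_efficiency(throughput_n: float, throughput_1: float, n_gpus: int) -> float:
    """Return measured speedup divided by ideal (linear) speedup.

    throughput_n: training throughput on n_gpus GPUs (e.g. samples/s)
    throughput_1: training throughput on a single GPU, same units
    n_gpus:       number of GPUs used for throughput_n
    """
    if n_gpus <= 0 or throughput_1 <= 0:
        raise ValueError("n_gpus and throughput_1 must be positive")
    speedup = throughput_n / throughput_1  # how many times faster than 1 GPU
    return speedup / n_gpus                # 1.0 would be perfectly linear


# Hypothetical numbers, for illustration only (not measurements from the talk):
# one GPU processes 100 samples/s, 128 GPUs together process 10240 samples/s.
efficiency = scaling_efficiency(10240.0, 100.0, 128)
print(f"scaling efficiency: {efficiency:.2f}")
```

Under this reading, a ratio of 0.8 on 128 cards means the cluster delivers roughly the throughput of 102 ideally scaled GPUs. The remainder is lost to communication, scheduling, and other overheads.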

What attendees will gain

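Understand the current state of K8s-based AI frameworks;
Understand how large-scale GPU resources are used in distributed AI training;
Understand approaches to optimizing K8s for AI scenarios.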
