後Hadoop時代,愛奇藝如何有效整合大數據和AI平臺?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"採訪嘉賓 | 劉騁昺"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"編輯 | Tina"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"大數據是人工智能的基礎。從大數據到數據分析再到 AI 應用的轉變,這也是一個很自然的發展過程。但是隻有在數據、算法、企業的場景應用三者緊密配合的前提下,纔可以有效地提升整個 AI 業務的流程。因此,愛奇藝在原來的數據積累基礎上,進一步的完善了技術平臺,形成了大數據+AI 的統一架構,同時兼顧了數據、算法訓練、人力物力算力等多方面的因素。那麼愛奇藝在探索和實踐過程中,有哪些沉澱出的經驗可以分享給大家?InfoQ採訪了愛奇藝大數據計算團隊負責人劉騁昺,得到了一個初步的瞭解。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"劉騁昺將在2021年11月5-6日全球人工智能與機器學習技術大會(北京站)2021上進行主題爲《"},{"type":"link","attrs":{"href":"https:\/\/aicon.infoq.cn\/2021\/beijing\/presentation\/3720","title":null,"type":null},"content":[{"type":"text","text":"愛奇藝 Bigdata+AI 統一架構探索與實踐"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"》的演講,更多內容可以通過觀看演講進行了解。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"嘉賓簡介:劉騁昺"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",畢業於上海交通大學計算機系,2014 年加入愛奇藝,先後負責 Hadoop 運維和研發,計算引擎和平臺的設計和開發工作,對大數據服務的底層優化和平臺建設有豐富經驗。目前是大數據計算團隊負責人,負責Spark\/Flink計算引擎、離線工作流、實時計算、實時分析、機器學習平臺等相關工作。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:您們選擇On-Prem 還是 Cloud 來實現大數據+AI平臺?爲什麼?您們是如何做決策的?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"目前我們採用On-Prem和Cloud的混合雲部署模式,以私有云部署爲主體,在部分業務探索引入公有云服務。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"經過初步探索,我們發現公有云和私有云各有優劣,且能相互補充。公有云的優勢在於按量付費,對於探索性的業務(如不確定使用什麼硬件最合適),公有云的試錯成本較低;對於峯谷效應明顯的業務,公有云的自動擴縮容能力也能夠幫助我們降低成本。私有云在運維支持端、穩定且高負載場景的成本端的表現更好。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在我們看來,我們採用的混合雲部署模式主要有兩方面的好處。一方面,通過搭建統一的服務管理平臺,對用戶屏蔽底層使用的私有云或公有云資源,降低業務接入與切換的難度;另一方面,利用私有云部署,獲得對雲廠商的議價能力,同時保持對公有云動態的及時跟進,不斷審視和改進私有云的服務能力。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:在之前的未曾改造的愛奇藝大數據平臺上運行機器學習任務,存在哪些挑戰?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"機器學習與大數據平臺的結合,我們主要討論特徵數據處理與模型訓練兩方面。傳統的大數據平臺一般以Hadoop(HDFS+YARN)爲基礎,運行MapReduce、Hive、Spark、Flink等計算框架。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在特徵數據處理方面,我們最常用的是Hive和Spark,要把計算任務跑起來難度不大,主要的挑戰在於工程效率與大數據量下的性能表現。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"相比而言,在模型訓練方面的挑戰更大,主要體現在:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"1)框架支持:需要在Hadoop上支持分佈式地運行機器學習框架(如TensorFlow、PyTorch等);"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"2)資源調度:機器學習任務單進程的CPU、內存資源佔用經常較大,且不同進程的資源需求不同,需要考慮這些在Hadoop集羣上如何分佈才能最大化資源利用率;另外,有的模型訓練需要用到GPU,YARN從3.0版本開始加入了對GPU的支持,在後續版本逐步完善;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"3)Docker支持:機器學習任務對環境依賴較多且各不同,因此加入對Docker的支持就顯得十分必要,而老版本的Hadoop集羣對Docker的支持比較初級,所以也需要對Hadoop集羣做版本升級。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:大數據+AI平臺的“整合”,關鍵要解決的核心問題是什麼?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"核心問題是數據和計算的整合。傳統的機器學習把數據放在一臺獨立的機器上,僅利用單機的計算資源進行模型訓練,如此一來,大數據和AI成了兩個完全獨立的系統。而只有充分利用大數據平臺豐富的存儲和計算資源,才能充分發揮AI的威力。因此,整合的核心問題是把AI相關的數據接入大數據平臺,並利用大數據平臺的計算資源運行分佈式的模型訓練,"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3a3a3a","name":"user"}}],"text":"將特徵生產、樣本生產、模型訓練、模型管理打通"},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"有了數據和計算的整合,下一步是元數據的統一管理,可以幫助我們解決煙囪式開發的問題,節省開發人力和計算資源,提高數據和模型質量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:架構上是如何實現存儲計算分離的?您們是如何兼顧存儲和計算的效率?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們通過自研QBFS(愛奇藝大數據文件系統)實現存儲計算分離。QBFS是一個虛擬的文件系統,底層支持多種存儲類型(HDFS、公有云對象存儲、私有云對象存儲等),通過虛擬路徑與底層存儲的映射關係,實現計算任務在任何集羣都能訪問QBFS中的數據,即事實上的存儲計算分離。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"存儲計算分離在取得分層存儲降低存儲成本、跨集羣統一訪問、透明遷移等優勢的同時,勢必會帶來一些問題,如跨集羣訪問效率下降、網絡流量上升等。我們的應對措施有如下幾點:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"1)使用先進的壓縮算法、EC等技術,降低文件大小;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"2)通過優化文件格式,採用列式存儲,降低讀取的數據量;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"3)使用分佈式緩存技術(如Alluxio),避免對同一份數據的多次讀取(目前處於測試階段);"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"4)數據讀取與計算同時進行:以TensorFlow爲例,使用Dataset API實現數據讀取Pipeline,在計算的同時讀取下一步計算所需的數據,使得計算可以連續進行,數據讀取不成爲限制計算時長的因素。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:是否存在多租戶的問題?您們通過什麼技術手段解決這些供需關係?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們的大數據+AI平臺是支持多租戶的,租戶的粒度是一個具體的業務或者項目。需要解決的問題有:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"1)平臺接入:用戶在平臺上提交任務,平臺以超級用戶代理爲業務用戶,提交任務到集羣,這裏用到了Hadoop的proxy user的機制;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"2)計算資源隔離:利用YARN的scheduler,業務根據需求申請計算隊列,管理員通過設置隊列的min、max、weight、max applications等屬性控制各種條件下一個隊列能夠申請的資源量,實現計算資源的分配與隔離。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:構建統一的大數據+AI平臺系統最容易出現的瓶頸是什麼?您們是如何解決的?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在特徵數據處理方面,大量的計算任務會佔據大量的計算資源,拖慢整體的產出時間。平臺通過建立統一的特徵庫,實現基於配置的特徵計算,統一優化計算效率,並加強特徵複用,減少重複計算,使得產出時間得到保障。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在模型訓練方面,大規模分佈式訓練會佔用較多的CPU、內存、網絡帶寬等資源。平臺通過監控資源利用率,合理分佈不同類型的進程(如搭配內存需求大的任務和CPU需求大的任務),採用合適的機型等措施,提高資源利用率。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:針對愛奇藝的場景,研發這套平臺時,您們做了哪些定製化的工作?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"定製化的工作主要體現在特徵算子上。我們整理了特徵計算中常用的10多種計算邏輯,在啓用平臺之前,這些邏輯一般通過SQL表達,多種邏輯的組合使得SQL較長,可讀性較差。我們將這些邏輯抽象成算子,算子之間通過協同工作形成一張DAG圖,以此來代替SQL,增強邏輯的可讀性,並統一優化計算效率。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:運行這套平臺之後,它對業務最大的改善目前體現在哪裏?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"首先,大規模分佈式的模型訓練更加便利,而且性能也得到了大幅提升,業務可以採用更多的數據,更早產出模型,提升業務效果。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"其次,特徵管理、生產、監控與複用爲業務提供了更規範化的方式,避免了煙囪式開發,提升了開發、計算效率和數據質量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"InfoQ:展望未來,您們看到可能的大數據+AI平臺的發展方向\/技術趨勢是什麼?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"劉騁昺:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在未來,隨着更多的業務場景採用機器學習,模型的複雜度越來越高,大數據平臺中AI計算所佔的比例會進一步提高。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在AI技術方面,由於短視頻和信息流技術的發展,實時化的online learning會被更多業務場景採用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"最後,隨着多方合作需求增加,以及國家對隱私保護和數據安全的法規逐步落地,以聯邦學習和多方安全計算爲代表的隱私計算技術會得到更廣泛的應用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"活動推薦"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"11 月 5-6 日,AICon 全球人工智能與機器學習技術大會將落地北京國際會議中心。包括主題演講在內,本次大會共設置了 NLP 技術與應用、人工智能前沿技術、通用機器學習技術、計算機視覺實踐、推薦廣告技術與實踐、AI 工程師團隊建設與管理、認知智能的前沿探索、AI 與產業互聯網結合、大數據計算和分析、智能語音前沿技術應用、大規模預訓練模型進展、自動駕駛技術等 14 個專題。目前大會門票限時 8 折特惠中,購票歡迎聯繫票務小姐姐文柳:13269078023(電話同微信)"}]}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章