豐富 TF Serving 生態,愛奇藝開源靈活高性能的推理系統 XGBoost Serving

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲彌補目前社區在生產環境可用的支持 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"GBDT 模型、GBDT+FM 二分類模型及 GBDT+FM 多分類模型","attrs":{}},{"type":"text","text":"部署的推理系統的空白,愛奇藝設計開發了靈活、高性能的 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"XGBoost Serving","attrs":{}},{"type":"text","text":" 推理系統,並在內部多個業務落地使用。近期,愛奇藝決定將這一系統","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"開源","attrs":{}},{"type":"text","text":",本文將詳細介紹項目","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"開發背景、系統實踐、系統特性和架構及實現","attrs":{}},{"type":"text","text":"等內容。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"01 系統背景及簡介","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2014年,Facebook首先提出了","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"GBDT + LR(Logistic Regression)","attrs":{}},{"type":"text","text":"[5] 的模型結構,利用 GBDT 自動組合特徵生成新的特徵向量,該特徵向量再作爲輸入進入 LR 計算得到最後的結果,實驗表明該模型結構較單獨使用 GBDT 或者 LR ,能","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"使效果提升至少 3%。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"現代推薦系統或廣告系統通常會使用","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"高維稀疏特徵","attrs":{}},{"type":"text","text":"如","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"視頻 ID 特徵","attrs":{}},{"type":"text","text":"等來增強模型的記憶能力,由於 GBDT 不支持高維稀疏特徵,如果使用 GBDT + LR 模型結構,則不可避免地要在 LR 
模型中人工做特徵組合,同時模型的計算複雜度也會變高。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此,愛奇藝內部目前","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"使用 FM 來替換 LR 模型部分,","attrs":{}},{"type":"text","text":"該模型結構支持高維稀疏特徵及自動二階特徵組合。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"實驗表明,GBDT + FM 模型較 GBDT + LR 模型效果提升至少 4%。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"GBDT + FM 模型經過訓練及評估後,需要在推理系統上線落地才能產生實際的收益。社區現有的成熟推理系統 TensorFlow Serving [6] ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"只支持 TensorFlow 模型,且沒有明確計劃支持其它框架的模型","attrs":{}},{"type":"text","text":";其它推理系統諸如 zoltar [7]、kfserving XGBoost Server [8] 等只支持單獨的 GBDT 或者 FM 模型推理,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"不支持多模型、多版本部署,不支持模型生命週期管理","attrs":{}},{"type":"text","text":",且其非 C++ 原生實現性能較差。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因此,我們針對生產環境設計並開發了靈活、高性能的 XGBoost Serving 推理系統,支持 GBDT + FM 模型的在線推理。XGBoost Serving ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"是在 TensorFlow Serving 的基礎上開發的","attrs":{}},{"type":"text","text":",增加了 XGBoost Servable、alphaFM Servable 及 alphaFM_softmax Servable,分別用於支持純 GBDT 模型在線推理、GBDT + FM 二分類模型在線推理及 GBDT + FM 
多分類模型在線推理。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"XGBoost Serving 現已在GitHub 開源,地址爲:","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/iqiyi/xgboost-serving","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/iqiyi/xgboost-serving","attrs":{}}],"marks":[{"type":"strong"}]},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"size","attrs":{"size":10}}],"text":"注:GBDT(Gradient Boosted Decision Tree) [1] 是常用的機器學習算法之一,其具有較高的準確率和出色的特徵組合能力。XGBoost(eXtreme Gradient Boosting) [2] 是支持 GBDT 算法的一個高性能、可擴展、分佈式實現,在 Kaggle 等數據科學競賽及企業生產環境具有廣泛應用。FM(Factorization Machines) [4] 是一個支持特徵交叉的機器學習算法,對稠密特徵和稀疏特徵均具有良好的適用性,且其推理計算複雜度呈線性,在計算廣告和推薦系統的 CTR(Click Through Rate) 點擊率預估環節具有廣泛應用。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"02 推薦中臺落地 XGBoost Serving實踐","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"XGBoost Serving具體是如何落地的?效果又如何呢?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本節先以愛奇藝推薦中臺落地 GBDT + FM 二分類模型爲例,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"介紹如何使用 XGBoost Serving 
推理系統。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在沒有 XGBoost Serving 推理系統的年代,愛奇藝推薦中臺在推薦引擎中支持 GBDT + FM 二分類模型部署。推薦引擎是整個推薦流程中非常複雜的一個模塊,由於算法規則、深度模型、淺層模型盤根錯節,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"很難保證計算過程不會相互干擾","attrs":{}},{"type":"text","text":";多模型、多版本的生命週期管理需要","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"人工介入","attrs":{}},{"type":"text","text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"有了 XGBoost Serving 推理系統後,推薦中臺可以將模型部署及模型生命週期管理部分託管至 XGBoost Serving,自身聚焦於引擎業務邏輯調用。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如此一來,引擎不再需要關注模型計算部分,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"且人工管理模型生命週期的日子也一去不復返了","attrs":{}},{"type":"text","text":"。推薦中臺使用 XGBoost Serving 的流程如圖1:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d2/d2fb4684a6d6090bbebecdfbd5125bdd.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖1推薦中臺使用 XGBoost Serving流程","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"推薦中臺訓練好 GBDT + FM 二分類模型後,指定數字版本號,並將該版本的模型放入配置的路徑下,XGBoost Serving 
即會根據","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"配置的 Servables 類型和 Version Policy","attrs":{}},{"type":"text","text":" 確定是否執行加載模型的操作。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果模型加載成功,則 XGBoost Serving 會分別啓動 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"GRPC 服務、HTTP 服務和 Metrics 服務。","attrs":{}},{"type":"text","text":"更新模型時,只需使用新的版本號命名模型並將該版本的模型放入配置的路徑下即可,XGBoost Serving 會","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"自動管理配置路徑下的模型的生命週期。","attrs":{}},{"type":"text","text":"部署及更新操作可通過流水線任務流進一步自動化。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"模型部署成功後,推薦中臺在引擎側使用 XGBoost Serving 提供的 GRPC 服務訪問對應的模型,使用 HTTP 服務查看各模型的狀態,使用 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Metrics 服務監控模型的計算延時分佈。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"同等負載壓力下,使用 XGBoost Serving 後,P99 長尾計算延時較引擎側部署至少降低 50%。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"瞭解了具體的落地效果後,我們再來詳細瞭解下 XGBoost Serving的系統特性和架構","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"實現細節。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"03 
系統架構及實現","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"XGBoost Serving 是在 TensorFlow Serving 的基礎上開發的,爲了瞭解 XGBoost Serving 的架構及實現,首先需要了解 TensorFlow Serving 中的核心概念及架構。TensorFlow Serving 主要有以下","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"核心概念:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(1)Servables","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Servables 是 TensorFlow Serving 中最核心的抽象,表示用來執行具體計算的對象,如 TensorFlow SavedModel。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(2) Servable Versions","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1 個 Servable 可以擁有多個 Versions。TensorFlow Serving 支持同時加載多個 Servable Versions。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(3) Loaders","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Loaders 負責管理 Servables 的生命週期,提供標準化的獨立於具體算法或者數據的加載和卸載 Servables 的 
APIs。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(4) Sources","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sources 負責發現 Servables。Sources 支持本地文件系統等任意類型的存儲系統。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(5) Aspired Versions","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Aspired Versions 表示一組需要加載的 Servable Versions。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"(6) Managers","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Managers 負責管理 Servables 的完整生命週期,包括加載 Servables、服務 Servables 及卸載 Servables。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TensorFlow Servable 
的生命週期如圖2:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/bd/bdb784abc871cc19cd9eb004348c481b.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖 2 TensorFlow Servable 生命週期","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source 監聽文件系統發現新的 Servable Version 後,創建 1 個 Loader,該 Loader 包括 1 個指向文件系統中模型數據的指針。Source 通過回調函數將 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Aspired Versions","attrs":{}},{"type":"text","text":" 傳遞給 Manager。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Manager 根據配置的 Version Policy 執行加載新 Version 或者卸載舊 Version 的操作。Client 通過 Manager 來訪問 Servable,可以訪問指定版本的 Servable 或者缺省訪問最新版本的 Servable。瞭解了 TensorFlow Serving 的","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"核心概念及架構","attrs":{}},{"type":"text","text":"後,我們發現開發 XGBoost Serving ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"最核心的部分是開發新的 Servables 支持 GBDT + FM 二分類模型計算、GBDT + FM 多分類模型計算及純 GBDT 模型計算,然後將新的 Servables 集成到 Model Server 中,從而對外提供服務。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"XGBoost Serving 
的架構如圖3:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c1/c1cfd3aa3cde93f27ff5f5d5b1b18870.jpeg","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖 3 XGBoost Serving 架構","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"XGBoost Serving 的架構自底向上共分爲 5 層:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) Source","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) Servables","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3) AspiredVersionsManager","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(4) Predictors","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(5) Servers","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source 層通過定期 poll 文件系統中指定的路徑來發現 Servables,支持配置 poll 文件系統的路徑、poll 文件系統的時間間隔及 Servables 
的加載策略等。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Servables 層共支持 3 種 Servables:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) XGBoost Servable","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) alphaFM Servable","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3) alphaFM_softmax Servable","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"XGBoost Servable 支持純 GBDT 模型的在線推理,其支持的模型爲","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"XGBoost導出的二進制 GBDT 模型。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"alphaFM Servable","attrs":{}},{"type":"text","text":" 支持 GBDT + FM 二分類模型的在線推理,其合法的模型包括 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"GBDT模型、FM 二分類模型及 FeatureMapping 模型","attrs":{}},{"type":"text","text":"3部分,GBDT 模型爲 XGBoost 導出的二進制 GBDT 模型,FM 二分類模型爲 FM 二分類算法訓練導出的模型,FeatureMapping 模型爲 GBDT 模型葉子節點轉換爲特徵映射的模型文件。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"alphaFM_softmax Servable","attrs":{}},{"type":"text","text":" 支持 GBDT + FM 
多分類模型的在線推理,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"其合法的模型包括 GBDT 模型、FM 多分類模型及 FeatureMapping 模型 3 部分,GBDT 模型和 FeatureMapping 模型同 alphaFM Servable 支持的相應模型","attrs":{}},{"type":"text","text":",FM 多分類模型爲 FM 多分類算法訓練導出的模型。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"AspiredVersionsManager","attrs":{}},{"type":"text","text":" 層通過 aspired-versions 的回調函數來確定加載及卸載哪些 Servables。AspiredVersionsManager 使用 AvailabilityPreservingPolicy 控制策略。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Predictors 層共支持 3 種 Predictors:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1)XGBoost Predictor","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2)alphaFM Predictor","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3)alphaFM_softmax Predictor","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"XGBoost Predictor 使用 XGBoost Servable 計算,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"支持計算葉子節點的索引和計算值。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"alphaFM Predictor 使用 alphaFM Servable 計算,GBDT 
輸入特徵經過 GBDT 模型計算得到葉子節點的索引,葉子節點的索引經過 FeatureMapping 轉換爲","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"特徵 ID,GBDT 輸入特徵、FeatureMapping 轉換的特徵以及 FM 輸入特徵","attrs":{}},{"type":"text","text":"最終進入 FM 二分類模型進行計算並返回結果。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"alphaFM_softmax Predictor 使用 alphaFM_softmax Servable 計算,計算過程和 alphaFM Predictor 相似,不同之處爲 GBDT 輸入特徵、FeatureMapping 轉換的特徵以及 FM 輸入特徵最終進入 FM 多分類模型進行計算並返回結果。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Servers 層共支持 3 種類型的 Servers:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1)GRPC Server","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) HTTP Server","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(3)Metrics Server","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"GRPC Server 對外提供 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"GRPC 在線推理服務","attrs":{}},{"type":"text","text":",對不同類型的 Servables 提供不同的 APIs。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HTTP Server 對外提供 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"HTTP 
模型管理服務","attrs":{}},{"type":"text","text":",包括查看模型各版本的狀態等。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metrics Server 對外提供 ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Metrics 監控服務","attrs":{}},{"type":"text","text":",包括 QPS、計算延時分佈等監控數據。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"04 總結","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本文介紹了 XGBoost Serving 推理系統,包括","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"開發背景、系統實踐、系統特性和架構及實現","attrs":{}},{"type":"text","text":"等部分,該系統填補了社區沒有生產環境可用的支持 GBDT 模型、GBDT + FM 二分類模型及 GBDT + FM 多分類模型部署的推理系統的空白,已經在愛奇藝內部多個業務落地使用,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"現在 GitHub 上開源","attrs":{}},{"type":"text","text":",詳見:","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/iqiyi/xgboost-serving","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/iqiyi/xgboost-serving","attrs":{}}]},{"type":"text","text":",歡迎使用、反饋 Issues 及提交 Pull-Requests。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"05 參考文獻","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] J. Friedman. Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29(5):1189–1232, 2001.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[2] Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. 
In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM; 2016. p. 785–794.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[3] ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/dmlc/xgboost","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/dmlc/xgboost","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[4] S. Rendle, Factorization machines. In Proceedings of IEEE International Conference on Data Mining (ICDM), pp. 995–1000, 2010.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[5] Xinran He et al, Practical Lessons from Predicting Clicks on Ads at Facebook. In ADKDD’14, August 24 - 27 2014, New York, NY, USA.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[6] ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/tensorflow/serving","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/tensorflow/serving","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[7] ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//spotify.github.io/zoltar/","title":null,"type":null},"content":[{"type":"text","text":"https://spotify.github.io/zoltar/","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[8] 
","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/kubeflow/kfserving","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/kubeflow/kfserving","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[9] ","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/CastellanZhang/alphaFM","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/CastellanZhang/alphaFM","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[10]","attrs":{}},{"type":"link","attrs":{"href":"https://link.zhihu.com/?target=https%3A//github.com/CastellanZhang/alphaFM_softmax","title":null,"type":null},"content":[{"type":"text","text":"https://github.com/CastellanZha","attrs":{}}]}]}]}