How to Run FATE Federated Learning Jobs on the Spark Compute Engine

> The previous article, [Building Federated Learning Jobs in a Jupyter Notebook](http://mp.weixin.qq.com/s?__biz=MzAwNzUyNzI5Mw==&mid=2730792443&idx=1&sn=29c88788690823851e77de241e26b164&chksm=bc4cfbe98b3b72ffe61a4ca97d4c61fbf41a69ac464a7c6122076d485a4957bd1dbf7b00442b&scene=21#wechat_redirect), mentioned that the FATE 1.5 LTS release supports using Spark as the underlying compute engine. This article briefly introduces the implementation details and usage, to help users tune performance and troubleshoot errors in practice.

## Why Use a Distributed Compute Engine

One of the more important components in FATE is "FATE Flow", which manages and schedules user jobs. As the official documentation describes, "FATE Flow" has two working modes: standalone and cluster. In standalone mode, both data storage and computation run locally inside "FATE Flow" and therefore cannot scale out, so this mode is mainly intended for learning and testing. In cluster mode, storage and computation are no longer local but are dispatched to a distributed cluster, whose size can be scaled according to actual business needs to accommodate datasets of different scales.

By default, FATE supports "eggroll" as its underlying compute and storage cluster; after continuous iteration and optimization, it can now satisfy the needs of most federated learning scenarios. eggroll itself is a relatively independent cluster that exposes a unified entry point and a set of APIs. External applications submit jobs to the eggroll cluster via RPC calls, and since eggroll scales horizontally, users can adjust the cluster size to fit their workload.

In v1.5, FATE thoroughly refactored its Apache Spark compute-engine integration and now officially supports it. Spark is an in-memory compute engine widely recognized in the industry; thanks to its simplicity, efficiency, and mature cluster-management tooling, it is deployed at scale in many production environments, which is one of the main reasons FATE supports it. For technical reasons, FATE's Spark support is not yet complete and cannot yet handle large-scale training jobs (tens of millions of samples), but optimization work is in progress.

KubeFATE 1.5 supports deploying FATE on Spark: it can launch a Spark cluster in containers to provide compute services for FATE. For details, see the section "Using Spark as the Compute Engine" below.

FATE Flow is the service that interfaces with the concrete compute engine, so we will briefly analyze its structure to understand how it interacts with different engines.

**A Brief Look at FATE Flow's Structure**

In FATE 1.5, the "FATE Flow" service underwent a fairly major refactor: it abstracts storage, computation, federation transfer, and other operations into interfaces for upper-layer applications to use. Each interface implementation can call different libraries to access different runtimes, which makes it straightforward to extend support to other compute services (such as Spark) or storage services (HDFS, MySQL). In general, running a federated learning job with FATE consists of the following steps (assuming the network between parties is already set up):

1. Upload the training dataset through the interface provided by FATE Flow
2. Define the training pipeline and submit the job through the FATE Flow interface
3. Iteratively adjust the training parameters based on the results to obtain the final model
4. Upload the prediction dataset through FATE Flow
5. Define the prediction job and run it through FATE Flow

Of these steps, only job scheduling strictly requires FATE Flow; storage, computation, and the remaining work can be delegated to other services.

As shown in the figure below, the "FATE Flow" box lists some of the interfaces, where:

- The "Storage Interface" manages datasets, e.g. uploading local data and deleting uploaded data.
- The "Storage Meta Interface" manages dataset metadata.
- The "Computing Interface" executes compute tasks.
- The "Federation Interface" transfers data between the training participants.

![FATE Flow interfaces](https://static001.geekbang.org/infoq/6b/6b5e93d6b50e59ac14884363150d7d15.png)

The green boxes are concrete implementations of the interfaces, the dark gray boxes are clients that interact with remote services, and the blue boxes are runtimes independent of the FATE Flow service. For example, the class that implements the computing interface is "Table", and Table comes in two variants: one uses "rollpair" to interact with an "eggroll" cluster, while the other uses the "rdd" from "pyspark" to interact with a "spark" cluster.

**Differences Between Compute Engines**

As noted in the previous section, FATE Flow can use different compute and storage services through its abstract interfaces, but dependencies and implementation mechanisms constrain which services can be combined. Broadly, there are two options:

1. Use eggroll as the compute engine
2. Use Spark as the compute engine

When eggroll is the compute engine, the overall FATE architecture is as follows:

![FATE on eggroll architecture](https://static001.geekbang.org/infoq/04/0426d4c16b91471d4b993c9777af68b7.png)

An eggroll cluster has three types of nodes: Cluster Manager, Node Manager, and Rollsite. The Cluster Manager provides the service entry point and allocates resources, the Node Manager nodes actually perform computation and storage, and Rollsite provides the transfer service.

When Spark is the compute engine, the overall FATE architecture is as follows:

![FATE on Spark architecture](https://static001.geekbang.org/infoq/2a/2ac94a0dfb439df3e5ea080f08ec812c.png)

Because Spark itself is an in-memory compute framework that generally relies on other services to persist its output, using Spark as FATE's compute engine also requires HDFS for data persistence. Federation transfer is split into two parts: synchronizing instructions (the pipeline) and synchronizing messages during training, handled by the nginx and rabbitmq services respectively.

**Using Spark as the Compute Engine**

**Prerequisites**

As mentioned above, using Spark as the compute engine also depends on services such as Nginx, RabbitMQ, and HDFS. There are three ways to install, deploy, and configure the complete stack:

1. For bare-metal cluster deployment, refer to the FATE On Spark deployment guide.
2. For "docker-compose" based deployment, refer to [Deploying FATE with Docker Compose](http://mp.weixin.qq.com/s?__biz=MzAwNzUyNzI5Mw==&mid=2730792409&idx=1&sn=56c11698966cbfe10ded2586d4133417&chksm=bc4cfbcb8b3b72ddc8d90d949a6f4906ccc25a679236aeb6a310ed1e703425b14ff5beb1a645&scene=21#wechat_redirect): simply set `computing_backend` to `spark` in the configuration file, and the deployment will bring up HDFS, Spark, and the other services as containers.
3. For Kubernetes-based deployment, refer to the [Kubernetes deployment guide](http://mp.weixin.qq.com/s?__biz=MzAwNzUyNzI5Mw==&mid=2730792470&idx=1&sn=917572f59a13638dad7711db9362cf75&chksm=bc4cf8048b3b7112943fe00f1934608584e1905603fb1900a9731fed04b4948097ffed9f2e3a&scene=21#wechat_redirect): create the cluster with the cluster-spark.yaml file to get a FATE On Spark cluster on K8s. The number of Spark nodes can also be defined in that file.

**Note**: the currently verified versions of Spark, HDFS, and RabbitMQ are 2.4, 2.7, and 3.8 respectively.

Users who want to reuse an existing Spark cluster must, in addition to deploying the other dependent services, resolve FATE's Python package dependencies. Concretely, perform the following steps on every Spark node that will run federated learning workloads:

1. Create a directory for the files

```
$ mkdir -p /data/projects
$ cd /data/projects
```

2. Download and install "miniconda"

```
$ wget https://repo.anaconda.com/miniconda/Miniconda3-4.5.4-Linux-x86_64.sh
$ sh Miniconda3-4.5.4-Linux-x86_64.sh -b -p /data/projects/miniconda3
$ miniconda3/bin/pip install virtualenv
$ miniconda3/bin/virtualenv -p /data/projects/miniconda3/bin/python3.6 --no-wheel --no-setuptools --no-download /data/projects/python/venv
```

3. Download the FATE source code

```
$ git clone -b v1.5.0 https://github.com/FederatedAI/FATE.git
# Add the Python dependency path
$ echo "export PYTHONPATH=/data/projects/fate/python" >> /data/projects/python/venv/bin/activate
```

4. Enter the Python virtual environment

```
$ source /data/projects/python/venv/bin/activate
```

5. Trim and install the Python libraries

```
# Remove the tensorflow and pytorch dependencies
$ sed -i -e '23,25d' ./requirements.txt
$ pip install setuptools-42.0.2-py2.py3-none-any.whl
$ pip install -r /data/projects/python/requirements.txt
```

With the dependencies installed, jobs later submitted through FATE must additionally set `spark.pyspark.python` to point at this Python environment.

**Example**

Once the Spark cluster is ready, FATE can use it: set `backend` to `1` in the job definition, and "FATE Flow" will submit the job to the Spark cluster through the "spark-submit" tool during subsequent scheduling.

Note that although the Spark master can be specified in the job configuration, the addresses of HDFS, RabbitMQ, Nginx, and the other services must be specified in configuration files before FATE Flow starts. If those service addresses change, update the configuration files and restart the FATE Flow service.

Once the FATE Flow service is up and running, the environment can be validated with the "toy_example". The steps are as follows:

1. Modify the toy_example_config file as follows:

```json
{
  "initiator": {
    "role": "guest",
    "party_id": 9999
  },
  "job_parameters": {
    "work_mode": 0,
    "backend": 1,
    "spark_run": {
      "executor-memory": "4G",
      "total-executor-cores": 4
    }
  },
  "role": {
    "guest": [9999],
    "host": [9999]
  },
  "role_parameters": {
    "guest": {
      "secure_add_example_0": {
        "seed": [123]
      }
    },
    "host": {
      "secure_add_example_0": {
        "seed": [321]
      }
    }
  },
  "algorithm_parameters": {
    "secure_add_example_0": {
      "partition": 4,
      "data_num": 1000
    }
  }
}
```

Here `spark_run` defines the arguments handed to "spark-submit", so it can also carry `master` to specify the master address, or `conf` to set, for example, the path of `spark.pyspark.python`. A simple example:

```json
...
  "job_parameters": {
    "work_mode": 0,
    "backend": 1,
    "spark_run": {
      "master": "spark://127.0.0.1:7077",
      "conf": "spark.pyspark.python=/data/projects/python/venv/bin/python"
    }
  },
...
```

If the `spark_run` field is not set, the configuration in `${SPARK_HOME}/conf/spark-defaults.conf` is used by default. For more Spark parameters, refer to Spark Configuration.

2. Submit the job and check its status. Run the following command to submit the "toy_example" job:

```
$ python run_toy_example.py -b 1 9999 9999 1
```

Check fate_board:

![fate_board](https://static001.geekbang.org/infoq/00/007076028cedeaf79c0e218b1305210e.png)

Check the Spark dashboard:

![Spark dashboard](https://static001.geekbang.org/infoq/6c/6c2b0856fa872a687b5d7e1e2809aae8.png)

The output above shows that FATE successfully ran the "toy_example" test on the Spark cluster.

## Summary

This article explained why distributed systems matter to FATE, compared the two compute engines FATE supports, "eggroll" and "spark", and described in detail how to run jobs in FATE with Spark. Due to space constraints, the section on using Spark with FATE gave only a brief introduction; topics such as node resource allocation and parameter tuning are left for users to explore.

Compared with EggRoll, FATE's support for Spark as a compute engine is still being refined; after a few more release iterations, its user experience, stability, and efficiency should reach a higher standard.

This article is reprinted with permission from the WeChat account 【亨利筆記】 (Henry's Notes). The author is an engineer at VMware China R&D's Cloud Native Lab and a maintainer of the federated learning open-source projects KubeFATE / FATE-Operator. Original link: [《用Spark計算引擎執行FATE聯邦學習任務》](https://mp.weixin.qq.com/s/HJk9KtM2BHvU32lyyWbjfQ).

**Related Articles**

- [Building Federated Learning Jobs in a Jupyter Notebook](http://mp.weixin.qq.com/s?__biz=MzAwNzUyNzI5Mw==&mid=2730792455&idx=2&sn=e9e8965670aa375abc792fd09b5e457d&chksm=bc4cf8158b3b710378314d8b0cc57306ccd0afb3b0b94e29a5cb2cb4fa282c4bc1606cb124c6&scene=21#wechat_redirect)
- [KubeFATE: How the Cloud Native Federated Learning Platform Works](http://mp.weixin.qq.com/s?__biz=MzAwNzUyNzI5Mw==&mid=2730792470&idx=1&sn=917572f59a13638dad7711db9362cf75&chksm=bc4cf8048b3b7112943fe00f1934608584e1905603fb1900a9731fed04b4948097ffed9f2e3a&scene=21#wechat_redirect)
- [Deploying Federated Learning FATE v1.5 on K8s with KubeFATE](http://mp.weixin.qq.com/s?__biz=MzAwNzUyNzI5Mw==&mid=2730792448&idx=1&sn=87d8e479dac4b0f3f6521e0da389f594&chksm=bc4cf8128b3b71049b041b4f49156c5faa080897504fb346dd1d92729f0118260c9efd537713&scene=21#wechat_redirect)
- [Deploying FATE v1.5 with Docker Compose](http://mp.weixin.qq.com/s?__biz=MzAwNzUyNzI5Mw==&mid=2730792409&idx=1&sn=56c11698966cbfe10ded2586d4133417&chksm=bc4cfbcb8b3b72ddc8d90d949a6f4906ccc25a679236aeb6a310ed1e703425b14ff5beb1a645&scene=21#wechat_redirect)

**Related Topics**

- [《如何基於FATE架構從0到1部署聯邦學習?》 (How to Deploy Federated Learning from 0 to 1 on the FATE Architecture)](https://www.infoq.cn/theme/62)
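To make the "Table" abstraction described above more concrete, here is a minimal Python sketch. All names in it (`Table`, `LocalTable`) are hypothetical and do not match FATE Flow's real classes; the point is only to illustrate the design idea: job code is written against one interface, while the backend behind it could just as well be an eggroll "rollpair" or a pyspark "rdd".

```python
from abc import ABC, abstractmethod
from functools import reduce as _reduce

# Hypothetical sketch of a computing interface; not FATE's actual API.
class Table(ABC):
    @abstractmethod
    def map_values(self, fn):
        """Apply fn to every value, returning a new Table."""

    @abstractmethod
    def reduce(self, fn):
        """Fold all values into a single result."""

class LocalTable(Table):
    """Stand-in backend: a plain dict instead of a rollpair or an RDD."""
    def __init__(self, data):
        self._data = dict(data)

    def map_values(self, fn):
        return LocalTable({k: fn(v) for k, v in self._data.items()})

    def reduce(self, fn):
        return _reduce(fn, self._data.values())

# Job code never mentions which backend is in use:
t = LocalTable({"a": 1, "b": 2, "c": 3})
total = t.map_values(lambda v: v * 10).reduce(lambda x, y: x + y)
print(total)  # → 60
```

Swapping `LocalTable` for an RDD-backed implementation would leave the job code unchanged, which is what lets FATE Flow switch between eggroll and Spark behind one interface.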
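The relationship between the `spark_run` block in `job_parameters` and the eventual "spark-submit" command line can be sketched with a small helper. This helper is illustrative only and is not part of FATE; it simply shows how key/value pairs such as `executor-memory` map onto `--executor-memory`-style flags.

```python
# Illustrative helper (not part of FATE): flatten a spark_run mapping
# into spark-submit style command-line arguments.
def spark_submit_args(spark_run: dict) -> list:
    args = []
    for key, value in spark_run.items():
        # each entry becomes a "--<key> <value>" pair
        args += [f"--{key}", str(value)]
    return args

job_parameters = {
    "work_mode": 0,
    "backend": 1,  # 1 selects Spark; 0 would select eggroll
    "spark_run": {
        "master": "spark://127.0.0.1:7077",
        "executor-memory": "4G",
        "total-executor-cores": 4,
    },
}

print(spark_submit_args(job_parameters["spark_run"]))
# → ['--master', 'spark://127.0.0.1:7077', '--executor-memory', '4G', '--total-executor-cores', '4']
```

This also explains why omitting `spark_run` falls back to `${SPARK_HOME}/conf/spark-defaults.conf`: with no explicit flags, spark-submit reads its defaults from that file.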