同程旅行大數據集羣在 Kubernetes 上的服務化實踐

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"前言"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"同程旅行大數據集羣從 2017 年開始容器化改造,經歷了自研調度 docker 容器 ,到現在的"},{"type":"codeinline","content":[{"type":"text","text":"雲艙"}]},{"type":"text","text":"平臺,採用 "},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":" 調度編排工具管理大數據集羣服務。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/fd\/d2\/fd07cba5e9553f999d2c3c5f240d5ed2.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在這個過程中遇到很多問題和難點,本文會向大家介紹上雲過程中總結的經驗和教訓。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"今天的議題主要分下面幾點來闡述:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲什麼要將大數據集羣服務搬到Kubernetes上"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上雲的過程遇到哪些痛點"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大數據服務上雲攻略"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"現狀和未來發展"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"集羣即服務的理念"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"部門內部很早就提出集羣即服務的理念,作爲基礎組件研發,希望從產品的角度來看待組件或者集羣,讓業務研發能直接觸達底層集羣,可以包含節點、日誌、監控等功能,讓集羣使用更簡單。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"推行小集羣化"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"以前組件研發部署一個組件集羣,這個集羣會陸續承接一些業務,時常會遇到A業務影響B業務,集羣負責人會開始考慮拆分,搭建出一個新集羣將消耗資源的業務拆分出去。這種是以人工介入的方式去評估業務體量並分配資源。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"現在部門開始推行小集羣模式,每個業務研發組都可以申請一個或者多個集羣,在物理層面做到資源隔離,互不影響,不會因爲A業務的流量上升而影響其他業務。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"自動化運維建設"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"小集羣化會導致集羣數量成倍的上升,如果不做自動化運維,人力會遠遠跟不上業務增長,到那時組件研發會淹沒在救火和運維的海洋。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以需要構建一個集羣全流程自動化平臺。這裏麪包含服務申請,服務部署,服務運維等功能。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/89\/a7\/897a240def1127cb6b6cdfed2ffec4a7.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"如何利用 Kubernetes 利器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"起初自研編排工具去調度容器,但是實現的東西太多,在人力有限的情況下,認爲這條路不可行。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2019年開始採用 "},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":" 調度編排容器,先後採取過用"},{"type":"codeinline","content":[{"type":"text","text":"Helm"}]},{"type":"text","text":" 工具編寫模板部署組件,用"},{"type":"codeinline","content":[{"type":"text","text":"Operator"}]},{"type":"text","text":"的方式管理服務,用Statefulset\/Deployment 部署大數據集羣。這些方式最後都被放棄。Helm 只是解決了部署的問題,想要基於 "},{"type":"codeinline","content":[{"type":"text","text":"Helm"}]},{"type":"text","text":" 做平臺精細化運維比較麻煩。Operator的理念是針對某個組件做自定義CRD,大數據服務有十幾種組件,爲每個組件專門定製Operator,運維和開發成本過大,基於此還要解決Operator和平臺層的交互邏輯,這個也不適合同程的人力配比。"},{"type":"codeinline","content":[{"type":"text","text":"Statefulset"}]},{"type":"text","text":"和"},{"type":"codeinline","content":[{"type":"text","text":"Deployment"}]},{"type":"text","text":" 沒法做到精細化運維,比如業務提出關閉某個指定的點,當業務邏輯和底層運維邏輯耦合在一起的時候,已經封裝好的 Workload 並不能拿來即用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"由於是大數據生態,同程選擇採用"},{"type":"codeinline","content":[{"type":"text","text":"Java Client"}]},{"type":"text","text":" 和 "},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":" 進行交互,在"},{"type":"codeinline","content":[{"type":"text","text":"Kuberentes"}]},{"type":"text","text":" 上自研 "},{"type":"codeinline","content":[{"type":"text","text":"雲艙"}]},{"type":"text","text":" 調度器,將運維側業務邏輯和平臺交互代碼放在一起,構建了一套適合自己的大數據服務自動化運維框架,當前覆蓋了幾乎所有的大數據服務,計算組件有Hive、Presto、Yarn,存儲組件有 HDFS、ClickHouse、Kafka、Kudu等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/7a\/e2\/7af40a43862647261e9a2fc5d76ba0e2.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"上雲過程遇到了哪些痛點"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Kubernetes 環境問題"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"由於大數據組件有很多是分佈式存儲系統,組件本身會要求客戶端和服務端能夠網絡互通,端到端的建立連接。這就需要"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"容器網絡要和外部物理網絡打通,當然也可以採用"},{"type":"codeinline","content":[{"type":"text","text":"Proxy"}]},{"type":"text","text":"層來屏蔽底層存儲。同程大數據選擇構建 "},{"type":"codeinline","content":[{"type":"text","text":"Underlay"}]},{"type":"text","text":" 的容器網絡,做到IP保持,容器IP提前分配,IP自動回收等功能。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"將"},{"type":"codeinline","content":[{"type":"text","text":"Service"}]},{"type":"text","text":"層網絡和公司四層負載 "},{"type":"codeinline","content":[{"type":"text","text":"TVS"}]},{"type":"text","text":" 服務做到很好的集成,利用Endpoints和Service 事件監聽來保證負載數據的一致性。由於網絡環境的限制,一個機房沒有辦法只搭建一個Kuberntes集羣,需要支持一個應用跨多"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"集羣部署,負載服務要支持跨多個"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"集羣的應用負載。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"DNS"}]},{"type":"text","text":" 層採用子域的方式做到Kubernetes 內部"},{"type":"codeinline","content":[{"type":"text","text":"CoreDns"}]},{"type":"text","text":" 和公司"},{"type":"codeinline","content":[{"type":"text","text":"DNS"}]},{"type":"text","text":"服務器數據同步,保證一致性,保證內外部域名通信一致。由於一些組件遷移的需求,需要提供在容器拉起來之前預先配置"},{"type":"codeinline","content":[{"type":"text","text":"DNS"}]},{"type":"text","text":"和"},{"type":"codeinline","content":[{"type":"text","text":"IP"}]},{"type":"text","text":"映射的功能,所以只好根據已知的"},{"type":"codeinline","content":[{"type":"text","text":"Pod"}]},{"type":"text","text":"標識,提前分配IP。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"基於Pod的方式管理容器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"剛開始的時候採用"},{"type":"codeinline","content":[{"type":"text","text":"Statefulset"}]},{"type":"text","text":"來部署一些服務,一些開源的Operator也是基於"},{"type":"codeinline","content":[{"type":"text","text":"STS"}]},{"type":"text","text":"管理服務,比如我正在持續貢獻的 "},{"type":"codeinline","content":[{"type":"text","text":"TiDB Operator"}]},{"type":"text","text":" 、"},{"type":"codeinline","content":[{"type":"text","text":"Prometheus Operator"}]},{"type":"text","text":"。雖然可以複用已有Workload的功能,但是當場景複雜,這麼做反而會縫縫補補。大數據組件就是這樣一個複雜的場景,所以決定採用純"},{"type":"codeinline","content":[{"type":"text","text":"Pod"}]},{"type":"text","text":"管理容器,基於Pod去組裝成 "},{"type":"codeinline","content":[{"type":"text","text":"Group"}]},{"type":"text","text":"。比如HDFS組件,會拆分成 "},{"type":"codeinline","content":[{"type":"text","text":"namenode"}]},{"type":"text","text":" 、"},{"type":"codeinline","content":[{"type":"text","text":"journalnode"}]},{"type":"text","text":"、"},{"type":"codeinline","content":[{"type":"text","text":"datanode"}]},{"type":"text","text":" 這三個"},{"type":"codeinline","content":[{"type":"text","text":"Group"}]},{"type":"text","text":",每個"},{"type":"codeinline","content":[{"type":"text","text":"Group"}]},{"type":"text","text":"可以理解爲是同一種節點類型的容器。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/6d\/17\/6d9e26daedab4a1a67ab2344324a6b17.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Pod 配置有狀態"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"存儲組件有個明顯的特性就是配置文件中會有一個唯一標識,比如"},{"type":"codeinline","content":[{"type":"text","text":"Zookeeper"}]},{"type":"text","text":"的 "},{"type":"codeinline","content":[{"type":"text","text":"myid"}]},{"type":"text","text":" , "},{"type":"codeinline","content":[{"type":"text","text":"Kafka"}]},{"type":"text","text":" 的 "},{"type":"codeinline","content":[{"type":"text","text":"broker id"}]},{"type":"text","text":"。將老集羣逐步遷移到"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"上的時候,這些配置項需要自定義且持久化。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/6b\/bc\/6b5e35997a54d35bf7ac782dec5fe3bc.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果組件本身的配置文件格式比較固定,會做成模板化,將特定的配置項抽出來提供給組件研發配置,通過環境變量的方式注入到容器中。對於自定義特別強的組件,會基於"},{"type":"codeinline","content":[{"type":"text","text":"ConfigMap"}]},{"type":"text","text":"做配置的版本控制,讓組件研發可以很方便的填寫配置並推送配置,"},{"type":"codeinline","content":[{"type":"text","text":"ClickHouse"}]},{"type":"text","text":" 就是非常自定義配置的組件。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"以虛擬機的方式啓動容器"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"用"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"部署有狀態服務的時候,由於配置錯誤會導致容器反覆"},{"type":"codeinline","content":[{"type":"text","text":"crash"}]},{"type":"text","text":",這個時候組件研發只希望快速進入現場排查問題,所以針對存儲類組件均採用"},{"type":"codeinline","content":[{"type":"text","text":"tail -F"}]},{"type":"text","text":"的方式啓動容器,讓服務進程作爲後臺進程啓動,配置完善的健康檢查,快速發現節點的不健康性。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這種方式雖然違反了"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"的設計原則,但是易用性會顯著提升。在部署Yarn組件的時候,由於"},{"type":"codeinline","content":[{"type":"text","text":"tail -F"}]},{"type":"text","text":"命令爲主進程,導致大量殭屍進程,最後改用"},{"type":"codeinline","content":[{"type":"text","text":"bash"}]},{"type":"text","text":"命令啓動。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"資源異構問題和多盤掛載問題"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在部署 Yarn 組件過程中,由於機器規格的問題,導致同一個應用節點之間的資源配置不一樣,我們設計採用劃分資源池,將相同規格的機器分爲一個資源池,一個應用根據資源池的配置來調整合適的資源。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"中使用本地盤,一般會推薦"},{"type":"codeinline","content":[{"type":"text","text":"localpv"}]},{"type":"text","text":"的方式,大數據某些組件會採用多盤寫入的方式部署,"},{"type":"codeinline","content":[{"type":"text","text":"local pv"}]},{"type":"text","text":"的方式並不能解決這個問題。同程大數據選擇採用"},{"type":"codeinline","content":[{"type":"text","text":"hostpath"}]},{"type":"text","text":"+"},{"type":"codeinline","content":[{"type":"text","text":"nodeselector"}]},{"type":"text","text":"的方式來做到多盤綁定且節點不漂移。在提交給"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes Scheduler"}]},{"type":"text","text":"之前,會在"},{"type":"codeinline","content":[{"type":"text","text":"雲艙Scheduler"}]},{"type":"text","text":"基於資源池和節點信息對容器提前做一層調度。起初準備用"},{"type":"codeinline","content":[{"type":"text","text":"hostpath"}]},{"type":"text","text":"+"},{"type":"codeinline","content":[{"type":"text","text":"nodename"}]},{"type":"text","text":"的方式來做到節點不漂移,但是"},{"type":"codeinline","content":[{"type":"text","text":"nodename"}]},{"type":"text","text":" 會跳過 Scheduler update 步驟,並不會進行 "},{"type":"codeinline","content":[{"type":"text","text":"bind"}]},{"type":"text","text":" pvc等步驟。詳情可以參考 "},{"type":"link","attrs":{"href":"https:\/\/github.com\/kubernetes\/kubernetes\/issues\/93145","title":"","type":null},"content":[{"type":"text","text":"issue 93145"}]},{"type":"text","text":"。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"DNS 問題"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大數據裏面很多組件節點都採用 "},{"type":"codeinline","content":[{"type":"text","text":"hostname"}]},{"type":"text","text":" 作爲節點標識,比如"},{"type":"codeinline","content":[{"type":"text","text":"NodeManager"}]},{"type":"text","text":"採用"},{"type":"codeinline","content":[{"type":"text","text":"hostname"}]},{"type":"text","text":"註冊,"},{"type":"codeinline","content":[{"type":"text","text":"Hbase"}]},{"type":"text","text":"組件要支持域名反解,"},{"type":"codeinline","content":[{"type":"text","text":"Kudu"}]},{"type":"text","text":"的master節點依賴自身的域名提前通信。這些都違背了Kubernetes的設計理念,"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":" 創建容器,CNI分配得到IP,進程啓動OK,容器變成Ready狀態,Pod的Service域名才能通信。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"同程大數據選擇用"},{"type":"codeinline","content":[{"type":"text","text":"Host"}]},{"type":"text","text":"網絡部署大部分的存儲組件,沿用宿主機網絡,除了"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"集羣子域外再創建一個子域用於組件本身標識,這樣組件遷移會很方便,也不有網絡損耗的煩惱。但是要做好宿主機端口的管理劃分。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"調度問題"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了提升資源利用率,"},{"type":"codeinline","content":[{"type":"text","text":"雲艙"}]},{"type":"text","text":" 平臺會有很多分時段的部署任務和資源銷燬任務。比如某個"},{"type":"codeinline","content":[{"type":"text","text":"Yarn"}]},{"type":"text","text":"集羣,晚上的時候,對可以混部的資源池打上標籤,在晚高峯的時候儘可能的擴容"},{"type":"codeinline","content":[{"type":"text","text":"NodeManager"}]},{"type":"text","text":"。這個類似於"},{"type":"codeinline","content":[{"type":"text","text":"HPA"}]},{"type":"text","text":",由於業務邏輯的複雜性,同程基於自研 "},{"type":"codeinline","content":[{"type":"text","text":"雲艙Scheduler"}]},{"type":"text","text":" 做到這一點。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"大數據服務基於Kubernetes的架構體系"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從"},{"type":"codeinline","content":[{"type":"text","text":"2019"}]},{"type":"text","text":"年開始轉向 "},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":" 到現在,同程已經建立了一套成熟的大數據服務"},{"type":"codeinline","content":[{"type":"text","text":"PAAS"}]},{"type":"text","text":"體系。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/71\/7c\/71a6f060bd2bef106043cdd9369e547c.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"基於Kubernetes屏蔽底層的基礎設施,支持多機房多"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"集羣的應用部署,除了要考慮各種大數據服務如何遷移上雲,也要考慮整個平臺的易用性,讓組件研發無需登錄機器進行運維和遷移等操作。同程自研了"},{"type":"codeinline","content":[{"type":"text","text":"雲艙"}]},{"type":"text","text":"平臺,主要承擔這一職責。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"考慮到業務研發的接入成本,學習成本,研發"},{"type":"codeinline","content":[{"type":"text","text":"控制檯"}]},{"type":"text","text":"平臺,讓只讀的集羣信息和集羣管理結合起來。改變以前底層信息觸摸不到的情景,讓業務研發也能在平臺層獲取更多的信息,可以對自己的服務做出一些合理的判斷。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"監控收集"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"使用"},{"type":"codeinline","content":[{"type":"text","text":"Thanos"}]},{"type":"text","text":"+"},{"type":"codeinline","content":[{"type":"text","text":"Prometheus Operator"}]},{"type":"text","text":"框架部署收集各個組件集羣的監控,按照以下原則來做到監控的可擴展。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個組件集羣對應一個"},{"type":"codeinline","content":[{"type":"text","text":"Prometheus"}]},{"type":"text","text":"節點"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"每個組件都對應一套獨立的Thanos集羣,"},{"type":"codeinline","content":[{"type":"text","text":"Thanos Query"}]},{"type":"text","text":" 聚合同一組件的所有集羣,"},{"type":"codeinline","content":[{"type":"text","text":"Thanos Rule"}]},{"type":"text","text":" 通過自研的"},{"type":"codeinline","content":[{"type":"text","text":"Sidecar"}]},{"type":"text","text":"同步組件報警規則,部署獨立的"},{"type":"codeinline","content":[{"type":"text","text":"AlterManager"}]},{"type":"text","text":",獨立的"},{"type":"codeinline","content":[{"type":"text","text":"Grafana"}]},{"type":"text","text":"應用。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"每個組件都有一個ceph bucket,將歷史監控數據存儲到"},{"type":"codeinline","content":[{"type":"text","text":"Ceph"}]},{"type":"text","text":"中。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d9\/e6\/d95b82a4f5ab721b4ca6cd4004a270e6.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"監控域名規則配置如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Prometheus: "},{"type":"codeinline","content":[{"type":"text","text":"\/prometheus\/\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Thanos Query:"},{"type":"codeinline","content":[{"type":"text","text":" \/thanos\/"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Thanos Rule:"},{"type":"codeinline","content":[{"type":"text","text":" \/thanos\/rule\/\/alerts"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"AlertManger: "},{"type":"codeinline","content":[{"type":"text","text":" \/thanos\/alert\/\/#\/alerts"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Grafana: "},{"type":"codeinline","content":[{"type":"text","text":" \/grafana\/"}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"集羣服務日誌收集"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"使用"},{"type":"codeinline","content":[{"type":"text","text":"Filebeat"}]},{"type":"text","text":"採集集羣節點的服務日誌,將"},{"type":"codeinline","content":[{"type":"text","text":"Filebeat"}]},{"type":"text","text":"容器和服務容器放在一個"},{"type":"codeinline","content":[{"type":"text","text":"Pod"}]},{"type":"text","text":"中,用富容器的方式來啓動服務。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/1e\/9f\/1e5fcedb5e3a52e5a469b3072825a39f.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在"},{"type":"codeinline","content":[{"type":"text","text":"Flink"}]},{"type":"text","text":"計算層做日誌診斷,提供配置規則動態更新,便於更快速發現集羣的故障問題。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"集羣生命週期平臺化"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個組件的集羣從申請創建到服務銷燬中間包含很多環節,應該將這些環節程序並平臺化,讓基礎技術能以平臺代碼的形式沉澱下來。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下圖是用戶申請"},{"type":"codeinline","content":[{"type":"text","text":"Hbase"}]},{"type":"text","text":"集羣服務的工單,用戶在申請的時候只需要填寫少量配置。簡單就是讓業務少思考。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/59\/89\/590c44858e804178214574b719f8d789.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"組件"},{"type":"codeinline","content":[{"type":"text","text":"控制檯"}]},{"type":"text","text":"爲業務研發側提供只讀信息,例如集羣信息、監控、日誌、報警等功能,和組件本身管控平臺相結合,不提供操作或者運維集羣的功能。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/2a\/3e\/2ab2a6bbfyyf3722a9a35f627f1ff23e.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"雲艙平臺"}]},{"type":"text","text":"會爲組件研發提供完善的運維和診斷功能,讓他們無需關心底層基礎設施層。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/72\/81\/7247ab33b6167f3954bbbe427981e381.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"集羣服務化後,計費,報警配置,日誌診斷能功能都能輕鬆的集成起來。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/c5\/8b\/c5fb3d5f6a9d37250cb3baefb38ab98b.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"自研大數據雲原生服務框架"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"雲艙"}]},{"type":"text","text":"平臺將服務分爲單個容器和多個容器,用數量來區分,在此之上用組裝的方式支持多節點類型,一個節點類型對應一個"},{"type":"codeinline","content":[{"type":"text","text":"Group"}]},{"type":"text","text":",這個Group就是一組相同規格的容器。比如Kudu組件就分成兩個"},{"type":"codeinline","content":[{"type":"text","text":"Group"}]},{"type":"text","text":",master和tserver兩個"},{"type":"codeinline","content":[{"type":"text","text":"Group"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"用一個UML圖來簡單描述代碼層結構:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/b3\/3c\/b3caf2f7f04bdd70221fcbdbdc1d9d3c.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"對"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"集羣的操作會分解成多個"},{"type":"codeinline","content":[{"type":"text","text":"Task"}]},{"type":"text","text":","},{"type":"codeinline","content":[{"type":"text","text":"Task"}]},{"type":"text","text":"之間有依賴關係,組裝成"},{"type":"codeinline","content":[{"type":"text","text":"Job"}]},{"type":"text","text":"發送給"},{"type":"codeinline","content":[{"type":"text","text":"Kafka"}]},{"type":"text","text":",雲艙Scheduler進行消費和處理。比如部署一個"},{"type":"codeinline","content":[{"type":"text","text":"Zookeeper"}]},{"type":"text","text":"集羣,先創建容器,再創建"},{"type":"codeinline","content":[{"type":"text","text":"Service"}]},{"type":"text","text":"負載,配置"},{"type":"codeinline","content":[{"type":"text","text":"DNS"}]},{"type":"text","text":"策略,配置監控,這是一個完整的部署任務。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/32\/33\/32a181ede3e9a9bfeb9934d9282dc633.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"現狀"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當前同程將幾乎所有的大數據服務都採用 "},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":" 工具部署和調度,有近"},{"type":"codeinline","content":[{"type":"text","text":"400+"}]},{"type":"text","text":"集羣服務跑在"},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":"上, 一個新的組件集羣可以在15分鐘之內完成交付,極大地減少組件部署消耗的時間。當所有的集羣服務被平臺化管理後,對於機器資源層的調度和利用率提升的需求越來越明顯,同程基於資源監控對組件做混合部署,利用率提升"},{"type":"codeinline","content":[{"type":"text","text":"30%"}]},{"type":"text","text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"大數據底層一般會分爲計算和存儲,但是隨着機器資源越來越多,資源層的研發也是很關鍵的一環。同程希望將數據,資源,算法流程打通,讓數據使用更簡單,讓數據處理更快更穩定。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"業界有很多公司會考慮將大數據計算任務 "},{"type":"codeinline","content":[{"type":"text","text":"native on Kubernetes"}]},{"type":"text","text":",同程也進行調研和嘗試,當前大家都只是解決了部署的問題,任務的完整生命週期還需要研發和測試。所以同程還是着重於 "},{"type":"codeinline","content":[{"type":"text","text":"Yarn"}]},{"type":"text","text":" on "},{"type":"codeinline","content":[{"type":"text","text":"Kubernetes"}]},{"type":"text","text":",一些算法和分析類的"},{"type":"codeinline","content":[{"type":"text","text":"Python"}]},{"type":"text","text":"任務會採用容器調度方式運行。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"未來方向"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"同程大數據上雲還有很多問題沒有去優雅的解決,比如已有服務如何平滑的通過平臺的方式遷移上雲,現在還有很多中間過程需要資源研發介入。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"未來的方向主要分爲:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"採用混部和分時調度,提升集羣資源整體利用率。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"用混沌工程的方式提升組件穩定性。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"計算任務 "},{"type":"codeinline","content":[{"type":"text","text":"native on Kubernetes"}]},{"type":"text","text":",提供高優保障。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"持續提升PAAS平臺易用性。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"讓底層資源觸手可及。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"作者簡介"},{"type":"text","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"程威,同程旅行數據中心集羣研發部,雲原生愛好者,github ID: "},{"type":"codeinline","content":[{"type":"text","text":"mikechengwei"}]},{"type":"text","text":",TiDB Operator ,Prometheus Operator 核心貢獻者。"}]}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章