Multi-Cloud Architecture Management Practice for Nebula Graph Based on KubeSphere

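{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A graph database is a database that uses graph structures for semantic queries; it represents and stores data as nodes, edges, and properties. Graph databases have a wide range of applications: any computation that reflects relationships between entities can be handled by a graph database. Common use cases include friend recommendation in social networking, risk control in finance, and real-time product recommendation in retail.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Nebula Graph Overview and Architecture","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Nebula Graph is a high-performance, linearly scalable, open-source distributed graph database. It separates storage from compute, so the compute layer and the storage layer can each scale out or in independently. This lets Nebula Graph take full advantage of cloud-native technology for elastic scaling and cost control. It is a graph database solution that can hold hundreds of billions of vertices and trillions of edges while providing millisecond-level query latency.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d8/d8bb65989f85437eba4df9487cfe9889.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Nebula Graph architecture diagram","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above shows the Nebula Graph architecture. A Nebula cluster contains three core services: ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Graph Service, Meta Service, and Storage Service","attrs":{}},{"type":"text","text":". Each service consists of several replicas, which are distributed evenly across the deployment nodes according to the scheduling policy.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The process for Graph Service is nebula-graphd. It consists of stateless, independent compute nodes that do not communicate with one another. Graph Service's main job is to parse the nGQL text sent by the client: a lexer and a parser turn it into an execution plan, which is optimized and then handed to the execution engine. The execution engine obtains the schema of vertices and edges from Meta Service and reads vertex and edge data through the storage engine layer.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The process for Meta Service is nebula-metad. It forms a distributed cluster based on the Raft protocol: the leader is elected by all Meta Service nodes in the cluster and then serves requests, while the followers stand by and replicate updates from the leader. If the leader node goes down, an election promotes one follower 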
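to become the new leader. Meta Service not only stores and serves the meta information of the graph data, such as Space, Schema, Partition, and the field types of Tag and Edge properties, but also directs operational tasks such as data migration and leader changes.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The process for Storage Service is nebula-storaged. It uses a shared-nothing distributed architecture: each storage node has multiple local KV store instances as its physical storage, and Nebula uses Raft to keep these KV stores consistent. The main storage engines currently supported are RocksDB and HBase.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Nebula Graph provides clients in C++, Java, Golang, Python, Rust, and other languages. Clients talk to the server over RPC, using Facebook Thrift as the protocol. Users can also operate Nebula Graph through nebula-console and nebula-studio.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Multi-Cloud Architecture Challenges","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Nebula Graph's cloud product is positioned as a DBaaS (Database-as-a-Service) platform, so it naturally relies on cloud-native technology to get there. How do we actually make that happen? First, one thing must be clear: no technology is a silver bullet, and each technology should be used only in the scenarios it suits. Although there are many open-source products to choose from when building this platform, many challenges remain in turning them into a product delivered to users.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I list challenges in three areas here:","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Business Challenges","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Adapting to the resources of multiple cloud vendors requires a unified resource abstraction model. Internationalization also has to be done well: it must account for regional cultural differences, differences in local laws and regulations, differences in user spending habits, and other factors. These factors mean the design has to accommodate local users' habits in order to improve the user experience.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Performance Challenges","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In most cases, data moves much faster over a single cloud vendor's network than it does when it must cross the public internet from one cloud vendor to another. This means cross-cloud network connections can become a serious performance bottleneck in a multi-cloud architecture. Data silos are hard to break down because enterprises cannot migrate data that is in different formats and resides in different technologies, and this lack of portability poses a potential risk to a multi-cloud strategy. Within a single cloud vendor, configuring workload autoscaling with the vendor's native autoscaling tools is easy, but when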
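a user's workloads span multiple cloud vendors, autoscaling becomes tricky.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Operations Challenges","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Operating Kubernetes clusters at scale is very challenging, and keeping up with rapid business growth and user demands is a serious test for the team. The first step is to make cluster management standardized and visual; the second is to turn every operational task into a defined process. This calls for a management platform that deeply understands operational pain points and can cover most of our operations needs. On data security, we must consider that migrating data from one platform to another (or from one region to another) without proper governance and security controls introduces data security risks.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"DBaaS (Database-as-a-Service)","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"In short, cloud-native technology gives users a simple, agile, elastically scalable, and replicable way to make the most of cloud resources.","attrs":{}},{"type":"text","text":" Cloud-native technology keeps evolving so that users can focus better on business development. As the pyramid shows, from IaaS up to the cloud-native application layer at the top, product forms become more flexible, compute units become finer grained, and modularity, operational automation, elasticity, and fault recovery all improve. Each layer up decouples the application more thoroughly from the underlying physical infrastructure, so users no longer need to care about the whole chain from hardware servers to business implementation and can focus only on the business itself.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ba/badc5f30dc808ab1c3cfab110b2ca42e.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The container orchestration system of the PaaS platform is Kubernetes, so building this platform on Kubernetes is the natural choice. Kubernetes provides the Container Runtime Interface, and you can pick any runtime that implements it to build the base environment applications run in. Making good use of what Kubernetes offers therefore yields twice the result with half the effort. Kubernetes exposes many extension points, from the kubectl command-line tool to the storage, networking, and compute that containers depend on, so users can implement custom extension plugins for their business scenarios and plug them into the Kubernetes platform without worrying about intrusiveness.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"User View","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"NebulaCloud currently offers users two access methods. One is to open the Studio console in a browser, where, after importing data, users can explore the graph, run nGQL statements, and so on. The other is to use the cloud vendor's private-link service to connect the user's network to NebulaCloud, so that users can use nebula-console or a Nebula client to connect directly to the Nebula 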
instance.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/69/6984817229b4bc82ca6f726b515fee47.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"NebulaCloud Architecture","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From a business architecture perspective, NebulaCloud can be divided into three tiers. The bottom tier is the resource adaptation layer, which handles adaptation at the resource level and provides an abstract description of multiple cloud vendors, multi-region clusters, and homogeneous or heterogeneous resource pools. Above it are the business layer and the resource layer: the business layer covers modules such as basic services, instance management, tenant management, billing management, and data import management, while the resource layer provides the runtime environment for Nebula clusters and delivers the best resource configuration under the scheduling policy. The top tier is the gateway layer, which exposes access to the outside.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/65/65e397c09d9b9247b3b78428d38c21d5.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"NebulaCloud architecture diagram","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"NebulaCloud Internal Workflow","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Taking AWS as an example, here is a rough outline of how a Nebula cluster is created. After the user submits a create-instance request, the nebula-platform service schedules resources based on the input parameters such as vendor, region, and instance size, configuring things like the resource pool, load balancing, and security policies. It then creates the instance through the nebula-operator API and finally configures ALB rules to give the user an entry point for accessing the instance.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6d/6df030ea7d5708aaa68f93e5220cdd9e.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Nebula-Operator","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Kubernetes, there are two ways to define a new object type: one is CustomResourceDefinition (CRD), and the other is 
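an aggregated API server. CRD is the mainstream approach today, and nebula-operator is implemented with CRDs.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A CRD plus a custom controller is the classic Operator pattern. With the CRD registered in Kubernetes, a controller can watch the state of the Nebula cluster and its related resource objects, and then drive the cluster toward the desired state according to the reconciliation logic we wrote. This approach consolidates Nebula-related management work into the Operator, which lowers the complexity of using NebulaGraph and makes core operations such as elastic scaling and rolling upgrades easy. On top of the Kubernetes RESTful API we generated a set of APIs for managing Nebula clusters, so users can take these APIs, integrate them with their own PaaS platform, and build their own graph computing platform.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"nebula-operator is still being improved. Rolling upgrades of instances require underlying support from Nebula, which is expected to land this year.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e8/e8fa0705e10e56c7d73bec1a2cf6d4e8.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"KubeSphere Multi-Cluster Management","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Platform-Based Management","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"KubeSphere grew out of the console of QingCloud's public cloud; besides inheriting its good looks, it is also quite complete in functionality. The mainstream cloud vendors that NebulaCloud needs to integrate with are all supported, so a single management platform can operate all of our Kubernetes clusters. Multi-cluster management is the feature we value most.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We deployed the Host cluster in our local environment, and the managed Kubernetes clusters in the cloud join as Member clusters via direct connection. Note that the API server access configuration should allow only a single IP, for example the local environment's public egress 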
IP.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/49/494edb816cae8d9d955369a4eaf64d18.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/82/829cc8693f15006b3001a3a2632e7aa8.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Process-Driven Operations","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We use the IaC tool Pulumi to deploy new clusters, then use automation scripts to assign the member role to each cluster to be managed; the whole process requires no manual work. Cluster creation is triggered by the platform's alerting module: when a single cluster's resource quota reaches the alert threshold, a new cluster is automatically provisioned.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/7e/7e9e3eba4bb29a2d47f7423980546093.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Automated Monitoring","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"KubeSphere 
provides a rich set of built-in alerting policies and also supports custom ones; the built-in policies cover essentially all the metrics needed for day-to-day monitoring. There are also several notification channels to choose from. We combine email and DingTalk: important and urgent alerts go straight to the on-call engineer via DingTalk, while ordinary alerts go by email.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/50/5067806659789536b920ed847e916fe9.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fa/fae4daab1556aa4c9afe32d9ca0317c1.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Intelligent Operations","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"KubeSphere provides global views of clusters along multiple dimensions, which is sufficient for the small number of clusters we manage today. As more member clusters are connected, we can analyze operational data for fine-grained resource scheduling and failure prediction, detecting risks earlier and improving operational quality.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/34/34a1a8f97c3b7cb30c661d5e482576a3.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/27/27a5b2a72eb4922c5e323a2e1a0c294c.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Other","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"KubeSphere also comes with many handy companion tools, such as log search, event search, and operation audit logs, all of which are indispensable for fine-grained operations. We have already connected our test-environment clusters, and after using it in depth and getting to know KubeSphere 
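fully, we will try connecting our production clusters.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Future Plans","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will explore and make full use of custom alerting policies, combining them with the Nebula cluster's own monitoring metrics to build a complete monitoring picture; establish a multi-level, multi-dimensional alerting mechanism covering core metrics so that risks are eliminated at the source; improve the surrounding tooling to reduce the risk of operator error through proactive, reactive, and process-based measures; and enable DevOps workflows that connect the development, testing, staging, and production environments and reduce manual intervention.","attrs":{}}]}]}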