騰訊雲中間件團隊在Service Mesh中的實踐與探索

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Service Mesh 作爲騰訊微服務平臺(TSF)支持的微服務架構之一,產品化命名爲 Mesh 微服務平臺(Tencent Service Mesh Framework,簡稱 TSF Mesh),提供下一代微服務架構 - 服務網格(Service Mesh)的解決方案,覆蓋公有云、私有云和本地化部署等多種場景。從 2018 年 8 月推出首個版本以來,已經陸續在金融、新零售、工業互聯網,以及公司內部等生產環境落地。在產品落地過程中,遇到了一系列技術挑戰,如非 Kubernetes 環境的支持、多租戶隔離、與 Spring Cloud 服務框架的互通、海量服務實例下的域名解析等等。針對這些問題,通過自研以及社區合作,最終得以解決。本文主要從用戶場景出發,以生產實踐探索過程中遇到的挑戰爲切入點,梳理和總結應對的解決方案,以期望對 Service Mesh 的認識、對 TSF Mesh 產品的瞭解有所幫助。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"背景介紹"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":"br"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"什麼是 Service Mesh ? 根據 Buoyant CEO,Service Mesh 理念的提出者和先行者,William Morgan定義,Service Mesh(服務網格)是一個專注於處理服務間通信的基礎設施層。用於解決服務間複雜拓撲中的可靠請求傳遞,是雲原生技術棧的關鍵組件之一。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2018 年被很多人稱爲 “Service Mesh 之年”,這一年,業界幾乎所有大廠都在嘗試推出自己的 Service Mesh 產品。Service Mesh 中的明星項目 Istio 在這一年也是蓄勢待發,作爲 Google、IBM、Lyft 聯合開發的開源項目,陸續發佈了0.5、0.6、0.7、0.8 和 1.0 版本,到 2018 年 7 月 31日1.0 GA 時,對 Istio 來說是一個重要的里程碑,官方宣稱所有的核心功能都可以用於生產。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"以 GitHub 上的 star 數量的角度來看一下 Istio 在近幾年的受歡迎程度,下圖顯示的是 Istio 的 GitHub star 數量隨時間變化曲線。可以看到在 2018 年,Istio 的 star 數量大概增長了一萬,目前已經超過 2.2萬 
顆星,其增長趨勢也非常平穩。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/3b\/3b14693f689bd52460894adbf56d6906.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"早在 2017 年,騰訊雲中間件團隊就選定 Istio 爲技術路線,開始 Service Mesh 的相關預研和研發工作。作爲騰訊微服務平臺(TSF)的無侵入式微服務框架的核心實現,於 18 年初在騰訊廣告平臺投入,打磨穩定後,陸續開始對外輸出,目前在金融、工業互聯網等領域都有落地案例。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"產品落地過程並非一帆風順,會遇到一些問題和挑戰。接下來,首先以開源 Istio 爲切入點,介紹一下 TSF Mesh,之後對 TSF Mesh 產品化探索過程中的部分典型問題以及解決方案進行梳理和分享。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"TSF Mesh介紹"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":"br"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Mesh 微服務平臺(Tencent Service Mesh Framework,簡稱 TSF Mesh),基於 Service Mesh 的理念,爲應用提供服務自動註冊與發現、服務路由、鑑權、限流、熔斷等服務治理能力,且應用無需對源代碼進行侵入式改造,即可與該服務框架進行集成。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在開發選型上,基於業界達到商用標準的開源軟件 Istio 
進行構建,主要原因如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Istio 功能相對完備,mesh 該有的能力都有。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"社區活躍,資源豐富,CNCF 成員,代表雲原生標準化。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Golang(Istio)& C++14(envoy)都是高性能語言,且運行起來資源使用靈活,獨立性好,無 JVM 等外部依賴。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在瞭解 TSF Mesh 架構之前,先回顧一下 Istio 的架構圖,如下圖所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/3c\/3c4d2a24ec765bcef693a20187e6156e.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上面的架構圖中,Istio Mesh 分爲兩塊:數據面板和控制面板。envoy 在 Istio 中扮演數據面板的角色,作爲服務的代理,被部署爲 sidecar,服務無需感知 envoy 的存在;控制面板包含Pilot,Mixer,Citadel等組件。這些組件的主要功能如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Envoy: 
作爲底層的代理,通常選用其擴展版本 istio-proxy,用於調度服務網格中所有服務的出入站流量。包含了豐富的內置功能,例如動態服務發現,負載均衡,HTTP\/2&gRPC 代理,熔斷器,健康檢查,基於百分比流量拆分的灰度發佈,故障注入,性能指標等。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Pilot: 控制面的核心組件,爲 Envoy 提供服務發現、智能路由(如 AB 測試、灰度發佈)和彈性流量管理功能(如超時、重試、熔斷),負責將高層的抽象的路由規則轉化成低級的 envoy 的配置。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Mixer: 提供策略檢查和遙測功能。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Citadel: 安全組件,提供了自動生成、分發、輪換與撤銷密鑰和證書功能。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TSF Mesh 整體架構上,其核心能力與開源的 Istio 保持一致,同時對 envoy、Pilot、Mixer、Pilot-agent 組件做了增強,並且新增組件 Apiserver 和 
Mesh-dns。外圍能力聚焦在安全性、易用性、可維護性和可觀測性,如下圖所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/27\/272108e0467852ba73c0ddc8b49d662a.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"運營支撐提供了運營端管理和租戶端管理,比如租戶端的角色管理,集羣管理,命名空間管理,應用管理,服務治理等;運營端提供資源管理等。監控系統提供了日誌功能,鏈路追蹤,調用鏈拓撲圖,指標監控等。基礎組件爲限流、註冊中心、配置中心、日誌採集和實時監控提供支撐。Paas爲應用部署提供支撐,比如aPaas等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TSF Mesh 保留 Istio 所有的原生特性,同時對 Service Mesh 疊加了部分商業特性,如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"平臺解耦:支持K8S\/VM\/裸金屬服務器環境"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"新舊兼容:支持 Spring Cloud 應用、Service Mesh 
應用互通,統一治理"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"多租戶隔離、管理支持"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"調用鏈、日誌、監控落盤"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"高可用性"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"TSF Mesh產品化挑戰"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":"br"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"1. 支持異構的計算平臺"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"儘管Istio強調自身可擴展性的重要性在於適配各種不同的平臺,也可以對接其他服務發現機制,但在實際場景中,通過深入分析 Istio 幾個版本的代碼和設計,可以發現其重要的能力都是基於 Kubernetes 進行構建的。以下面兩點爲例:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"服務配置管理"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Istio 的所有路由規則和控制策略都是通過 Kubernetes CRD 實現,可以說 Istio 的 APIServer 就是 Kubernetes 的 APIServer,數據也自然地被存在了對應 Kubernetes 的 Etcd 
中。如下圖所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ac\/ac74f5674886ea8f9b7babfccf05a4fd.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"服務發現及健康檢查"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Istio 的服務發現機制基於 Kubernetes 的 Services\/Endpoints,從 Kube-apiserver 中獲取 Service 和 Endpoint,然後將其轉換成 Istio 服務模型的 Service 和 ServiceInstance。同時 Istio 不提供域名解析能力,域名訪問機制也依賴於 kube-dns, coreDNS 等構建。節點健康檢查能力基於LivenessProbe\/ReadinessProbe 機制實現 。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在實際場景中,TSF 的用戶並非都是 Kubernetes 用戶,例如公司內部的一個業務因歷史遺留問題,不能完全容器化改造,同時存在 VM 
和容器環境,場景如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/1a\/1a239f98d61a56a1260ed2cda1dd927c.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從上面的業務場景可以看出,業務要求能夠將其部署在自研 Paas 以及 Kubernetes 的容器、虛擬機以及裸金屬的服務都可以通過 Service Mesh 進行相互訪問。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了實現多平臺的部署,必須與 Kubernetes 進行解耦。在脫離 Kubernetes 後,Istio 面臨以下四個問題:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"服務的動態配置管理不可用"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"服務節點健康檢查不可用"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"服務自動註冊與反註冊能力不可用"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"流量劫持不可用"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"針對這 4 個問題,TSF Mesh 團隊對 Istio 
的能力進行了擴展和增強,增強後的架構如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a4\/a410be682a4562103d5ad0c716265748.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"增強 Pilot 的 consul 適配器,與 consul 註冊中心對接;增加 Apiserver 實現元數據轉換;增強 Pilot-agent,實現VM的自動注入,服務註冊,envoy 管理。經過改造後,Service Mesh 成功與 Kubernetes 平臺解耦,組網變得更加簡潔,通過 GRPC 和 REST API 可以對數據面進行全方位的控制,可從容適配任何的底層部署環境,對於私有云客戶可以提供更好的體驗。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"2. 
支持多租戶"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"租戶的概念不止侷限於集羣的用戶,它也可以是由一組計算、網絡、存儲等資源組成的工作負載集合。而在多租戶場景中,需要對不同的租戶提供儘可能的安全隔離,以最大程度地避免惡意租戶對其他租戶的攻擊,同時需要保證租戶之間公平地分配共享集羣資源。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在公有云或私有云場景下,用戶非常看重隱私和隔離。往往不同用戶\/租戶之間,服務配置、節點信息、控制信息等資源數據是隔離的,互相不可見。但是 Istio 本身並不支持這種級別的隔離,需要框架集成者去擴展。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Istio 依託於 Kubernetes 能力,可實現 “soft-multitenancy”,即單一 Kubernetes 控制平面和多個 Istio 控制平面以及多個服務網格相結合;每個租戶都有自己的一個控制平面和一個服務網格。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"其它租戶模式,比如單獨的 Istio 控制平面控制多集羣網格的場景,Istio 並不支持。在這種場景下,每個租戶一個網格,集羣管理員控制和監控整個 Istio 控制面以及所有網格,租戶管理員只能控制特定的網格。這種場景與雲環境下的多租戶概念比較吻合,對此 TSF Mesh 通過數據建模,實現了這種租戶模式,即單控制面多集羣網格。基本架構如下圖所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/60\/6008d4bcf54bf0ad3bc5a96c0f3208ed.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上圖中,實現了租戶管理、租戶數據的隔離存儲、Pilot 控制面緩存增加租戶索引。在這種場景下,各租戶只能看到自身的集羣資源,包括計算資源、邏輯資源、應用資源等,其它租戶創建的集羣資源不可見,sidecar 只能從控制端同步到本租戶的配置和服務 xDS 
信息。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"3. 服務尋址"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在侵入式框架下,目標服務的標識通常是服務名,服務註冊與發現是強關聯的,通過服務發現機制實現服務名到服務實例的尋址。在 Service Mesh 機制下,對應用是無侵入的,服務發現機制只能下沉到 Service Mesh,這意味着客戶端通過目標服務標識名稱的訪問方式,需要域名解析能力的支持。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Istio 下的應用使用完全限定域名 FQDN(fully qualified domain name)進行相互調用,基於 FQDN 的尋址依賴 DNS 服務器,Istio 官方對 DNS 服務器的說明如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Istio does not provide a DNS. 
Applications can try to resolve the FQDN using the DNS service present in the underlying platform (kube-dns, mesos-dns, etc.)."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從上面的描述看出,Istio並不提供 DNS 的能力,依託於平臺的能力,如 kubernetes 平臺下的 kube-dns 。以 Istio 的官方提供的demo:bookinfo 爲例,Reviews 與 Ratings 之間的完整的服務調用會經過以下過程:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b5\/b5e1291bfc7f0f52f4760edfe0cc643e.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"從圖上可以看出,Reviews 和 Ratings 的互通,kube-dns 主要實現 2 個功能:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"服務的 DNS 請求被 kube-dns 接管"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"kube-dns 將服務名解析成可被 iptables 接管的虛擬 IP(clusterIP)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"正如前面提到的,用戶的生產環境不一定包含 kubernetes 或者 
kube-dns,我們需要另外尋找一種機制來實現上面的兩個功能。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在 DNS 選型上,有集中式和分佈式兩種方案,集中式 DNS:代表有 kube-dns, CoreDNS 等,通過內置或者插件的方式,實現與服務註冊中心進行數據同步。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"集中式 DNS 存在以下問題:組網中額外增加一套 DNS 集羣,並且一旦 DNS Server 集羣不可用,所有數據面節點在 DNS 緩存失效後都無法工作,因此需要爲 DNS Server 考慮高可用甚至容災等一系列後續需求,會導致後期運維成本增加。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"分佈式 DNS 將服務 DNS 的能力下沉到數據平面。分佈式 DNS 運行在數據面節點上,DNS 無單點故障,無需考慮集羣容災等問題,只需要有機制可以重新拉起即可。由於與業務進程運行在同一節點,因此其資源佔用率必須控制得足夠低,纔不會對業務進程產生影響。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"綜合考慮,最終 TSF Mesh 選用了分佈式 DNS 的方案,以獨立進程作爲 DNS Server,如下圖所示"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/8e\/8ede3f50db45b8af82511ea81ff90e0e.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圖中 Mesh-dns 通過 Pilot 同步服務信息,當應用通過服務名調用時,會進入 Mesh-dns 進行域名的本地解析,然後流量被 iptables 接管,之後到達 envoy,最後由 envoy 動態路由到 upstream;對於其它非 mesh 服務的域名解析,Mesh-dns 會透明傳輸,走默認的 
DNS。通過配置緩存本地化以及異常退出後自動拉起並加載配置,保證在異常情況下的高可用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"值得一提的是 Istio 暴力流量接管問題,這個也是大家詬病比較多的。Istio 的數據面對 Kubernetes 容器內的流量進行全接管,但這對虛擬機或裸金屬場景可能並不適用,畢竟虛擬機或裸金屬上可能不僅僅只有 mesh 的服務。因此,需要考慮細粒度的接管方案,使得 mesh 與非 mesh 應用在同一個虛擬機\/容器中可以共存。TSF Mesh 對這塊能力也做了增強,只需要少量的 iptables 規則,即可完成 mesh 與非 mesh 流量的篩選。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"4. 與異構服務框架互通"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"微服務框架可以分爲侵入式和非侵入式兩種,目前比較主流的微服務框架 Spring Cloud,基於 Spring Boot 開發,提供一套完整的微服務解決方案,包括服務註冊與發現,配置中心,全鏈路監控,API 網關,熔斷器,遠程調用等開源組件,並且可以根據需求對部分組件進行擴展和替換。與 Service Mesh 的不同之處在於,Spring Cloud 是一種侵入式的微服務框架,需要 SDK 支撐,並且技術棧受限於 Java。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"出於功能重疊、語言壁壘、耦合性,開發運維成本,技術門檻,雲原生生態等多方面的因素,有相當一部分用戶開始嘗試 Service Mesh,或者往 Service Mesh 遷移和轉型,但仍然存在一些遺留的 Spring Cloud 的服務,希望能與 Service Mesh 
中的服務互通。用戶期望支持的架構如下圖所示:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/44\/4442b4b2fa625ed31a88630578aa2fdd.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上面這個架構中,最大的挑戰在於涉及了兩個不同的微服務框架之間的互通。這兩個微服務框架從架構模式、概念模型、功能邏輯,技術棧上,都存在較大的差異。唯一的共同點,就是它們都是微服務框架,可以將應用的能力通過服務的形式提供出來,給外部用戶調用,外部用戶實際上並不感知服務的具體形態。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"基於這個共同點,爲了使得不同框架下的服務能夠正常工作,TSF 團隊做了大量的開發工作,將兩個微服務框架,從部署模式、服務及功能模型上進行了拉通,主要包括如下幾點:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"
對齊能力 | 說明
服務模型 | 基於統一的服務元數據模型,針對 Service Mesh 和 Spring Cloud 的服務註冊發現機制進行拉通
服務 API | 基於標準 API 模型(OpenAPI v3),針對兩邊框架的 API 級別服務治理能力進行拉通
服務路由 | 基於標準權重算法以及標籤模型,針對 Service Mesh virtual-service 和 Spring Cloud ribbon 能力進行拉通
限流 | 基於標準令牌桶架構和模型,以及條件匹配規則,對 mixer 及 Spring Cloud ratelimiter 能力進行拉通
熔斷器 | 增強 envoy 的能力,實現業界標準熔斷器,支持服務級別、API 級別和實例級別,與 Spring Cloud 拉通
鑑權 | 基於標籤模型,支持條件匹配規則的黑白名單規則,與 Spring Cloud 拉通"}}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"5. 可觀測性"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在上一小節,提到了 Service Mesh 與 Spring Cloud 的能力互通,TSF Mesh 爲了提供更好的用戶體驗,在日誌、監控和調用鏈方面的能力也與 Spring Cloud 拉通,在 envoy 標準 Tracers 能力(envoy.zipkin)的基礎上,增加了envoy.local 類型,使其支持監控和調用鏈日誌落到本地掛載盤,由 TSF 的 APM 系統採集並分析,實現 mesh 應用與 spring 應用的調用鏈串接。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如下圖展示了兩種不同微服務架構下,一致的服務依賴拓撲能力。user、shop、promotion爲 Service Mesh 應用,provider-demo 爲 Spring Cloud 應用,服務間的箭頭表示了調用關係。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/98\/98087005d998d69a08c252975bd3d7d3.png","alt":null,"title":null,"style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"總結與展望"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":"br"}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TSF Mesh 作爲騰訊微服務平臺 TSF 的 Service Mesh 解決方案,在持續交付中,幫助企業客戶解決傳統集中式架構轉型的困難,打造大規模高可用的分佈式系統架構,實現業務、產品的快速落地。本文主要從用戶實際場景出發,挑選了 TSF Mesh 產品化過程中遇到的部分典型問題和應對的解決方案,進行梳理和介紹,希望對 TSF Mesh 產品的瞭解以及技術演進思路有所幫助。還有一些問題和解決辦法,涉及較深的技術細節,或顯枯燥,並未一一羅列,比如性能優化相關,mixer 
相關,自定義協議相關,部署相關等等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TSF Mesh 團隊擁抱開源協同,努力跟進 Service Mesh 的技術發展趨勢,積極參與社區貢獻。就技術發展趨勢,有些點仍值得後續探討,比如控制面單體化,UDPA(通用數據平面API)的標準化演進,wasm 在 envoy 中扮演的角色,mixer 下沉,協議擴展,性能優化等等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"回顧過去,從 \"Service Mesh\" 和 \"Istio\" 這兩個詞彙第一次進入公衆視野到如今,有將近四年的時間,見證了數據面板的爭奇鬥豔,也親歷了 xDS 的“快速”演變,架構與性能之間的妥協也從未停歇。總之,一句話:流年笑擲,未來可期。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"horizontalrule"},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"頭圖:Unsplash"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"作者:呂曉明"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"原文:https:\/\/mp.weixin.qq.com\/s\/20UJMs4U5YEUfxV6dS3oJg"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"原文:騰訊雲中間件團隊在Service Mesh中的實踐與探索"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"來源:騰訊雲中間件 - 微信公衆號 [ID:gh_6ea1bc2dd5fd]"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"轉載:著作權歸作者所有。商業轉載請聯繫作者獲得授權,非商業轉載請註明出處。"}]}]}