The Hidden Corner: This Time We Only Talk About Cilium IPAM

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Book of Rites, in its passage on the four failings of learners, says that “in learning, some fail by taking on too much.” That is a reminder to me that each post had better cover just one concept. The previous article focused on Cilium's many flashy tricks; today we look at one of its most basic functions: IPAM (IP Address Management).","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The concept of IPAM is simple: it is the management of IP addresses. DHCP is one commonly used IPAM tool. So why discuss it separately here? Because, simple as the concept is, container networking gives it its own implementation approach and its own operational challenges.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e5/e50b9349a8dfa6e39cc45295af3a5012.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 1: Communication between Pods inside a cluster","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before we get to IPAM, let's look at Figure 1. This diagram shows the typical cases inside a cluster: communication between containers in the same Pod, and communication between Pods (across hosts). It also carries some extra information:","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A Pod may contain several containers, and one of them is always an infra-container.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A container can run multiple processes at the same time.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Every Pod has an IP address that is unique within the container network. For how the container network and the host network differ and relate, see Erge's earlier post “Mirror: Perhaps We Too Live in a Virtual World, Just Like Pods”.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All containers in a Pod share that Pod's IP address, so they must use different ports.","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As we can see, containers within one Pod can talk to each other directly over localhost, but communication between different Pods needs the Pod IP. That raises some questions: who assigns the Pod IP address? When is it assigned? And when the Pod is gone, who takes the address back?","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"You probably suspect that, in some hidden corner, something DHCP-like is controlling all this, yet K8s never seems to mention DHCP. The answer is simple: the CNI plugin.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"CRD","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cilium relies on a K8s technology called CRD (Custom Resource Definition). With CRDs, functionality that K8s itself does not provide can be handed off to third parties as extensions. CRDs are one of the core extension mechanisms in the K8s ecosystem (the other is the custom API server).","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/45/458cad0d42ccb6fd7b20c8ce99ecdb93.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 2: Where CRDs sit inside the K8s apiserver (figure taken from the book Programming Kubernetes)","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As Figure 2 shows, kube-apiserver contains a module called apiextensions-apiserver that is dedicated to serving CRDs. It sits right next to the K8s native resources.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A CRD is itself a K8s resource that describes the definition of a Custom Resource (CR), and a CR is a resource created from that CRD. As a tongue twister: the CRD is the definition of the CR, and the CR is an instance of the CRD.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The short example below registers a CR definition with K8s, so that K8s knows a third-party resource exists.","attrs":{}}]},
{"type":"codeblock","attrs":{"lang":"yaml"},"content":[{"type":"text","text":"# Note: apiextensions.k8s.io/v1beta1 was removed in Kubernetes 1.22; newer clusters use apiextensions.k8s.io/v1.\napiVersion: \"apiextensions.k8s.io/v1beta1\"\nkind: CustomResourceDefinition\nmetadata:\n  name: ciliumnodes.cilium.io\nspec:\n  group: cilium.io\n  names:\n    kind: CiliumNode\n    shortNames:\n    - cn\n    - ciliumn\n... ","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The next example creates a CR from the CRD registered in the previous step; only part of it is shown. When K8s receives such a request it creates a data structure, fills it in, and stores it in etcd. The shape of the data comes from the CRD (the group and kind come from the CRD's spec.group and spec.names, alongside fields such as spec.ipam), while the content is filled in from the declarative YAML below. Each such piece of data is called a CR instance.","attrs":{}}]},
{"type":"codeblock","attrs":{"lang":"yaml"},"content":[{"type":"text","text":"apiVersion: \"cilium.io/v2\"\nkind: CiliumNode\nmetadata:\n  name: \"cilium-2\"\nspec:\n  ipam:\n    pool:\n      10.0.1.78: {}\n... \n","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These two examples also show the standard way to use CRDs: register the CRD first, then create CR instances.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Like the resources built into K8s, such as Pod, Namespace, and Deployment, a CR is a resource. It just happens to be defined by a third party, and it is meant to feel the same as a K8s native resource; unless you look closely, you may not even notice that a third party provides it. Why does it feel so natural? Because the K8s apiserver handles CRDs and their CRs the same way it handles everything else, and CRs are stored in the same place as native resources: etcd.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Unlike K8s built-in resources, which live in the core group or in K8s's own groups such as apps, a CR usually lives in the third party's own Group (the Group in Group-Version-Resource). Cilium defines several CRs; one of them is ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"CiliumNodes","attrs":{}}],"attrs":{}},{"type":"text","text":", in Group ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"cilium.io","attrs":{}}],"attrs":{}},{"type":"text","text":", Version ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"v2","attrs":{}}],"attrs":{}},{"type":"text","text":".","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If your environment happens to run Cilium, you can fetch the full CiliumNodes CRD with the following command.","attrs":{}}]},
{"type":"codeblock","attrs":{"lang":"bash"},"content":[{"type":"text","text":"kubectl get crds ciliumnodes.cilium.io -o yaml","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Behind a CRD there is a service called a K8s Controller, running as a Pod in the K8s environment, that responds to the requests K8s hands off to it.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/99/99fe8a735817ba800006d9433738ae6b.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 3: Internal structure of a custom controller","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As Figure 3 shows, whenever an instance of a resource is created, deleted, or updated on the API server, the Informer receives an “event notification.”","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Meanwhile, the Controller runs a control loop whose job is plain: to consume the notification events related to that resource.","attrs":{}}]},
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note the difference between the add, update, and delete events of a resource and the top-level Event object. The former are a notification mechanism; the latter, like Pod, is itself a resource, and can be viewed as a log that reflects how the system is running.","attrs":{}}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"IPAM","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With Controllers covered, let's see how Cilium uses one to manage IP addresses.","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9b/9ba79a7127ecfbabda3a0b7e0cd2b6fe.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 4: Cilium IPAM overview","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the right of Figure 4 there is a ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"cilium-operator","attrs":{}}],"attrs":{}},{"type":"text","text":" icon. Operator is a K8s concept that CoreOS introduced in 2016. An Operator is a Controller, and it greatly reduces the complexity of deploying “stateful applications.” That is all we need to know for now; Erge will cover Operators in a separate post.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The whole K8s cluster runs a single cilium-operator Pod, while cilium-agent runs once on every Node.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When cilium-operator sees that a new ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"CiliumNodes","attrs":{}}],"attrs":{}},{"type":"text","text":" resource has been created, it obtains a batch of IP addresses for that Node from its supplier. Node details such as the hostname are read from the CR instance.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As mentioned, cilium-agent runs on every Node. It takes an IP address from the ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"spec.ipam.available","attrs":{}}],"attrs":{}},{"type":"text","text":" field of this new CR (the CR example earlier shows the pool as spec.ipam.pool). In this process cilium-operator works like a nonprofit supermarket: it buys wholesale, sells retail, takes returns, and earns no margin.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"You can also run the command below to see the IPAM information for each Node. Through client-go, cilium-agent can easily get exactly the same data you see. To give you a quicker, more concrete picture, Erge has included a screenshot (Figure 5 below) that shows roughly what an instance of this CR looks like.","attrs":{}}]},
{"type":"codeblock","attrs":{"lang":"bash"},"content":[{"type":"text","text":"# cn is a short name for CiliumNodes, defined under spec.names.shortNames in the CRD.\nkubectl get cn -o yaml","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a4/a434fb46882601cb565092a39a24fa34.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 5: Screenshot of the output of kubectl get cn cilium-2 -o yaml","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 4 is only a schematic. In practice the Controller has many possible IP suppliers: besides the AWS ENI (Elastic Network Interface) shown in the figure, there are Azure, GKE, and others to choose from. How the address blocks are obtained also differs from supplier to supplier.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you manage the cluster entirely yourself, K8s can have its Pod CIDR set through kubeadm, as in the command below, and you can choose the range yourself.","attrs":{}}]},
{"type":"codeblock","attrs":{"lang":"bash"},"content":[{"type":"text","text":"sudo kubeadm init --pod-network-cidr=10.244.0.0/16","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/aa/aac193d44135cf61869e819596f600a1.png","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Figure 6: Cilium IPAM sequence diagram","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All of this is still a bit abstract, so here is Figure 6. Because Figure 4 is mainly about AWS ENI, the AWS ENI parts of Figure 6 are marked with a red box.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The sequence diagram in Figure 6 looks complicated at first, and it is hard to see at a glance what matters. It is a zoomed-in view of Figure 4, so it carries a lot of information. How zoomed in? It draws every participant, and the moment each one steps in, from container creation through IP allocation.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The diagram also shows how kubelet, CRI, and the CNI plugin divide the work during container creation. It is clear that the CNI plugin itself does very little; most of the time it is a thin front, and the real work is done by the busy bees behind it, such as cilium-agent here.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is all for this article. Writing takes effort; for more, please follow Erge's WeChat official account. Your small gesture is a great encouragement to Erge. Thank you!","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/63/63b236f9f533c70d2e68036f810d0391.jpeg","alt":null,"title":"","style":[{"key":"width","value":"25%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
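As a supplement to the IPAM section, the "wholesale in, retail out, take returns" role described there can be sketched as a toy model. This is only an illustration under stated assumptions: `NodePool`, `refill`, `allocate`, and `release` are hypothetical names invented here, not Cilium's API, and the real pool lives in the CiliumNode CR rather than in process memory.

```python
import ipaddress


class NodePool:
    """Toy model of one node's IP pool (hypothetical, not Cilium's API)."""

    def __init__(self, node_name):
        self.node_name = node_name
        self.available = []  # stands in for the pool published in the CR
        self.used = {}       # ip -> name of the Pod that holds it

    def refill(self, cidr):
        """Operator side: add every usable host address of a block to the pool."""
        for ip in ipaddress.ip_network(cidr).hosts():
            ip = str(ip)
            if ip not in self.used and ip not in self.available:
                self.available.append(ip)

    def allocate(self, pod_name):
        """Agent side: hand one free address to a newly created Pod."""
        if not self.available:
            raise RuntimeError(f"no free IP left on node {self.node_name}")
        ip = self.available.pop(0)
        self.used[ip] = pod_name
        return ip

    def release(self, ip):
        """Agent side: return the address to the pool when the Pod is deleted."""
        if ip not in self.used:
            raise KeyError(ip)
        del self.used[ip]
        self.available.append(ip)


pool = NodePool("cilium-2")
pool.refill("10.0.1.76/30")      # a /30 block has two usable host addresses
first = pool.allocate("pod-a")
second = pool.allocate("pod-b")
pool.release(first)              # pod-a is deleted, its address goes back
```

The sketch keeps the two roles apart on purpose: only `refill` talks to the "supplier", while `allocate` and `release` touch nothing but the node-local pool, which mirrors the split between cilium-operator and cilium-agent described in the article.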