The Road to Containerization at 36Kr, a Technology and Venture Capital Media Company

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Background","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"36Kr is a media company founded in 2010 that focuses on technology and venture capital. Its business scenarios are not complex: the front end renders with NodeJS, the mobile apps cover both Android and iOS, and almost all back-end services are written in PHP. PHP was chosen during the initial technology selection because it made web development efficient, and the choice simply carried forward from there.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Later, however, the business grew rapidly while the program design was never decoupled, so many services fused into one bloated monolithic application with tightly coupled logic, which in turn caused many performance problems. As the problems became harder to fix and development schedules became tighter, fixes kept being postponed, and the longer they were postponed the harder they became. This vicious circle left a large amount of technical debt that hindered later development, and when something went wrong the root cause was hard to trace. A phrase we heard often in those days was “this is a legacy issue”.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"B/S, C/S, and monolithic applications form a traditional and simple architecture, but its shortcomings were fully exposed: a performance problem in a single piece of business logic frequently affected every other service. On the operations side, the only available responses were adding machines and upgrading instance specifications. We spent a lot on machines and manpower with little to show for it, and we were always on the back foot.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The situation had become urgent. The technical team finally decided to rebuild in Java and split the monolith into microservices, to put an end to failures in the monolithic application causing large-scale outages in production.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Requirements Analysis + Selection","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some time after the refactoring began, we ran multiple Java programs on each virtual machine to save VM resources. Without resource isolation and a flexible scheduling system, though, this still wasted some resources, and under high concurrency, resource contention occasionally let one application affect another. To address this, the operations team built an automated deployment system with basic functions such as deployment, monitoring and health checks, rollback on failed deployment, and restart.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With K8s becoming popular at the time and Rancher 2.x being released, we gradually realized that they could solve essentially all of the problems we faced: resource isolation, the Deployment controller model, and a flexible scheduling system were all there. This was exactly the automated deployment system we wanted, so the operations side decided to move to containers.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For selection, because almost all of our services run on Alibaba Cloud, Alibaba Cloud was the first option we considered. We also had some business dealings with Huawei, so Huawei's CCE was a candidate as well, but since all of our service resources were on Alibaba Cloud, the migration cost would have been too high, and we dropped Huawei Cloud from consideration.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We had used Rancher 1.6 early on, although only to manage native Docker deployed on hosts, and that experience left us with a very favorable impression of Rancher's products.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In terms of requirements, the container management platform had to be easy to use in order to reduce the learning cost for our developers. Basic K8s functionality was mandatory. Because K8s was still evolving quickly, we needed to be able to keep up with new releases and to patch security vulnerabilities as soon as they were disclosed, and we needed basic access control. We also had no dedicated K8s team in house and only two operations engineers, so access to professionals for technical discussion, and to a professional service team that could help when problems occurred, was very important.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"All things considered, Rancher was the clear winner. Its UI is very friendly and developers can get started quickly; it iterates fast; detailed patching guidance is published when vulnerabilities are found; its authentication supports our OpenLDAP setup, which lets us give developers, testers, and operations staff different permissions; and, to our knowledge, it was the first to support multi-cloud environments, which makes a future cross-cloud solution easier.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our containerization effort mainly involved the following considerations. Below I share some of our practices on Rancher, in the hope that they are useful to you:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Containerizing the applications","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Rancher high availability","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Container operations","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Multi-tenant isolation","attrs":{}}]}],"attrs":{}}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Containerizing the Applications","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A considerable number of our developers had never worked with containers. To make things friendlier for them, we split our images into two layers: the main Dockerfile is written by the operations team, while the Dockerfile in each developer's code repository is minimal, containing little more than the code copy step and a few required variables. See the following example:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"\n## Example of the Dockerfile maintained by the operations team\n## This example is for reference only\nFROM alpine:3.8\nMAINTAINER yunwei \nWORKDIR /www\n# Switch apk to the Aliyun mirror, then refresh and upgrade packages\nRUN mv /etc/apk/repositories /etc/apk/repositories.bak \\\n && echo \"http://mirrors.aliyun.com/alpine/v3.8/main/\" >> /etc/apk/repositories \\\n && apk update && apk upgrade\n# Install the precompiled glibc package (alpine-pkg-glibc) that the JDK needs\nRUN apk --no-cache add ca-certificates wget && \\\n wget -q -O /etc/apk/keys/sgerrand.rsa.pub https://alpine-pkgs.sgerrand.com/sgerrand.rsa.pub && \\\n wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.29-r0/glibc-2.29-r0.apk && \\\n apk add glibc-2.29-r0.apk && rm -f glibc-2.29-r0.apk\n# Common troubleshooting tools and the Asia/Shanghai time zone\nRUN apk add -U --no-cache \\\n bash \\\n sudo \\\n tzdata \\\n drill \\\n iputils \\\n curl \\\n busybox-extras \\\n && rm -rf /var/cache/apk/* \\\n && ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime\nCOPY java-jar/jdk1.8.0_131 /usr/local/jdk1.8.0_131\nENV TZ=\"Asia/Shanghai\"\nENV JAVA_HOME=/usr/local/jdk1.8.0_131\nENV CLASSPATH=$JAVA_HOME/bin\nENV PATH=.:$JAVA_HOME/bin:$PATH\nENV JAVA_OPTS=\"-server -Xms1024m -Xmx1024m\"\n# JVM options go before -jar; server.jar is added by the developer image\nCMD java $JAVA_OPTS -Dserver.port=8080 -jar server.jar\n\n=======================================\n\n## Example of the Dockerfile maintained by developers\nFROM harbor.36kr.com/java:v1.1.1\nMAINTAINER developer \nADD web.jar ./server.jar\n","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As you can see, the Dockerfile maintained by developers is very simple, which greatly reduces the maintenance burden on them.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, because the size of the build artifact largely determines how long a deployment takes, we use alpine, which is billed as the smallest base image. alpine has many advantages:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Small size","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A package manager and a rich set of packages","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Backing from major vendors: it is officially used by several large companies, including Docker","attrs":{}}]}],"attrs":{}}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It does have one drawback: alpine does not ship glibc and uses musl libc, a small-footprint alternative, whereas the JDK we use depends on glibc. Fortunately, someone has already addressed this and published a precompiled glibc package on GitHub named alpine-pkg-glibc. Installing it lets Java run properly while keeping the image small.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Rancher High Availability","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There are two ways to install Rancher: ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"single-node installation and high-availability cluster installation","attrs":{}},{"type":"text","text":". Single-node installation is generally suitable only for test or demo environments, so for production use the high-availability cluster installation is recommended.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We initially used a single-node installation for our test environment. Later the Rancher Server machine rebooted once, which took the test environment down, and although we had backups, we still lost a small amount of data. In the end we moved the test environment to an HA deployment as well; the overall architecture is shown in the figure below.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I installed Rancher Server with RKE. To guard against a zone-level failure on Alibaba Cloud, we placed the three Rancher Server machines in two availability zones: Rancher Server-001 and 003 are in Beijing Zone H, and Rancher Server-002 is in Beijing Zone G.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For load balancing we use Alibaba Cloud SLB, purchased as a primary/standby instance to avoid a single point of failure. Rancher requires an SSL certificate and we have our own domain certificate, so to make certificate maintenance easier we use a layer 7 listener and terminate SSL on the SLB. The Rancher Server architecture is shown in the figure below:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9b/9be18015309977e2cbbdacf3672b9f8e.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The downstream clusters, that is, the K8s clusters that carry business workloads, are also split half and half across two Alibaba Cloud availability zones. Note that the network latency between the two zones must stay at or below 15 ms. With that in place we have a highly available disaster-recovery architecture.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For backups, we use two approaches: Alibaba Cloud ECS snapshots, and etcd backups written over the S3 protocol to Alibaba Cloud OSS object storage, so that service can be restored promptly after a failure.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For a detailed deployment tutorial, see the official Rancher documentation:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"underline","attrs":{}}],"text":"https://docs.rancher.cn/docs/rancher2/installation_new/resources/advanced/rke-add-on/layer-7-lb/_index/","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Container Operations","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Container operations here mainly means ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"log collection","attrs":{}},{"type":"text","text":" and ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"container monitoring","attrs":{}},{"type":"text","text":". For monitoring, Rancher ships with Prometheus and Grafana and integrates them into the Rancher UI, which is very convenient, so I will not go into monitoring and will focus on log collection.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In K8s, log collection is more complex than on traditional physical or virtual machines, because K8s provides a dynamic environment and approaches such as binding a hostPath are not suitable. The following table gives a direct comparison:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/be/be6b8698011ab14fbe21cdbe72abd990.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the table shows, K8s has more kinds of logs to collect, and a containerized deployment runs a large number of applications on each machine, all of them dynamic, so traditional collection methods do not suit K8s.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Log collection in K8s currently falls broadly into two categories, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"passive collection","attrs":{}},{"type":"text","text":" and ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"active push","attrs":{}},{"type":"text","text":".","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Active push","attrs":{}},{"type":"text","text":" generally takes one of two forms, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"DockerEngine","attrs":{}},{"type":"text","text":" and ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"direct write from the application","attrs":{}},{"type":"text","text":". DockerEngine refers to Docker's built-in LogDriver, which generally collects only STDOUT and is usually not recommended. With direct write, the application integrates a log collection SDK and sends logs straight to the collector through it. Logs do not need to be written to disk and no Agent has to be deployed, but the application becomes tightly bound to the SDK and flexibility is low. It is recommended for scenarios with large log volumes or custom logging requirements.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Passive collection","attrs":{}},{"type":"text","text":" deploys a log collection Agent and comes in two forms: DaemonSet, with one Agent deployed on each node, and Sidecar, with one Agent deployed alongside each Pod as a sidecar.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sidecar deployment consumes more resources, since every Pod gets its own Agent, but it offers greater flexibility and isolation and suits large K8s clusters or clusters that serve business teams as a PaaS platform. DaemonSet deployment consumes fewer resources and suits clusters with a single function and few services.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our own clusters are small and do not run many services, so we chose DaemonSet deployment. In the test environment, after some research, we chose log-pilot, a log collection component open-sourced by Alibaba. Its GitHub address is ","attrs":{}},{"type":"text","marks":[{"type":"underline","attrs":{}}],"text":"github.com/AliyunContainerService/log-pilot","attrs":{}},{"type":"text","text":". Combined with Elasticsearch and Kibana, it makes a decent K8s logging solution.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Our servers are all on Alibaba Cloud and we have only two operations engineers, so we do not have the capacity to maintain a large distributed storage cluster. We therefore store our business logs in Alibaba Cloud Log Service, and our production K8s clusters use it as well. It currently handles more than 600 million log entries per day without any problems.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To collect logs with Alibaba Cloud, you first enable Alibaba Cloud Log Service and then install the Logtail components through the alibaba-log-controller Helm package. The official documentation provides an installation script, and the documentation link is given below. The installation automatically creates the aliyunlogconfigs CRD, deploys the alibaba-log-controller Deployment, and finally installs Logtail as a DaemonSet. You can then add the logs you want to collect from the console. After installation it looks like this:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4a/4a185cd4a56b546af7fc9c0a12ab5e5d.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Logtail can collect text logs produced inside containers and upload them to Log Service together with the relevant container metadata. Kubernetes file collection has the following features:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Only the log path inside the container needs to be configured; the mapping of that path to the host does not matter","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Select the containers to collect from by Label","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Exclude specific containers by Label","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Select the containers to collect from by environment variable","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Exclude containers by environment variable","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Multi-line logs (for example, Java stack traces)","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Automatic tagging of Docker container data","attrs":{}}]}],"attrs":{}},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Automatic tagging of Kubernetes container data","attrs":{}}]}],"attrs":{}}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To learn more, see the official Alibaba Cloud Log Service documentation:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"underline","attrs":{}}],"text":"https://help.aliyun.com/document_detail/157317.html?spm=a2c4g.11186623.6.621.193c25f44oLO1V","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Multi-Tenant Isolation for Containers","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What I discuss here is multi-tenant isolation for users inside an enterprise, not multi-tenant isolation in SaaS or KaaS service models.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On permissions, our company controls access fairly strictly, and Rancher provides convenient access control at several levels of granularity, including cluster, project, and namespace. It also supports our OpenLDAP-based authentication, which makes administration easy: I can grant developers and testers in different project teams access to the corresponding clusters, projects, or namespaces.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, as shown below, I can add users to a cluster or to a particular Project, assign them one of several roles, and even define custom roles.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e6/e6ff2c67e50b5c7d02c8b087372bfd2e.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Scenario 1: I can give a project team lead Owner permission on Project 1 in the development cluster. The team lead can then freely add team members to the project and assign them appropriate permissions.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Scenario 2: I can give the test manager Owner permission on the test cluster. The test manager then decides who handles test deployments for each project, and can restrict developers to viewing logs only.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On resources, always set resource quotas for containers. Without limits, a performance problem in one application affects every application on the same node. K8s then reschedules the failing application onto other nodes, and if you do not have enough spare resources the whole system can grind to a halt in a cascading failure.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/34/342332e0ca516a28e800ed6f9726d4af.png","alt":"Image","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Resource limits for Java applications also have a pitfall. By default Java obtains memory information from /proc/meminfo, and the JVM uses 25% of system memory as its Max Heap Size by default. But /proc/meminfo inside a container is the host's file mounted read-only, so the default does not work: the application exceeds the container's memory limit and is OOM-killed, the health check then restarts the service, and the application ends up restarting over and over.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So can we simply set the JVM memory manually to equal the container memory limit? No! The JVM consumes more than just the heap. The JVM is itself an application and needs extra space to do its work, so the limit you configure should cover Metaspace + threads + heap + the memory the JVM process itself needs + other data. This topic involves a lot of material, so I will not expand on it here; if you are interested, you can search for it on Google.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because our business scenarios are not complex, our road to containerization was relatively smooth. We have very few operations engineers, only two, so we do not have much time or energy to maintain many self-built systems. We use many Alibaba Cloud products, as well as Rancher, whose easy deployment, friendly UI, and integrated monitoring gave us a great deal of confidence on this road.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We built images in two layers to lower the learning curve for developers. We used the small alpine image plus a precompiled glibc to reduce image size and shorten deployment time. Architecturally, we adopted a disaster-recovery design across two Alibaba Cloud availability zones together with a thorough backup plan. A log collection component deployed as a DaemonSet ships logs to Alibaba Cloud Log Service and supports our logging system of 600 million entries per day. Rancher also gives us a deeply integrated monitoring system, multi-tenant isolation, and more. And then there are the resource quota settings we learned the hard way.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Containerization is actually not complicated. Without K8s, we would have to build our own health checking and release systems and maintain differing host environments, and we could not partition resources at a fine granularity or use compute resources as efficiently. What is the main job of operations? In my view it is to save cost and improve efficiency. Virtualization, automation, intelligence, high performance, high availability, high concurrency, and so on all revolve around those two words, cost and efficiency. K8s has already done this work for us, and a management platform like Rancher lowers the learning curve of K8s. So all you need to do is get on board with K8s. That is the end of this talk. Thank you!","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Community Q&A","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q1: Do you have a recommendation for a highly available storage solution for K8s in production?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A1: There is no standard answer for storage. We mainly use Alibaba Cloud, so we use Alibaba Cloud block storage. Other common options include Ceph, GlusterFS, Portworx, and OpenEBS. Each has its strengths and weaknesses, so choose according to your own business needs.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q2: For canary releases, Kubernetes network traffic can be split at the network level through a service mesh. But when a major application version update involves database schema changes, how do you do a canary release?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A2: We have not encountered this scenario, but here is one idea: prepare two databases and have the traffic split route to different databases as well. You would need to verify for yourself whether this is feasible.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Keep in mind that these are two layers, a logic layer and a data layer, and they should not be conflated.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q3: What do you use for your Pipeline? With a Pipeline, how do you handle needing to test multiple versions of the same branch in parallel? I use Rancher's Pipeline and find it quite limiting: the same branch cannot run multiple parallel test setups. I do use namespaces, but for a given branch the namespace is written in .rancher.yml, so they cannot be told apart, and Rancher's Pipeline does not allow variables to be injected from outside to distinguish them.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A3: Rancher's Pipeline is still not very flexible at the moment. We use a self-hosted Jenkins for our Pipeline. For parallel testing, you can isolate with namespaces or other isolation strategies, or prepare multiple test environments.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q4: How are the operations Dockerfile and the developer Dockerfile combined?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A4: The developer Dockerfile uses FROM to build on the image produced by the operations Dockerfile.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q5: What tool do you use for vulnerability scanning in K8s? What severity of image vulnerability generally needs to be fixed?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A5: We do not use a vulnerability scanner for now. We mainly apply fixes based on the remediation advice in Rancher enterprise service notifications.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q6: Say I want to log in to and manage containers from the public network through the service IP. To do that the service IP has to be exposed, so how do I expose it? Please explain.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A6: If the requirement is to manage containers, you can use Rancher's user access control to give a user permissions on a particular container. Exposing the service IP to the public network so that users can manage containers is not possible.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q6 (follow-up): OK, thanks. There is still one thing I do not understand: is there any way to expose this service IP? Do you mean letting different users manage different containers through the Rancher platform? Could you explain again? Thanks.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A6: You can expose it with NodePort and access it through the Node IP and port, or use a public cloud load balancer product.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q6 (follow-up): That is not what I mean. I want to expose the service IP itself, not just access it from inside the cluster.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A6: The service IP is internal to K8s by nature. It cannot be exposed, only forwarded to.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q7: Why not use three availability zones? If Zone H goes down, will the cluster become inaccessible?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A7: Three availability zones would of course work too. With the Rancher HA architecture, it is fine as long as one Server is available.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q8: What does the pipeline flow look like for your multiple development and test environments (the differences between them)? Do you use helm template? Could you share more details?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A8: At the moment we use Jenkins deployment parameters. At deployment time you can select the namespace, environment identifier, branch, and so on, and the template is modified with sed.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q9: What does your DevOps flow look like? Is there one Docker image per environment, or do test, pre, and prd share one Docker image? If one image is shared across test, pre, and prd, how is that done (for example, configuration for different environments and the collaborative development flow)?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A9: We use the same image. At deployment time we select a different environment identifier parameter and the program automatically injects the configuration for that environment. This requires developers to make some corresponding configuration changes.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Q10: I do not really understand how to configure container resource limits. I configured them but they do not seem to take effect.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A10: Rancher lets you set them at three levels of granularity: project, namespace, and Pod. Precedence runs in the reverse order.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"About the author:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"田翰明 is an operations development engineer at 36Kr, where he is mainly responsible for operations automation, building CI/CD, and driving application containerization.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original link: ","attrs":{}},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s/AJ4njSUJKDQUAIgW706L0A","title":""},"content":[{"type":"text","text":"The Road to Containerization at 36Kr, a Technology and Venture Capital Media Company","attrs":{}}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
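The JVM memory pitfall described in the multi-tenant isolation section (container limit = Metaspace + threads + heap + JVM process memory + other data) can be illustrated with a small calculation. This is only a sketch under stated assumptions: the function names `jvm_heap_mib` and `java_opts` are hypothetical, and the reserve figures (256 MiB Metaspace, 200 threads at 1 MiB of stack each, 128 MiB of other overhead) are placeholders, not values from the article. Real reserves should come from measuring the application.

```python
"""Rough JVM memory budget for a container (illustrative sketch only)."""


def jvm_heap_mib(container_limit_mib, metaspace_mib=256, thread_count=200,
                 thread_stack_mib=1, overhead_mib=128):
    """Return the heap size (MiB) left after non-heap memory is reserved.

    All default values are hypothetical placeholders, not measured figures.
    """
    # Non-heap reserve: Metaspace + thread stacks + other JVM/process overhead.
    non_heap_mib = metaspace_mib + thread_count * thread_stack_mib + overhead_mib
    heap_mib = container_limit_mib - non_heap_mib
    if heap_mib <= 0:
        raise ValueError("container limit is too small for the non-heap reserve")
    return heap_mib


def java_opts(container_limit_mib, metaspace_mib=256, **reserve):
    """Build a JAVA_OPTS string whose heap fits inside the container limit."""
    heap_mib = jvm_heap_mib(container_limit_mib, metaspace_mib=metaspace_mib,
                            **reserve)
    return (f"-server -Xms{heap_mib}m -Xmx{heap_mib}m "
            f"-XX:MaxMetaspaceSize={metaspace_mib}m")


if __name__ == "__main__":
    # Example: a container limited to 2048 MiB.
    print(java_opts(2048))
```

The point of the sketch is the direction of the calculation: start from the container limit and subtract the non-heap reserve to get `-Xmx`, rather than setting `-Xmx` equal to the limit.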