Service Mesh - Istio in Practice (Part 1)

Project preparation and the build process

A typical CI/CD pipeline - DevOps


The GitOps continuous delivery process

  • GitOps: a continuous delivery approach to cluster management and application distribution
  • GitOps differs from typical CI/CD; the biggest difference is that Git is used as the source of truth, holding declarative infrastructure and application definitions
  • Git sits at the center of the delivery pipeline: configuration files, such as Kubernetes YAML manifests, are stored and managed in Git (see the repository sketch below)
  • Developers complete deployment and operations tasks entirely through pull requests, with no need for separate CI/CD tooling
  • Benefits: higher productivity, a better developer experience, consistency and standardization, and security
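As a minimal sketch of what that looks like in practice (this mirrors the demo repository built later in this article, not an official layout), the repository simply holds the manifests that the GitOps operator applies:

service-mesh-demo/            # the Git repository is the single source of truth
└── config/                   # declarative manifests watched by the GitOps operator
    ├── httpbin.yaml          # Deployment + Service + ServiceAccount
    └── sleep.yaml            # a change arrives as a commit / pull request and is then synced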

Push vs. pull pipelines:

Building and releasing an application with Flux

Flux's official definition:

  • The GitOps operator for Kubernetes
  • An automated deployment tool (based on GitOps)
  • Features:
    • Automatic synchronization and automatic deployment
    • Declarative
    • Driven by code (pull requests) rather than containers

Preparation

First, prepare a Kubernetes cluster and install Istio into it.

As shown in the figure below, we are going to deploy a mesh made up of two services, plus a gateway and an external service: minimal, yet complete:
(figure: mesh topology - the sleep and httpbin services, an ingress gateway, and an external service)

  • From the call chain you can see that sleep plays the client role and httpbin the server role

Prepare a Git repository:

Installing Flux

Official documentation:

First, install the fluxctl command-line tool: download the executable from its GitHub releases page, move it into /usr/bin, and make it executable:

[root@m1 /usr/local/src]# mv fluxctl_linux_amd64 /usr/bin/fluxctl
[root@m1 ~]# chmod a+x /usr/bin/fluxctl 
[root@m1 ~]# fluxctl version
1.21.0
[root@m1 ~]# 

Create a namespace for Flux, then deploy the Flux operator into the Kubernetes cluster:

[root@m1 ~]# kubectl create ns flux
namespace/flux created
[root@m1 ~]# git clone https://github.com/fluxcd/flux.git
[root@m1 ~]# cd flux/

Before deploying Flux, a few Git-related settings need to be changed to your own repository's user name, email, URL, and so on:

[root@m1 ~/flux]# vim deploy/flux-deployment.yaml  # edit the following options
...
        # Replace the following URL to change the Git repository used by Flux.
        # HTTP basic auth credentials can be supplied using environment variables:
        # https://$(GIT_AUTHUSER):$(GIT_AUTHKEY)@github.com/user/repository.git
        - [email protected]:fluxcd/flux-get-started
        - --git-branch=master
        # Include this if you want to restrict the manifests considered by flux
        # to those under the following relative paths in the git repository
        # - --git-path=subdir1,subdir2
        - --git-label=flux-sync
        - --git-user=Flux automation
        - [email protected]

Once the changes are made, deploy:

[root@m1 ~/flux]# kubectl apply -f deploy
[root@m1 ~/flux]# kubectl get all -n flux
NAME                            READY   STATUS    RESTARTS   AGE
pod/flux-65479fb87-k5zxb        1/1     Running   0          7m20s
pod/memcached-c86cd995d-5gl5p   1/1     Running   0          44m

NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/memcached   ClusterIP   10.106.229.44   <none>        11211/TCP   44m

NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/flux        1/1     1            1           44m
deployment.apps/memcached   1/1     1            1           44m

NAME                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/flux-65479fb87        1         1         1       7m20s
replicaset.apps/memcached-c86cd995d   1         1         1       44m
[root@m1 ~]# 

Besides the approach above, Flux can also be deployed from the command line:

fluxctl install \
--git-user=xxx \
--git-email=xxx@xxx \
[email protected]:xxx/smdemo \
--namespace=flux | kubectl apply -f -

Since we are using a private repository, a few extra steps are needed: the Git host's key has to be added to the ~/.ssh/known_hosts file inside the Flux daemon container. The steps are as follows:

[root@m1 ~]# kubectl exec -n flux flux-65479fb87-k5zxb -ti -- \
    env GITHOST="gitee.com" GITREPO="[email protected]:demo_focus/service-mesh-demo.git" PS1="container$ " /bin/sh
container$ ssh-keyscan $GITHOST >> ~/.ssh/known_hosts   # add the host key
container$ git clone $GITREPO   # verify that the repository can be cloned
Cloning into 'service-mesh-demo'...
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 10 (delta 2), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (10/10), done.
Resolving deltas: 100% (2/2), done.
container$ 

With Flux deployed, we need to add the deploy key it generated to the Git repository (with read/write permission). The deploy key can be obtained with:

[root@m1 ~]# fluxctl identity --k8s-fwd-ns flux
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDsyfN+x4jen+Ikpff8LszXLFTwXSQviFxCrIx7uMy7LJM5uUEsDdFs/DZL1g9h/YnkfLJlFrxOCJ+tuqPrXuj3ceEFfal4T3YWiDwf1RsGJvJd6ED5APjsxyu5gkj9LvkOB8OlYwPlS8Pygv997n93gtH7rFbocK5EQpbhhBlue3Or2ufI/KBxDCx6xLaH9U/16EEi+BDVSsCetGIQI+TSRqqpN30+Y8paS6iCYajKTubKv7x44WaVFgSDT9Y/OycUq1LupJoVoD8/5Y2leUMaF9dhMbQgoc8zjh8q2HF2n97mAvgYWJosjeIcAKS82C0zPlPupPevNedAhhEb82svPWh7BI4N4XziA06ypAEmfEz3JuUTTeABpF2hEoV4UEagkSyS8T3xhfdjigVcKiBW5AqRsRyx+ffW4WREHjARSC8CKl0Oj00a9FOGoNsDKkFuTbJePMcGdgvjs61UlgUUjdQFfHoZz2UVo2OEynnCpY7hj5SrEudkujRon4HEhJE= root@flux-7f5f7776df-l65lx
[root@m1 ~]# 

Copy the key and add it to the Git repository:

Deploying the application

Create a separate namespace for the application and label it with istio-injection=enabled so that Istio can inject the sidecar proxy:

[root@m1 ~]# kubectl create ns demo
namespace/demo created
[root@m1 ~]# kubectl label namespace demo istio-injection=enabled
namespace/demo labeled
[root@m1 ~]# 

Clone the Git repository locally and create a config directory in it:

[root@m1 ~]# git clone [email protected]:demo_focus/service-mesh-demo.git
[root@m1 ~]# cd service-mesh-demo/
[root@m1 ~/service-mesh-demo]# mkdir config

Create the services' configuration files under that directory:

[root@m1 ~/service-mesh-demo]# vim config/httpbin.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
  namespace: demo
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: demo
  labels:
    app: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80

[root@m1 ~/service-mesh-demo]# vim config/sleep.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sleep
  namespace: demo
---
apiVersion: v1
kind: Service
metadata:
  name: sleep
  namespace: demo
  labels:
    app: sleep
spec:
  ports:
  - port: 80
    name: http
  selector:
    app: sleep
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleep
  namespace: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep
  template:
    metadata:
      labels:
        app: sleep
    spec:
      serviceAccountName: sleep
      containers:
      - name: sleep
        image: governmentpaas/curl-ssl
        command: ["/bin/sleep", "3650d"]
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: /etc/sleep/tls
          name: secret-volume
      volumes:
      - name: secret-volume
        secret:
          secretName: sleep-secret
          optional: true

Commit the configuration files and push them to the remote repository to update the git repo:

[root@m1 ~/service-mesh-demo]# git add .
[root@m1 ~/service-mesh-demo]# git commit -m "commit yaml"
[root@m1 ~/service-mesh-demo]# git push origin master

Run the following command to make Flux sync the repository changes and deploy them automatically:

[root@m1 ~]# fluxctl sync --k8s-fwd-ns flux
Synchronizing with ssh://[email protected]/demo_focus/service-mesh-demo
Revision of master to apply is 49bc37e
Waiting for 49bc37e to be applied ...
Done.
[root@m1 ~]# 
  • By default Flux syncs automatically every 5 minutes, so the manual step isn't strictly required (see the note below on tuning the interval)
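If five minutes is too slow (or too frequent) for your workflow, the interval can be adjusted in the Flux deployment's arguments. A minimal sketch, assuming the Flux v1 daemon flags --git-poll-interval and --sync-interval (verify against the daemon's --help output for your version):

        # deploy/flux-deployment.yaml (excerpt, illustrative values)
        - --git-poll-interval=1m   # how often to poll the Git repository for new commits
        - --sync-interval=1m       # how often to re-apply the manifests to the cluster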

Looking at the resources in the demo namespace now, we can see that Flux has automatically deployed all of the services for us:

[root@m1 ~]# kubectl get all -n demo
NAME                           READY   STATUS    RESTARTS   AGE
pod/httpbin-74fb669cc6-v9lc5   2/2     Running   0          36s
pod/sleep-854565cb79-mcmnb     2/2     Running   0          40s

NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/httpbin   ClusterIP   10.105.17.57    <none>        8000/TCP   36s
service/sleep     ClusterIP   10.103.14.114   <none>        80/TCP     40s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/httpbin   1/1     1            1           36s
deployment.apps/sleep     1/1     1            1           40s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/httpbin-74fb669cc6   1         1         1       36s
replicaset.apps/sleep-854565cb79     1         1         1       40s
[root@m1 ~]# 

Test that the services can reach each other:

[root@m1 ~]# kubectl exec -it -n demo sleep-854565cb79-mcmnb -c sleep -- curl http://httpbin.demo:8000/ip
{
  "origin": "127.0.0.1"
}
[root@m1 ~]# 

Implementing automated canary releases

The canary release process


Automated canary releases - Flagger

A canary release is a rolling upgrade that shifts traffic over little by little. Doing it manually is inefficient and error-prone, so we want an automated canary release tool such as Flagger:

  • Flagger: an automated canary release tool open-sourced by Weaveworks
  • Supports multiple service mesh products: Istio, Linkerd, AWS App Mesh
  • Monitors metrics to track the state of the canary release
  • Notifications (Slack, Microsoft Teams)

Flagger's workflow:

Installing Flagger

Official documentation:

Add Flagger's Helm repository:

[root@m1 ~]# helm repo add flagger https://flagger.app
"flagger" has been added to your repositories
[root@m1 ~]# 

Create Flagger's CRDs:

[root@m1 ~]# kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/main/artifacts/flagger/crd.yaml
[root@m1 ~]# kubectl get crd |grep flagger
alertproviders.flagger.app                            2020-12-23T14:40:00Z
canaries.flagger.app                                  2020-12-23T14:40:00Z
metrictemplates.flagger.app                           2020-12-23T14:40:00Z
[root@m1 ~]# 

Deploy Flagger into the istio-system namespace using Helm:

[root@m1 ~]# helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://prometheus.istio-system:9090

Optionally, add a Slack webhook to Flagger so it can send notifications to a Slack channel:

[root@m1 ~]# helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set slack.url=https://hooks.slack.com/services/xxxxxx \
--set slack.channel=general \
--set slack.user=flagger

Besides Slack, we can also set up Flagger's Grafana, which comes with a canary dashboard that makes it easy to watch the progress of a release:

[root@m1 ~]# helm upgrade -i flagger-grafana flagger/grafana \
--namespace=istio-system \
--set url=http://prometheus.istio-system:9090 \
--set user=admin \
--set password=admin

With all of the above done, confirm that Flagger has been deployed:

[root@m1 ~]# kubectl get pods -n istio-system 
NAME                                    READY   STATUS    RESTARTS   AGE
flagger-b68b578b-5f8bh                  1/1     Running   0          7m50s
flagger-grafana-77b8c8df65-7vv89        1/1     Running   0          71s
...

Create an ingress gateway for the mesh:

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
EOF

We can also deploy a load-testing tool; this is optional as well:

[root@m1 ~]# kubectl create ns test
namespace/test created
[root@m1 ~]# kubectl apply -k https://github.com/fluxcd/flagger/tree/main/kustomize/tester
[root@m1 ~]# kubectl get pods -n test
NAME                                  READY   STATUS    RESTARTS   AGE
flagger-loadtester-64695f854f-5hsmg   1/1     Running   0          114s
[root@m1 ~]# 

If the method above is slow, you can also clone the repository and deploy the tester from it:

[root@m1 ~]# cd /usr/local/src
[root@m1 /usr/local/src]# git clone https://github.com/fluxcd/flagger.git
[root@m1 /usr/local/src]# kubectl apply -k flagger/kustomize/tester/

Canary release configuration

Configure an HPA for the httpbin service so that it can scale dynamically. This is also optional, but configuring an HPA is generally recommended:

[root@m1 ~]# kubectl apply -n demo -f - <<EOF
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: httpbin
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: httpbin
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: cpu
      # scale up if usage is above
      # 99% of the requested CPU (100m)
      targetAverageUtilization: 99
EOF

Create the metric used to validate the canary release; Flagger will shift traffic gradually based on this metric:

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: latency
  namespace: istio-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.istio-system:9090
  query: |
    histogram_quantile(
        0.99,
        sum(
            rate(
                istio_request_duration_milliseconds_bucket{
                    reporter="destination",
                    destination_workload_namespace="{{ namespace }}",
                    destination_workload=~"{{ target }}"
                }[{{ interval }}]
            )
        ) by (le)
    )
EOF

Create Flagger's Canary resource with the configuration below; everything related to the canary release is defined here:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: httpbin
  namespace: demo
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: httpbin
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    name: httpbin
  service:
    # service port number
    port: 8000
    # container port number or name (optional)
    targetPort: 80
    # Istio gateways (optional)
    gateways:
    - public-gateway.istio-system.svc.cluster.local
  analysis:
    # schedule interval (default 60s)
    interval: 30s
    # max number of failed metric checks before rollback
    threshold: 5
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 100
    # canary increment step
    # percentage (0-100)
    stepWeight: 20
    metrics:
    - name: request-success-rate
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      thresholdRange:
        min: 99
      interval: 1m
    - name: latency
      templateRef:
        name: latency
        namespace: istio-system
      # maximum req duration P99
      # milliseconds
      thresholdRange:
        max: 500
      interval: 30s
    # testing (optional)
    webhooks:
      - name: load-test
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://httpbin-canary.demo:8000/headers"

Once the Canary has been created, you'll notice that it has automatically created httpbin resources named with a primary suffix in the cluster, plus a VirtualService whose routing rules point to the httpbin-primary and httpbin-canary services:

[root@m1 ~]# kubectl get pods -n demo
NAME                             READY   STATUS    RESTARTS   AGE
httpbin-74fb669cc6-6ztkg         2/2     Running   0          50s
httpbin-74fb669cc6-vfs4h         2/2     Running   0          38s
httpbin-primary-9cb49747-94s4z   2/2     Running   0          3m3s
httpbin-primary-9cb49747-xhpcg   2/2     Running   0          3m3s
sleep-854565cb79-mcmnb           2/2     Running   0          94m
[root@m1 ~]# kubectl get svc -n demo
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
httpbin           ClusterIP   10.105.17.57    <none>        8000/TCP   86m
httpbin-canary    ClusterIP   10.99.206.196   <none>        8000/TCP   3m14s
httpbin-primary   ClusterIP   10.98.196.235   <none>        8000/TCP   3m14s
sleep             ClusterIP   10.103.14.114   <none>        80/TCP     95m
[root@m1 ~]# kubectl get vs -n demo
NAME      GATEWAYS                                            HOSTS         AGE
httpbin   ["public-gateway.istio-system.svc.cluster.local"]   ["httpbin"]   3m29s
[root@m1 ~]# 

Then trigger a canary release with the following command:

[root@m1 ~]# kubectl -n demo set image deployment/httpbin httpbin=httpbin-v2
deployment.apps/httpbin image updated
[root@m1 ~]# 
  • Tip: changes to the Deployment's pod spec, to ConfigMaps, and to Secrets all trigger a new analysis (see the example below)
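For instance, a change to the pod spec other than the image also starts a new analysis; a hypothetical example (the environment variable name is made up purely for illustration):

$ kubectl -n demo set env deployment/httpbin DEMO_FLAG=v2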

Looking at the Canary's events, we can see the new revision has been detected:

[root@m1 ~]# kubectl describe canary httpbin -n demo
...
Events:
  Type     Reason  Age                  From     Message
  ----     ------  ----                 ----     -------
  ...
  Normal   Synced  2m57s                flagger  New revision detected! Scaling up httpbin.demo
  Warning  Synced  27s (x5 over 2m27s)  flagger  canary deployment httpbin.demo not ready: waiting for rollout to finish: 1 out of 2 new replicas have been updated

Checking httpbin's VirtualService at this point, we can see that 20% of the traffic has been shifted to the canary version:

[root@m1 ~]# kubectl describe vs httpbin -n demo
...
Spec:
  Gateways:
    public-gateway.istio-system.svc.cluster.local
  Hosts:
    httpbin
  Http:
    Route:
      Destination:
        Host:  httpbin-primary
      Weight:  80
      Destination:
        Host:  httpbin-canary
      Weight:  20
Events:        <none>

Then exec into the sleep pod and call the httpbin service in a loop:

[root@m1 ~]# kubectl exec -it -n demo sleep-854565cb79-mcmnb -c sleep -- sh
/ # while [ 1 ]; do curl http://httpbin.demo:8000/headers;sleep 2s; done

Checking httpbin's VirtualService again, 60% of the traffic has now been shifted to the canary version:

[root@m1 ~]# kubectl describe vs httpbin -n demo
...
Spec:
  Gateways:
    public-gateway.istio-system.svc.cluster.local
  Hosts:
    httpbin
  Http:
    Route:
      Destination:
        Host:  httpbin-primary
      Weight:  40
      Destination:
        Host:  httpbin-canary
      Weight:  60
Events:        <none>

We can open Flagger's Grafana:

[root@m1 ~]# kubectl -n istio-system port-forward svc/flagger-grafana 3000:80 --address 192.168.243.138
Forwarding from 192.168.243.138:3000 -> 3000

It has the following dashboards built in:

The Istio Canary Dashboard shows the progress of the release:

Finally, 100% of the traffic being shifted to the canary version indicates the release is complete:

[root@m1 ~]# kubectl describe vs httpbin -n demo
...
Spec:
  Gateways:
    public-gateway.istio-system.svc.cluster.local
  Hosts:
    httpbin
  Http:
    Route:
      Destination:
        Host:  httpbin-primary
      Weight:  0
      Destination:
        Host:  httpbin-canary
      Weight:  100
Events:        <none>

The traffic-shifting process can also be seen in the httpbin Canary's event log:

[root@m1 ~]# kubectl describe canary httpbin -n demo
  ...
  Normal   Synced  3m44s (x2 over 18m)  flagger  New revision detected! Restarting analysis for httpbin.demo
  Normal   Synced  3m14s (x2 over 18m)  flagger  Starting canary analysis for httpbin.demo
  Normal   Synced  3m14s (x2 over 18m)  flagger  Advance httpbin.demo canary weight 20
  Warning  Synced  2m44s (x2 over 17m)  flagger  Halt advancement no values found for istio metric request-success-rate probably httpbin.demo is not receiving traffic: running query failed: no values found
  Normal   Synced  2m14s                flagger  Advance httpbin.demo canary weight 40
  Normal   Synced  104s                 flagger  Advance httpbin.demo canary weight 60
  Normal   Synced  74s                  flagger  Advance httpbin.demo canary weight 80
  Normal   Synced  44s                  flagger  Advance httpbin.demo canary weight 100

When the release is complete, the httpbin Canary's status changes to Succeeded:

[root@m1 ~]# kubectl get canary -n demo
NAME      STATUS        WEIGHT   LASTTRANSITIONTIME
httpbin   Succeeded   0        2020-12-23T16:03:04Z
[root@m1 ~]# 

Improving the system's resilience

Resilient design is popular in many fields these days. In landscape architecture, for example, resilience means the ability to recover after a disaster, so that the landscape can quickly restore its structure and function. In product design, resilience usually means leaving some slack in a product's form and features so the design can easily be revised.

In distributed systems, resilience generally means giving the system enough fault tolerance and adaptability that it can cope with failures and recover from them quickly. In this section we'll add some resilience to the demo application we deployed earlier.

Measuring system availability

Let's start with a concept: the Service Level Agreement (SLA). An SLA is a mutually agreed contract between a service provider and its customers covering the quality, level, and performance of the service. For example, a provider will typically guarantee its customers a certain level of availability, the "number of nines" we usually talk about.

The formula for system availability:
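The original figure is not reproduced here; the standard formulation, in terms of uptime, or equivalently of mean time between failures (MTBF) and mean time to repair (MTTR), is:

Availability = uptime / (uptime + downtime) = MTBF / (MTBF + MTTR)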

Common availability levels are as follows:
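The original table is an image; as a reference (computed from 365 days × 24 hours), the usual levels work out roughly as follows:

  • 99% (two nines): about 3.65 days of downtime per year
  • 99.9% (three nines): about 8.76 hours per year
  • 99.99% (four nines): about 52.6 minutes per year
  • 99.999% (five nines): about 5.26 minutes per year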

Resilient design

  • One way to cope with failures is to give the system fault tolerance and adaptability
  • Prevent faults from turning into failures
  • This mainly includes:
    • Fault tolerance: retries, idempotency
    • Scalability: automatic horizontal scaling (autoscaling)
    • Overload protection: timeouts, circuit breaking, degradation, rate limiting
    • Resilience testing: fault injection

Resilience capabilities provided by Istio:

  • Timeouts
  • Retries
  • Circuit breaking
  • Fault injection (a sketch follows below)
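Fault injection isn't exercised in the rest of this part, so here is a minimal sketch of what it looks like (the field names follow Istio's VirtualService fault-injection API; the delay values are purely illustrative):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: demo
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gateway
  http:
  - fault:
      delay:
        percentage:
          value: 100      # inject the delay into all requests (illustrative)
        fixedDelay: 5s    # each matched request is held for 5 seconds
    route:
    - destination:
        host: httpbin
        port:
          number: 8000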

Adding resilience to the demo application

First, create a VirtualService for the demo application:

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: demo
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin
        port:
          number: 8000
EOF

Add the first resilience capability: a timeout, configured as follows:

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: demo
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin
        port:
          number: 8000
    timeout: 1s  # configure the timeout
EOF

Timeout rules:

  • When timeout and retries.perTryTimeout are both set
  • Effective timeout = min(timeout, retries.perTryTimeout × retries.attempts)
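For example, with the retry policy shown below (timeout: 8s, attempts: 3, perTryTimeout: 1s), the effective deadline works out to min(8s, 3 × 1s) = 3s.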

On top of the timeout, we can also configure a retry policy:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: demo
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gateway
  http:
  - route:
    - destination:
        host: httpbin
        port:
          number: 8000
    retries:  # configure the retry policy
      attempts: 3
      perTryTimeout: 1s
    timeout: 8s

Retry configuration options:

  • x-envoy-retry-on:5xx, gateway-error, reset, connect-failure…
  • x-envoy-retry-grpc-on:cancelled, deadline-exceeded, internal, unavailable…

Configure circuit breaking:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
  namespace: demo
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100

What this circuit-breaker configuration means (a way to exercise it is sketched after the list):

  • TCP and HTTP connection pools of size 1
  • Only 1 error is tolerated
  • Errors are counted at 1-second intervals
  • Up to 100% of the pods can be ejected from the load-balancing pool
  • A pod that failed can only rejoin 3 minutes after being ejected
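A rough way to see the breaker trip (a sketch, not taken from the original walkthrough, and assuming seq is available in the sleep image) is to fire a burst of concurrent requests from the sleep pod and watch 503s appear once the single-connection pool overflows:

$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- sh -c \
  'for i in $(seq 1 20); do curl -s -o /dev/null -w "%{http_code}\n" http://httpbin.demo:8000/get & done; wait'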

Configuring security policies

Istio's security solution


Istio's security architecture


Hands-on

Create an authorization policy for a specific service (httpbin). Note that no rules are configured, which means the service is denied entirely:

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: httpbin
  namespace: demo
spec:
  selector:
    matchLabels:
      app: httpbin
EOF

This configuration makes the service completely inaccessible; let's test it:

# the request is denied
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"
RBAC: access denied  # the response

# other versions can still be accessed normally
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin-v2.demo:8000/get"
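The tests above and below reference ${sleep_pod_name} (and later ${other_namespace}); one way to set the pod name, as a small helper:

$ sleep_pod_name=$(kubectl get pod -n demo -l app=sleep -o jsonpath='{.items[0].metadata.name}')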

We can restrict where requests may come from with the following configuration, for example requiring that requests originate from the demo namespace (or from the sleep service account):

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: demo
spec:
 action: ALLOW
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/demo/sa/sleep"]
   - source:
       namespaces: ["demo"]
EOF

Test:

# the request is allowed
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"

# the request is denied
$ kubectl exec -it -n ${other_namespace} ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"

# after changing the service account to ${other_namespace}'s, the request is allowed

Besides restricting the request source, we can also allow only specific endpoints to be accessed:

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: demo
spec:
 action: ALLOW
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/demo/sa/sleep"]
   - source:
       namespaces: ["demo"]
   to:
   - operation:
       methods: ["GET"]
       paths: ["/get"]
EOF

Test:

# the request is allowed
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"

# the request is denied
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/ip"

Other conditions can also be configured, such as required request headers; this is typically used when clients must carry a specific header to be allowed to access an endpoint:

[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: demo
spec:
 action: ALLOW
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/demo/sa/sleep"]
   - source:
       namespaces: ["demo"]
   to:
   - operation:
       methods: ["GET"]
       paths: ["/get"]
   when:
   - key: request.headers[x-rfma-token]
     values: ["test*"]
EOF

Test:

# the request is denied
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"

# with the token added, the request is allowed
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get" -H x-rfma-token:test1

Next part:
