Project preparation and build process
A typical CI/CD process - DevOps
The GitOps continuous delivery process
- GitOps: a continuous delivery approach for cluster management and application distribution
- GitOps differs from a typical CI/CD pipeline; the biggest difference is that it uses Git as the source of truth, holding both declarative infrastructure and the applications
- Git sits at the center of the delivery pipeline: configuration files such as Kubernetes YAML manifests are all kept and managed in Git
- Developers deploy and operate applications purely through pull requests, without needing separate CI/CD tools
- Benefits: higher productivity, a better developer experience, consistency and standardization, and security
Push vs. pull pipelines: in a push pipeline the CI system pushes changes out to the cluster, whereas in a pull (GitOps) pipeline an operator inside the cluster pulls the desired state from Git and applies it.
Building and releasing an application with Flux
Flux, as officially defined:
- The GitOps operator for Kubernetes
- An automated deployment tool (based on GitOps)
- Features:
- Automatic synchronization, automatic deployment
- Declarative
- Based on code (pull requests), not containers
Preparation
First, we need a Kubernetes cluster:
as well as Istio installed in the cluster (for example, as sketched below):
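A minimal sketch of one way to install Istio with istioctl; the demo profile is an assumption here, so adjust the profile and version to your environment:

curl -L https://istio.io/downloadIstio | sh -    # download the latest Istio release
cd istio-*/ && cp bin/istioctl /usr/local/bin/   # put istioctl on the PATH
istioctl install --set profile=demo -y           # install the Istio control plane
kubectl get pods -n istio-system                 # verify the control plane pods are running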
As shown in the figure below, we will deploy a mesh made up of two services, plus a gateway and an external service: minimal, yet complete.
- In the call chain, sleep plays the role of the client and httpbin plays the role of the server
Prepare a Git repository:
Installing Flux
Official documentation:
- https://docs.fluxcd.io/en/latest/tutorials/get-started/
- https://docs.fluxcd.io/en/latest/guides/use-private-git-host/
First, install the fluxctl command-line tool: download the executable from the GitHub releases page (a download example is sketched below), move it to /usr/bin, and give it execute permission:
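For reference, fetching the binary might look like the following; the release tag and asset name are assumptions based on the version used below, so check the project's releases page for the exact URL:

cd /usr/local/src
curl -LO https://github.com/fluxcd/flux/releases/download/1.21.0/fluxctl_linux_amd64   # download the fluxctl binary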
[root@m1 /usr/local/src]# mv fluxctl_linux_amd64 /usr/bin/fluxctl
[root@m1 ~]# chmod a+x /usr/bin/fluxctl
[root@m1 ~]# fluxctl version
1.21.0
[root@m1 ~]#
Create a namespace for Flux, then deploy the Flux operator to the Kubernetes cluster:
[root@m1 ~]# kubectl create ns flux
namespace/flux created
[root@m1 ~]# git clone https://github.com/fluxcd/flux.git
[root@m1 ~]# cd flux/
Before deploying Flux, a few Git-related settings need to be changed to your own repository's user name, email, URL, and so on:
[root@m1 ~/flux]# vim deploy/flux-deployment.yaml # edit the following settings
...
# Replace the following URL to change the Git repository used by Flux.
# HTTP basic auth credentials can be supplied using environment variables:
# https://$(GIT_AUTHUSER):$(GIT_AUTHKEY)@github.com/user/repository.git
- [email protected]:fluxcd/flux-get-started
- --git-branch=master
# Include this if you want to restrict the manifests considered by flux
# to those under the following relative paths in the git repository
# - --git-path=subdir1,subdir2
- --git-label=flux-sync
- --git-user=Flux automation
- [email protected]
Once the changes are made, deploy it:
[root@m1 ~/flux]# kubectl apply -f deploy
[root@m1 ~/flux]# kubectl get all -n flux
NAME READY STATUS RESTARTS AGE
pod/flux-65479fb87-k5zxb 1/1 Running 0 7m20s
pod/memcached-c86cd995d-5gl5p 1/1 Running 0 44m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/memcached ClusterIP 10.106.229.44 <none> 11211/TCP 44m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/flux 1/1 1 1 44m
deployment.apps/memcached 1/1 1 1 44m
NAME DESIRED CURRENT READY AGE
replicaset.apps/flux-65479fb87 1 1 1 7m20s
replicaset.apps/memcached-c86cd995d 1 1 1 44m
[root@m1 ~]#
Alternatively, Flux can also be deployed from the command line:
fluxctl install \
--git-user=xxx \
--git-email=xxx@xxx \
[email protected]:xxx/smdemo \
--namespace=flux | kubectl apply -f -
Since we are using a private repository, one extra step is required: the Git host's key must be added to the ~/.ssh/known_hosts
file inside the Flux daemon container. The steps are as follows:
[root@m1 ~]# kubectl exec -n flux flux-65479fb87-k5zxb -ti -- \
env GITHOST="gitee.com" GITREPO="[email protected]:demo_focus/service-mesh-demo.git" PS1="container$ " /bin/sh
container$ ssh-keyscan $GITHOST >> ~/.ssh/known_hosts # add the host key
container$ git clone $GITREPO # verify the repository can be cloned
Cloning into 'service-mesh-demo'...
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 10 (delta 2), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (10/10), done.
Resolving deltas: 100% (2/2), done.
container$
After Flux is deployed, we need to add the deploy key it generated to the Git repository (with read/write permission). The command to retrieve the deploy key is:
[root@m1 ~]# fluxctl identity --k8s-fwd-ns flux
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDsyfN+x4jen+Ikpff8LszXLFTwXSQviFxCrIx7uMy7LJM5uUEsDdFs/DZL1g9h/YnkfLJlFrxOCJ+tuqPrXuj3ceEFfal4T3YWiDwf1RsGJvJd6ED5APjsxyu5gkj9LvkOB8OlYwPlS8Pygv997n93gtH7rFbocK5EQpbhhBlue3Or2ufI/KBxDCx6xLaH9U/16EEi+BDVSsCetGIQI+TSRqqpN30+Y8paS6iCYajKTubKv7x44WaVFgSDT9Y/OycUq1LupJoVoD8/5Y2leUMaF9dhMbQgoc8zjh8q2HF2n97mAvgYWJosjeIcAKS82C0zPlPupPevNedAhhEb82svPWh7BI4N4XziA06ypAEmfEz3JuUTTeABpF2hEoV4UEagkSyS8T3xhfdjigVcKiBW5AqRsRyx+ffW4WREHjARSC8CKl0Oj00a9FOGoNsDKkFuTbJePMcGdgvjs61UlgUUjdQFfHoZz2UVo2OEynnCpY7hj5SrEudkujRon4HEhJE= root@flux-7f5f7776df-l65lx
[root@m1 ~]#
Copy the key and add it in the Git repository's settings:
Deploying the application
Create a dedicated namespace for the application and label it with istio-injection=enabled
so that Istio can inject sidecar proxies:
[root@m1 ~]# kubectl create ns demo
namespace/demo created
[root@m1 ~]# kubectl label namespace demo istio-injection=enabled
namespace/demo labeled
[root@m1 ~]#
Clone the Git repository locally and create a config
directory in it:
[root@m1 ~]# git clone [email protected]:demo_focus/service-mesh-demo.git
[root@m1 ~]# cd service-mesh-demo/
[root@m1 ~/service-mesh-demo]# mkdir config
Create the services' configuration files in that directory:
[root@m1 ~/service-mesh-demo]# vim config/httpbin.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: httpbin
namespace: demo
---
apiVersion: v1
kind: Service
metadata:
name: httpbin
namespace: demo
labels:
app: httpbin
spec:
ports:
- name: http
port: 8000
targetPort: 80
selector:
app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: httpbin
namespace: demo
spec:
replicas: 1
selector:
matchLabels:
app: httpbin
version: v1
template:
metadata:
labels:
app: httpbin
version: v1
spec:
serviceAccountName: httpbin
containers:
- image: docker.io/kennethreitz/httpbin
imagePullPolicy: IfNotPresent
name: httpbin
ports:
- containerPort: 80
[root@m1 ~/service-mesh-demo]# vim config/sleep.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: sleep
namespace: demo
---
apiVersion: v1
kind: Service
metadata:
name: sleep
namespace: demo
labels:
app: sleep
spec:
ports:
- port: 80
name: http
selector:
app: sleep
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: sleep
namespace: demo
spec:
replicas: 1
selector:
matchLabels:
app: sleep
template:
metadata:
labels:
app: sleep
spec:
serviceAccountName: sleep
containers:
- name: sleep
image: governmentpaas/curl-ssl
command: ["/bin/sleep", "3650d"]
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /etc/sleep/tls
name: secret-volume
volumes:
- name: secret-volume
secret:
secretName: sleep-secret
optional: true
Commit the configuration files and push them to the remote repository to update the git repo:
[root@m1 ~/service-mesh-demo]# git add .
[root@m1 ~/service-mesh-demo]# git commit -m "commit yaml"
[root@m1 ~/service-mesh-demo]# git push origin master
Run the following command to have Flux sync the repository changes and deploy them automatically:
[root@m1 ~]# fluxctl sync --k8s-fwd-ns flux
Synchronizing with ssh://[email protected]/demo_focus/service-mesh-demo
Revision of master to apply is 49bc37e
Waiting for 49bc37e to be applied ...
Done.
[root@m1 ~]#
- By default, Flux syncs automatically every 5 minutes, so this manual step is not required; the interval can be tuned, as shown in the snippet below
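If the default interval does not suit you, it can be adjusted through the Flux daemon's arguments in deploy/flux-deployment.yaml. The flags below are the Flux v1 daemon flags as documented at the time; verify them against the documentation for your Flux version:

# container args of the flux Deployment
- --git-poll-interval=1m   # how often Flux polls the Git repository (default 5m)
- --sync-interval=1m       # how often Flux applies the desired state to the cluster (default 5m)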
Looking at the resources in the demo namespace now, we can see that Flux has deployed all the services for us:
[root@m1 ~]# kubectl get all -n demo
NAME READY STATUS RESTARTS AGE
pod/httpbin-74fb669cc6-v9lc5 2/2 Running 0 36s
pod/sleep-854565cb79-mcmnb 2/2 Running 0 40s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/httpbin ClusterIP 10.105.17.57 <none> 8000/TCP 36s
service/sleep ClusterIP 10.103.14.114 <none> 80/TCP 40s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/httpbin 1/1 1 1 36s
deployment.apps/sleep 1/1 1 1 40s
NAME DESIRED CURRENT READY AGE
replicaset.apps/httpbin-74fb669cc6 1 1 1 36s
replicaset.apps/sleep-854565cb79 1 1 1 40s
[root@m1 ~]#
Test that the services can reach each other:
[root@m1 ~]# kubectl exec -it -n demo sleep-854565cb79-mcmnb -c sleep -- curl http://httpbin.demo:8000/ip
{
"origin": "127.0.0.1"
}
[root@m1 ~]#
Implementing automated canary releases
The canary release process
Automated canary releases - Flagger
A canary release is a rolling upgrade in which traffic is shifted over little by little. Doing this by hand is inefficient and error-prone, so we use an automated canary release tool such as Flagger:
- Flagger: an automated canary release tool open-sourced by Weaveworks
- Supports multiple service mesh products: Istio, Linkerd, AWS App Mesh
- Monitors metrics to track the state of the canary release
- Notifications (Slack, Microsoft Teams)
Flagger workflow:
Installing Flagger
Official documentation:
Add the Flagger Helm repository:
[root@m1 ~]# helm repo add flagger https://flagger.app
"flagger" has been added to your repositories
[root@m1 ~]#
Create Flagger's CRDs:
[root@m1 ~]# kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/main/artifacts/flagger/crd.yaml
[root@m1 ~]# kubectl get crd |grep flagger
alertproviders.flagger.app 2020-12-23T14:40:00Z
canaries.flagger.app 2020-12-23T14:40:00Z
metrictemplates.flagger.app 2020-12-23T14:40:00Z
[root@m1 ~]#
Deploy Flagger into the istio-system namespace with Helm:
[root@m1 ~]# helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://prometheus.istio-system:9090
Optionally, add a Slack webhook to Flagger so that it can send notifications to a Slack channel:
[root@m1 ~]# helm upgrade -i flagger flagger/flagger \
--namespace=istio-system \
--set crd.create=false \
--set slack.url=https://hooks.slack.com/services/xxxxxx \
--set slack.channel=general \
--set slack.user=flagger
- Slack webhooks documentation: https://api.slack.com/messaging/webhooks
Besides Slack, we can also deploy Flagger's Grafana, which ships with a canary dashboard that makes it easy to follow the progress of a release:
[root@m1 ~]# helm upgrade -i flagger-grafana flagger/grafana \
--namespace=istio-system \
--set url=http://prometheus.istio-system:9090 \
--set user=admin \
--set password=admin
Once the steps above are done, confirm that Flagger is deployed:
[root@m1 ~]# kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
flagger-b68b578b-5f8bh 1/1 Running 0 7m50s
flagger-grafana-77b8c8df65-7vv89 1/1 Running 0 71s
...
Create an ingress gateway for the mesh:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: public-gateway
namespace: istio-system
spec:
selector:
istio: ingressgateway
servers:
- port:
number: 80
name: http
protocol: HTTP
hosts:
- "*"
EOF
We can also deploy a load-testing tool; this is optional as well:
[root@m1 ~]# kubectl create ns test
namespace/test created
[root@m1 ~]# kubectl apply -k https://github.com/fluxcd/flagger/tree/main/kustomize/tester
[root@m1 ~]# kubectl get pods -n test
NAME READY STATUS RESTARTS AGE
flagger-loadtester-64695f854f-5hsmg 1/1 Running 0 114s
[root@m1 ~]#
If the approach above is slow, you can also clone the repository and then deploy the tester from it:
[root@m1 ~]# cd /usr/local/src
[root@m1 /usr/local/src]# git clone https://github.com/fluxcd/flagger.git
[root@m1 /usr/local/src]# kubectl apply -k flagger/kustomize/tester/
Canary release configuration
Configure an HPA for the httpbin service so that it can scale dynamically. This is also optional, but configuring an HPA is generally recommended:
[root@m1 ~]# kubectl apply -n demo -f - <<EOF
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: httpbin
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: httpbin
minReplicas: 2
maxReplicas: 4
metrics:
- type: Resource
resource:
name: cpu
# scale up if usage is above
# 99% of the requested CPU (100m)
targetAverageUtilization: 99
EOF
Create the metric used to validate the canary; Flagger will gradually shift traffic based on this metric:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
name: latency
namespace: istio-system
spec:
provider:
type: prometheus
address: http://prometheus.istio-system:9090
query: |
histogram_quantile(
0.99,
sum(
rate(
istio_request_duration_milliseconds_bucket{
reporter="destination",
destination_workload_namespace="{{ namespace }}",
destination_workload=~"{{ target }}"
}[{{ interval }}]
)
) by (le)
)
EOF
Create the Flagger Canary resource. The full configuration is shown below; all of the canary-release settings are defined here:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
name: httpbin
namespace: demo
spec:
# deployment reference
targetRef:
apiVersion: apps/v1
kind: Deployment
name: httpbin
# the maximum time in seconds for the canary deployment
# to make progress before it is rolled back (default 600s)
progressDeadlineSeconds: 60
# HPA reference (optional)
autoscalerRef:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
name: httpbin
service:
# service port number
port: 8000
# container port number or name (optional)
targetPort: 80
# Istio gateways (optional)
gateways:
- public-gateway.istio-system.svc.cluster.local
analysis:
# schedule interval (default 60s)
interval: 30s
# max number of failed metric checks before rollback
threshold: 5
# max traffic percentage routed to canary
# percentage (0-100)
maxWeight: 100
# canary increment step
# percentage (0-100)
stepWeight: 20
metrics:
- name: request-success-rate
# minimum req success rate (non 5xx responses)
# percentage (0-100)
thresholdRange:
min: 99
interval: 1m
- name: latency
templateRef:
name: latency
namespace: istio-system
# maximum req duration P99
# milliseconds
thresholdRange:
max: 500
interval: 30s
# testing (optional)
webhooks:
- name: load-test
url: http://flagger-loadtester.test/
timeout: 5s
metadata:
cmd: "hey -z 1m -q 10 -c 2 http://httpbin-canary.demo:8000/headers"
After the Canary is created, you will find that it has automatically created a set of httpbin resources named with the primary suffix in the cluster, as well as a VirtualService whose routing rules point to the httpbin-primary and httpbin-canary services:
[root@m1 ~]# kubectl get pods -n demo
NAME READY STATUS RESTARTS AGE
httpbin-74fb669cc6-6ztkg 2/2 Running 0 50s
httpbin-74fb669cc6-vfs4h 2/2 Running 0 38s
httpbin-primary-9cb49747-94s4z 2/2 Running 0 3m3s
httpbin-primary-9cb49747-xhpcg 2/2 Running 0 3m3s
sleep-854565cb79-mcmnb 2/2 Running 0 94m
[root@m1 ~]# kubectl get svc -n demo
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
httpbin ClusterIP 10.105.17.57 <none> 8000/TCP 86m
httpbin-canary ClusterIP 10.99.206.196 <none> 8000/TCP 3m14s
httpbin-primary ClusterIP 10.98.196.235 <none> 8000/TCP 3m14s
sleep ClusterIP 10.103.14.114 <none> 80/TCP 95m
[root@m1 ~]# kubectl get vs -n demo
NAME GATEWAYS HOSTS AGE
httpbin ["public-gateway.istio-system.svc.cluster.local"] ["httpbin"] 3m29s
[root@m1 ~]#
Then trigger a canary release with the following command:
[root@m1 ~]# kubectl -n demo set image deployment/httpbin httpbin=httpbin-v2
deployment.apps/httpbin image updated
[root@m1 ~]#
- Tip: changes to the Deployment, ConfigMaps, or Secrets will all trigger a canary
Looking at the Canary's events, we can see that the new revision has been detected:
[root@m1 ~]# kubectl describe canary httpbin -n demo
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
...
Normal Synced 2m57s flagger New revision detected! Scaling up httpbin.demo
Warning Synced 27s (x5 over 2m27s) flagger canary deployment httpbin.demo not ready: waiting for rollout to finish: 1 out of 2 new replicas have been updated
Inspecting the httpbin VirtualService at this point shows that 20% of the traffic has been shifted to the canary version:
[root@m1 ~]# kubectl describe vs httpbin -n demo
...
Spec:
Gateways:
public-gateway.istio-system.svc.cluster.local
Hosts:
httpbin
Http:
Route:
Destination:
Host: httpbin-primary
Weight: 80
Destination:
Host: httpbin-canary
Weight: 20
Events: <none>
Then exec into the sleep service and use a small loop to keep calling the httpbin service:
[root@m1 ~]# kubectl exec -it -n demo sleep-854565cb79-mcmnb -c sleep -- sh
/ # while [ 1 ]; do curl http://httpbin.demo:8000/headers;sleep 2s; done
Checking the httpbin VirtualService again, 60% of the traffic has now been shifted to the canary version:
[root@m1 ~]# kubectl describe vs httpbin -n demo
...
Spec:
Gateways:
public-gateway.istio-system.svc.cluster.local
Hosts:
httpbin
Http:
Route:
Destination:
Host: httpbin-primary
Weight: 40
Destination:
Host: httpbin-canary
Weight: 60
Events: <none>
We can open Flagger's Grafana:
[root@m1 ~]# kubectl -n istio-system port-forward svc/flagger-grafana 3000:80 --address 192.168.243.138
Forwarding from 192.168.243.138:3000 -> 3000
It has the following built-in dashboards:
The Istio Canary Dashboard shows the progress of the release:
Eventually 100% of the traffic is shifted to the canary version, which means the release is complete:
[root@m1 ~]# kubectl describe vs httpbin -n demo
...
Spec:
Gateways:
public-gateway.istio-system.svc.cluster.local
Hosts:
httpbin
Http:
Route:
Destination:
Host: httpbin-primary
Weight: 0
Destination:
Host: httpbin-canary
Weight: 100
Events: <none>
The traffic-shifting process can also be seen in the httpbin Canary's event log:
[root@m1 ~]# kubectl describe canary httpbin -n demo
...
Normal Synced 3m44s (x2 over 18m) flagger New revision detected! Restarting analysis for httpbin.demo
Normal Synced 3m14s (x2 over 18m) flagger Starting canary analysis for httpbin.demo
Normal Synced 3m14s (x2 over 18m) flagger Advance httpbin.demo canary weight 20
Warning Synced 2m44s (x2 over 17m) flagger Halt advancement no values found for istio metric request-success-rate probably httpbin.demo is not receiving traffic: running query failed: no values found
Normal Synced 2m14s flagger Advance httpbin.demo canary weight 40
Normal Synced 104s flagger Advance httpbin.demo canary weight 60
Normal Synced 74s flagger Advance httpbin.demo canary weight 80
Normal Synced 44s flagger Advance httpbin.demo canary weight 100
Once the release is finished, the httpbin Canary's status changes to Succeeded:
[root@m1 ~]# kubectl get canary -n demo
NAME STATUS WEIGHT LASTTRANSITIONTIME
httpbin Succeeded 0 2020-12-23T16:03:04Z
[root@m1 ~]#
Improving the system's resilience
Resilient design is popular in many fields. In landscape architecture, for example, resilience means a degree of disaster recovery: after a disaster the landscape can quickly restore its structure and function. In product design, resilience usually means leaving some slack in a product's form and features so that it is easy to modify later.
In distributed systems, resilience generally means giving the system a degree of fault tolerance and the ability to cope, so that it can recover quickly when a failure occurs. In this section we will add some resilience to the demo application deployed earlier.
Measuring system availability
Let's first look at a concept: the Service Level Agreement (SLA). An SLA is a mutually agreed contract between a service provider and its customers regarding the quality, level, and performance of the service. For example, a provider typically guarantees its customers a certain level of availability, the "number of nines" we usually talk about.
The formula for system availability:
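The original shows this as a figure; the standard formulation is:

Availability = uptime / (uptime + downtime), commonly expressed as MTBF / (MTBF + MTTR)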
Common availability levels are:
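The original presents these as a figure; the downtime budgets below follow directly from the percentages (per non-leap year):
- 99% (two nines): about 3.65 days of downtime per year
- 99.9% (three nines): about 8.76 hours of downtime per year
- 99.99% (four nines): about 52.6 minutes of downtime per year
- 99.999% (five nines): about 5.26 minutes of downtime per year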
Resilience design
- One way to cope with failures is to give the system fault tolerance and the ability to adapt
- Prevent faults from turning into failures
- It mainly covers:
- Fault tolerance: retries, idempotency
- Scalability: automatic horizontal scaling (autoscaling)
- Overload protection: timeouts, circuit breaking, degradation, rate limiting
- Resilience testing: fault injection
Resilience capabilities provided by Istio:
- Timeouts
- Retries
- Circuit breaking
- Fault injection
Adding resilience to the demo application
First, create a VirtualService for the demo application:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: httpbin
namespace: demo
spec:
hosts:
- "*"
gateways:
- httpbin-gateway
http:
- route:
- destination:
host: httpbin
port:
number: 8000
EOF
Add the first resilience capability, a timeout, configured as follows:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: httpbin
namespace: demo
spec:
hosts:
- "*"
gateways:
- httpbin-gateway
http:
- route:
- destination:
host: httpbin
port:
number: 8000
timeout: 1s # configure the timeout
EOF
Timeout rules:
- When timeout and retries.perTryTimeout are both set
- the effective timeout = min(timeout, retries.perTryTimeout * retries.attempts)
- For example, with the retry policy below (timeout 8s, perTryTimeout 1s, attempts 3), the effective timeout is min(8s, 1s * 3) = 3s
On top of the timeout, we can also configure a retry policy:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: httpbin
namespace: demo
spec:
hosts:
- "*"
gateways:
- httpbin-gateway
http:
- route:
- destination:
host: httpbin
port:
number: 8000
retries: # configure the retry policy
attempts: 3
perTryTimeout: 1s
timeout: 8s
Retry configuration options (these map to Envoy's retry policies; see the sketch below):
- x-envoy-retry-on: 5xx, gateway-error, reset, connect-failure, …
- x-envoy-retry-grpc-on: cancelled, deadline-exceeded, internal, unavailable, …
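In an Istio VirtualService these conditions are set through the retries.retryOn field. A sketch, with the exact conditions to retry on chosen to suit your service:

retries:
  attempts: 3
  perTryTimeout: 1s
  retryOn: 5xx,gateway-error,connect-failure  # retry only on these failure conditions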
Configure circuit breaking:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: httpbin
namespace: demo
spec:
host: httpbin
trafficPolicy:
connectionPool:
tcp:
maxConnections: 1
http:
http1MaxPendingRequests: 1
maxRequestsPerConnection: 1
outlierDetection:
consecutiveErrors: 1
interval: 1s
baseEjectionTime: 3m
maxEjectionPercent: 100
What this circuit-breaker configuration means:
- TCP and HTTP connection pool size of 1
- Only 1 error is tolerated
- Errors are counted every 1 second
- Up to 100% of the pods can be ejected from the load-balancing pool
- A failing pod is ejected for 3 minutes before it may rejoin
Configuring security policies
Istio's security solution
Istio security architecture
Hands-on
Create an authorization policy for a specific service (httpbin). Note that no rules are configured, which means the service is denied:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: httpbin
namespace: demo
spec:
selector:
matchLabels:
app: httpbin
EOF
The configuration above makes the service completely inaccessible. Let's test it:
# the request is denied
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"
RBAC: access denied # the response
# other versions can still be accessed normally
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin-v2.demo:8000/get"
We can restrict the request source with the following configuration, for example requiring that requests come from the demo namespace:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: httpbin
namespace: demo
spec:
action: ALLOW
rules:
- from:
- source:
principals: ["cluster.local/ns/demo/sa/sleep"]
- source:
namespaces: ["demo"]
EOF
Test:
# the request succeeds
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"
# the request is denied
$ kubectl exec -it -n ${other_namespace} ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"
# after changing the service account to ${other_namespace}'s, the request succeeds
Besides restricting the request source, we can also restrict which endpoints are allowed to be accessed:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: httpbin
namespace: demo
spec:
action: ALLOW
rules:
- from:
- source:
principals: ["cluster.local/ns/demo/sa/sleep"]
- source:
namespaces: ["demo"]
to:
- operation:
methods: ["GET"]
paths: ["/get"]
EOF
Test:
# the request succeeds
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"
# the request is denied
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/ip"
Other specific conditions can also be configured, such as requiring particular request headers. This is typically used when clients must carry a specific header to be allowed to access an endpoint:
[root@m1 ~]# kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
name: httpbin
namespace: demo
spec:
action: ALLOW
rules:
- from:
- source:
principals: ["cluster.local/ns/demo/sa/sleep"]
- source:
namespaces: ["demo"]
to:
- operation:
methods: ["GET"]
paths: ["/get"]
when:
- key: request.headers[x-rfma-token]
values: ["test*"]
EOF
Test:
# the request is denied
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get"
# with the token added, the request succeeds
$ kubectl exec -it -n demo ${sleep_pod_name} -c sleep -- curl "http://httpbin.demo:8000/get" -H x-rfma-token:test1