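I. Overview
Swarm is the cluster management and orchestration tool built into (native to) Docker Engine; it is built with SwarmKit. A Swarm cluster consists of manager nodes and worker nodes.
The environment used in this article has three nodes: one acts as the Swarm manager and the other two act as workers. Their hostnames and IP addresses are: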
- wuweixiang: 139.9.44.81 (Swarm manager)
- VM_0_14_centos: 188.131.152.100 (Swarm worker)
- VM_38_55_centos: 140.143.206.99 (Swarm worker)
II. Initializing the Swarm cluster
# Initialize a cluster (list the available options first)
[root@wuweixiang ~]# docker swarm init --help
Usage:  docker swarm init [OPTIONS]
Initialize a swarm
Options:
  --advertise-addr string                  Advertised address (format: <ip|interface>[:port])
  --autolock                               Enable manager autolocking (requiring an unlock key to start a stopped manager)
  --availability string                    Availability of the node ("active"|"pause"|"drain") (default "active")
  --cert-expiry duration                   Validity period for node certificates (ns|us|ms|s|m|h) (default 2160h0m0s)
  --data-path-addr string                  Address or interface to use for data path traffic (format: <ip|interface>)
  --default-addr-pool ipNetSlice           default address pool in CIDR format (default [])
  --default-addr-pool-mask-length uint32   default address pool subnet mask length (default 24)
  --dispatcher-heartbeat duration          Dispatcher heartbeat period (ns|us|ms|s|m|h) (default 5s)
  --external-ca external-ca                Specifications of one or more certificate signing endpoints
  --force-new-cluster                      Force create a new cluster from current state
  --listen-addr node-addr                  Listen address (format: <ip|interface>[:port]) (default 0.0.0.0:2377)
  --max-snapshots uint                     Number of additional Raft snapshots to retain
  --snapshot-interval uint                 Number of log entries between Raft snapshots (default 10000)
  --task-history-limit int                 Task history retention limit (default 5)
# Manager -> initialize a cluster and create the Swarm manager node
[root@wuweixiang ~]# docker swarm init --advertise-addr 139.9.44.81
Swarm initialized: current node (xvmqc3op6e9lkao153u410m8x) is now a manager.
To add a worker to this swarm, run the following command:
    docker swarm join --token SWMTKN-1-255nm4msqjuij5q0phuhy25ptz4m1qw7rfdbhwv4rbjl0ftg4j-0moyoy6mn3i4ewpaqh5wqrdq4 139.9.44.81:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
# Manager -> show the token that workers need in order to join
[root@wuweixiang ~]# docker swarm join-token worker
To add a worker to this swarm, run the following command:
    docker swarm join --token SWMTKN-1-255nm4msqjuij5q0phuhy25ptz4m1qw7rfdbhwv4rbjl0ftg4j-0moyoy6mn3i4ewpaqh5wqrdq4 139.9.44.81:2377
# Use docker info to view information about the cluster
[root@wuweixiang ~]# docker info
……
Swarm: active
 NodeID: xvmqc3op6e9lkao153u410m8x
 Is Manager: true
 ClusterID: oucnrveg187xttygnm6fak4di
 Managers: 1
 Nodes: 3
 Default Address Pool: 10.0.0.0/8
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
……
# Manager -> list the cluster nodes with docker node ls
[root@wuweixiang ~]# docker node ls
ID                            HOSTNAME          STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
i3ma0jg3a0dzqezh1tjbwyxxk     VM_0_14_centos    Ready    Active                          18.09.0
nglbqs945y4p7t57yvnhiryze     VM_38_55_centos   Ready    Active                          18.09.0
xvmqc3op6e9lkao153u410m8x *   wuweixiang        Ready    Active         Leader           18.09.0
The asterisk (*) next to a node ID marks the node you are currently connected to.
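For scripting, the same listing can be requested in a form that is easier to parse than the table. The sketch below assumes the `--format` option of `docker node ls` with the `.Hostname` and `.ManagerStatus` template fields; the helper name `find_leader` is made up for illustration.

```shell
# Read lines of "<hostname> <manager-status>" and print the hostname of the Leader.
# Workers have an empty manager status, so their lines carry only one field.
find_leader() {
    awk '$2 == "Leader" { print $1 }'
}

# Assumed usage on a manager node (needs a running swarm):
#   docker node ls --format '{{.Hostname}} {{.ManagerStatus}}' | find_leader
```

With the three nodes listed above, the helper would print wuweixiang.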
III. Joining worker nodes to the Swarm cluster
[root@VM_0_14_centos ~]# docker swarm join --token SWMTKN-1-255nm4msqjuij5q0phuhy25ptz4m1qw7rfdbhwv4rbjl0ftg4j-0moyoy6mn3i4ewpaqh5wqrdq4 139.9.44.81:2377
This node joined a swarm as a worker.
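When the join step is scripted, it can help to check the token's shape before running the command. This is a minimal sketch: the helper names `looks_like_join_token` and `join_command` are hypothetical, and the check only tests for the `SWMTKN-1-` prefix seen in the tokens above; the manager still decides whether a token is valid. On the manager, `docker swarm join-token -q worker` prints only the token.

```shell
# Return 0 when the argument has the shape SWMTKN-1-<...>-<...>.
looks_like_join_token() {
    case "$1" in
        SWMTKN-1-*-*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Print the join command for a worker; 2377 is the default listen port
# shown in the docker swarm init help output.
join_command() {
    token=$1
    manager_ip=$2
    looks_like_join_token "$token" || { echo "not a swarm join token" >&2; return 1; }
    echo "docker swarm join --token $token $manager_ip:2377"
}
```

Treat the token like a password: anyone who has it and can reach the manager can add a node to the cluster.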
IV. Managing the Swarm cluster
1. Removing a node from the Swarm cluster
[root@VM_0_14_centos ~]# docker swarm leave
Node left the swarm.
[root@wuweixiang ~]# docker node ls
ID                            HOSTNAME          STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
i3ma0jg3a0dzqezh1tjbwyxxk     VM_0_14_centos    Down     Active                          18.09.0
nglbqs945y4p7t57yvnhiryze     VM_38_55_centos   Ready    Active                          18.09.0
xvmqc3op6e9lkao153u410m8x *   wuweixiang        Ready    Active         Leader           18.09.0
[root@wuweixiang ~]# docker node rm --force i3
i3
[root@wuweixiang ~]# docker node ls
ID                            HOSTNAME          STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
nglbqs945y4p7t57yvnhiryze     VM_38_55_centos   Ready    Active                          18.09.0
xvmqc3op6e9lkao153u410m8x *   wuweixiang        Ready    Active         Leader           18.09.0
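After a node has left, its entry stays in the list with status Down until it is removed on a manager. The sketch below picks out such entries; it assumes `docker node ls --format` with the `.ID` and `.Status` fields, and the helper name `down_nodes` is illustrative only.

```shell
# Read lines of "<node-id> <status>" and print the IDs of nodes reported as Down.
down_nodes() {
    awk '$2 == "Down" { print $1 }'
}

# Assumed usage on a manager node (xargs -r is the GNU option that skips empty input):
#   docker node ls --format '{{.ID}} {{.Status}}' | down_nodes | xargs -r docker node rm
```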
2. Updating the Swarm cluster configuration
[root@wuweixiang ~]# docker swarm update
Usage:  docker swarm update [OPTIONS]
Update the swarm
Options:
  --autolock                        Change manager autolocking setting (true|false)
  --cert-expiry duration            Validity period for node certificates (ns|us|ms|s|m|h) (default 2160h0m0s)
  --dispatcher-heartbeat duration   Dispatcher heartbeat period (ns|us|ms|s|m|h) (default 5s)
  --external-ca external-ca         Specifications of one or more certificate signing endpoints
  --max-snapshots uint              Number of additional Raft snapshots to retain
  --snapshot-interval uint          Number of log entries between Raft snapshots (default 10000)
  --task-history-limit int          Task history retention limit (default 5)
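The duration options take values in units up to hours, so a validity period expressed in days has to be converted first. The helper below is a small sketch (the name `days_to_hours_duration` is invented here); the default of 2160h0m0s shown above corresponds to 90 days.

```shell
# Convert a whole number of days to the hour-based duration format, e.g. 90 -> 2160h.
days_to_hours_duration() {
    echo "$(( $1 * 24 ))h"
}

# Assumed usage on a manager node: 30-day certificates, keep 3 old tasks per slot.
#   docker swarm update --cert-expiry "$(days_to_hours_duration 30)" --task-history-limit 3
```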
V. Deploying services on the Swarm cluster
1. Deploying a service in the Swarm
Run the following command on wuweixiang, the manager node, to deploy a service:
[root@wuweixiang ~]# docker service create --replicas 1 --name helloworld alpine ping docker.com
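Parameters:
- --replicas sets the number of instances the service runs.
- --name sets the name of the service.
- alpine ping docker.com creates the service from the alpine image and runs ping docker.com when each instance starts.
The image and command are specified the same way as with docker run.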
Use docker service ls to list the running services:
[root@wuweixiang ~]# docker service ls
ID             NAME         MODE         REPLICAS   IMAGE           PORTS
iswutf06uqkm   helloworld   replicated   1/1        alpine:latest
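The REPLICAS column reads running/desired, so a script can tell whether a service has reached its target by comparing the two numbers. A minimal sketch, assuming the `.Replicas` template field of `docker service ls --format`; the helper name `replicas_converged` is hypothetical.

```shell
# Return 0 when a REPLICAS value such as "5/5" shows running == desired.
replicas_converged() {
    running=${1%%/*}        # text before the slash
    desired=${1##*/}        # text after the slash
    desired=${desired%% *}  # drop any trailing note after a space
    [ "$running" = "$desired" ]
}

# Assumed usage on a manager node (the name filter matches by prefix):
#   replicas_converged "$(docker service ls --filter name=helloworld --format '{{.Replicas}}')"
```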
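2. Inspecting a service in the Swarm
After deploying the service, log in to the manager node and run the command below to display the service's details. The --pretty flag prints the output in a human-readable format; without --pretty the command returns more detailed output as JSON: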
[root@wuweixiang ~]# docker service inspect --pretty helloworld
ID: yzq3e2aqp81d2a63nxizaadh4
Name: helloworld
Service Mode: Replicated
 Replicas: 1
Placement:
UpdateConfig:
 Parallelism: 1
 On failure: pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Update order: stop-first
RollbackConfig:
 Parallelism: 1
 On failure: pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Rollback order: stop-first
ContainerSpec:
 Image: alpine:latest@sha256:621c2f39f8133acb8e64023a94dbdf0d5ca81896102b9e57c0dc184cadaf5528
 Args: ping docker.com
 Init: false
Resources:
Endpoint Mode: vip
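When only one value is needed, a Go template passed to `--format` can pull it out of the JSON instead of reading the pretty output. The field paths below follow the JSON that `docker service inspect` returns for a replicated service; the wrapper names `service_replicas` and `service_image` are made up for this sketch.

```shell
# Print the desired replica count of a replicated service.
service_replicas() {
    docker service inspect --format '{{.Spec.Mode.Replicated.Replicas}}' "$1"
}

# Print the image reference stored in the service specification.
service_image() {
    docker service inspect --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}' "$1"
}

# Assumed usage on a manager node:
#   service_replicas helloworld
#   service_image helloworld
```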
Use docker service ps <SERVICE-ID> to see which nodes are running the service:
[root@wuweixiang ~]# docker service ps yz
ID             NAME               IMAGE           NODE              DESIRED STATE   CURRENT STATE                 ERROR   PORTS
qu7wc6cv3rlj   helloworld.1       alpine:latest   wuweixiang        Running         Running 56 seconds ago
3. Scaling a service in the Swarm
Log in to the manager node and use docker service scale <SERVICE-ID>=<NUMBER-OF-TASKS> to scale the service to the given number of instances:
[root@wuweixiang ~]# docker service scale helloworld=5
helloworld scaled to 5
overall progress: 5 out of 5 tasks
1/5: running   [==================================================>]
2/5: running   [==================================================>]
3/5: running   [==================================================>]
4/5: running   [==================================================>]
5/5: running   [==================================================>]
verify: Service converged
[root@wuweixiang ~]# docker service ls
ID             NAME         MODE         REPLICAS   IMAGE           PORTS
yzq3e2aqp81d   helloworld   replicated   5/5        alpine:latest
[root@wuweixiang ~]# docker service ps yz
ID             NAME               IMAGE           NODE              DESIRED STATE   CURRENT STATE                 ERROR   PORTS
qu7wc6cv3rlj   helloworld.1       alpine:latest   wuweixiang        Running         Running about a minute ago
lry00fiqgmw3   helloworld.2       alpine:latest   VM_0_14_centos    Running         Running 19 seconds ago
w4wvcghx57cn   helloworld.3       alpine:latest   VM_0_14_centos    Running         Running 19 seconds ago
gi84iozh8vxh   helloworld.4       alpine:latest   wuweixiang        Running         Running 19 seconds ago
kcf09wi7vzj2   helloworld.5       alpine:latest   VM_38_55_centos   Running         Running 19 seconds ago
As the output shows, the Swarm created 4 new tasks to bring the service up to 5 instances. The tasks are spread across the Swarm nodes.
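To see how the tasks are spread, the node column can be counted per node. The sketch below assumes `docker service ps` with `--filter desired-state=running` and the `.Node` template field; `tasks_per_node` is an illustrative helper name.

```shell
# Read one node name per line and print "<node> <task-count>" for each node.
tasks_per_node() {
    LC_ALL=C sort | uniq -c | awk '{ print $2, $1 }'
}

# Assumed usage on a manager node:
#   docker service ps helloworld --filter desired-state=running --format '{{.Node}}' | tasks_per_node
```

For the five tasks listed above this gives 2 for wuweixiang, 2 for VM_0_14_centos, and 1 for VM_38_55_centos.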
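4. Removing a service from the Swarm cluster
Run docker service rm helloworld on the manager node to remove the service. Removing a service also removes the containers it created on each node, rather than just stopping them.
Swarm mode also offers rolling updates of services, putting a worker into maintenance (drain) mode, a routing mesh, and other features. Since Swarm was integrated into Docker Engine, the native Docker CLI can be used for all container cluster operations, which makes deploying a cluster quicker and more convenient.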
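5. Updating a service's version in the Swarm cluster
The previous steps scaled a service to several instances. Note that this example uses a different service from helloworld: it assumes a service named tomcat-service that runs a Docker image based on Tomcat Server 8.5.8, managed from a host named centos7-Master. Suppose we now need to replace Tomcat Server 8.5.8 with Tomcat Server 8.6.0 as the container image version.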
[root@centos7-Master ~]# docker service update --image tomcat:8.6.0 tomcat-service
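tomcat-service
The service update is carried out in the following steps:
- The update is issued on a manager node of the Swarm cluster.
- Stop the first task.
- Schedule the update for the stopped task.
- Start the container for the updated task.
- If the updated task returns "RUNNING", wait for the configured delay and then stop the next task.
- If a task returns "FAILED" during the update, pause the update.
To restart a paused update, use docker service update <SERVICE-ID>, for example:
[root@centos7-Master ~]# docker service update tomcat-service
To check the result of the update:
[root@centos7-Master ~]# docker service ps tomcat-service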
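The update steps can be pictured as a simple loop over the tasks. The function below is a toy simulation written for this article, not Docker's scheduler code: each argument stands for the state a task reports after being updated, and the loop stops at the first FAILED. In a real update the pacing is set with `docker service update` options such as `--update-delay`, `--update-parallelism`, and `--update-failure-action`.

```shell
# Simulate a rolling update, one task at a time.
# Arguments: the state each task reports after its update (RUNNING or FAILED).
simulate_rolling_update() {
    slot=0
    for outcome in "$@"; do
        slot=$((slot + 1))
        echo "task $slot: stop, update, start -> $outcome"
        if [ "$outcome" = "FAILED" ]; then
            echo "update paused at task $slot"
            return 1
        fi
        # A real update would wait here for the configured delay.
    done
    echo "update completed"
}

# Example: the second of three tasks fails, so the third is never touched.
#   simulate_rolling_update RUNNING FAILED RUNNING
```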
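6. Draining a node in the Swarm cluster
To stop a worker node in the Swarm cluster from running service tasks, use docker node update --availability drain <NODE-ID>. The tasks on that node are shut down and replacements are started on the remaining active nodes.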
[root@wuweixiang ~]# docker node ls
ID                            HOSTNAME          STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
v4zv3v207si2pvofuk9ma032h     VM_0_14_centos    Ready    Active                          18.09.0
nglbqs945y4p7t57yvnhiryze     VM_38_55_centos   Ready    Active                          18.09.0
xvmqc3op6e9lkao153u410m8x *   wuweixiang        Ready    Active         Leader           18.09.0
[root@wuweixiang ~]# docker node update --availability drain v4
v4
[root@wuweixiang ~]# docker node ls
ID                            HOSTNAME          STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
v4zv3v207si2pvofuk9ma032h     VM_0_14_centos    Ready    Drain                           18.09.0
nglbqs945y4p7t57yvnhiryze     VM_38_55_centos   Ready    Active                          18.09.0
xvmqc3op6e9lkao153u410m8x *   wuweixiang        Ready    Active         Leader           18.09.0
After draining the worker node, use docker node inspect --pretty <NODE-ID> to check the node's status.
[root@wuweixiang ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
i3ma0jg3a0dzqezh1tjbwyxxk VM_0_14_centos Ready Drain 18.09.0
xvmqc3op6e9lkao153u410m8x * wuweixiang Ready Active Leader 18.09.0
[root@wuweixiang ~]# docker node inspect --pretty v4
ID: v4zv3v207si2pvofuk9ma032h
Hostname: VM_0_14_centos
Joined at: 2018-12-11 11:37:15.973392656 +0000 utc
Status:
State: Ready
Availability: Drain
Address: 188.131.152.100
Platform:
Operating System: linux
Architecture: x86_64
Resources:
CPUs: 1
Memory: 992.7MiB
Plugins:
Log: awslogs, fluentd, gcplogs, gelf, journald, json-file, local, logentries, splunk, syslog
Network: bridge, host, macvlan, null, overlay
Volume: local
Engine Version: 18.09.0
TLS Info:
TrustRoot:
-----BEGIN CERTIFICATE-----
MIIBajCCARCgAwIBAgIUaoragJW4UwMO+DCs1zkxpt1xPdswCgYIKoZIzj0EAwIw
EzERMA8GA1UEAxMIc3dhcm0tY2EwHhcNMTgxMjExMDc0MTAwWhcNMzgxMjA2MDc0
MTAwWjATMREwDwYDVQQDEwhzd2FybS1jYTBZMBMGByqGSM49AgEGCCqGSM49AwEH
A0IABFLvlDlCVuPyAbqMCKIl4MAdVfvgYLvoAIbkzX0EPPdlB5jiVR2oI6xSmWHg
Yt5mivr+b0eRVg17RneCz/zJjgWjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMB
Af8EBTADAQH/MB0GA1UdDgQWBBSoVH4AOp4ATVDNzsnA/8aP/Qx2aDAKBggqhkjO
PQQDAgNIADBFAiARza3fA5h4sFguVfiFEE4JYputzRyZ3CdvfUoR2DNK3QIhAM6j
5WCUR5syguW3xhFRpuQqgztsekBAjoUakQD7mSu/
-----END CERTIFICATE-----
Issuer Subject: MBMxETAPBgNVBAMTCHN3YXJtLWNh
Issuer Public Key: MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEUu+UOUJW4/IBuowIoiXgwB1V++Bgu+gAhuTNfQQ892UHmOJVHagjrFKZYeBi3maK+v5vR5FWDXtGd4LP/MmOBQ==
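A script that only needs the availability value can ask for that single field. This sketch assumes the `.Spec.Availability` path in the JSON returned by `docker node inspect`, whose values are lowercase (active, pause, drain); the helper names are invented for illustration.

```shell
# Print the availability of a node: active, pause, or drain.
node_availability() {
    docker node inspect --format '{{.Spec.Availability}}' "$1"
}

# Return 0 when the node is currently set to drain.
is_drained() {
    [ "$(node_availability "$1")" = "drain" ]
}

# Assumed usage on a manager node:
#   is_drained v4 && echo "node is drained"
```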
Use docker service ps helloworld to see where the helloworld tasks are running now:
[root@wuweixiang ~]# docker service ps helloworld
ID             NAME               IMAGE           NODE              DESIRED STATE   CURRENT STATE                 ERROR   PORTS
qu7wc6cv3rlj   helloworld.1       alpine:latest   wuweixiang        Running         Running 4 minutes ago
kmbmtq98pqxm   helloworld.2       alpine:latest   VM_38_55_centos   Running         Running about a minute ago
lry00fiqgmw3    \_ helloworld.2   alpine:latest   VM_0_14_centos    Shutdown        Shutdown about a minute ago
49ip45gz71q3   helloworld.3       alpine:latest   VM_38_55_centos   Running         Running about a minute ago
w4wvcghx57cn    \_ helloworld.3   alpine:latest   VM_0_14_centos    Shutdown        Shutdown about a minute ago
gi84iozh8vxh   helloworld.4       alpine:latest   wuweixiang        Running         Running 3 minutes ago
kcf09wi7vzj2   helloworld.5       alpine:latest   VM_38_55_centos   Running         Running 3 minutes ago
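The rows marked \_ are the old tasks that were shut down on the drained node. To list them without reading the table by eye, the output can be filtered by node and desired state. A minimal sketch, assuming the `.Node`, `.DesiredState`, and `.Name` template fields of `docker service ps --format`; `shutdown_tasks_on` is a hypothetical helper name.

```shell
# Read lines of "<node> <desired-state> <task-name>" and print the names of
# tasks whose desired state is Shutdown on the given node.
shutdown_tasks_on() {
    awk -v node="$1" '$1 == node && $2 == "Shutdown" { print $NF }'
}

# Assumed usage on a manager node:
#   docker service ps helloworld --format '{{.Node}} {{.DesiredState}} {{.Name}}' \
#       | shutdown_tasks_on VM_0_14_centos
```

For the listing above, this would report helloworld.2 and helloworld.3.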
To put VM_0_14_centos back into service, use docker node update --availability active <NODE-ID> to re-enable the node.
[root@wuweixiang ~]# docker node update --availability active v4
v4
[root@wuweixiang ~]# docker node inspect --pretty v4
ID: v4zv3v207si2pvofuk9ma032h
Hostname: VM_0_14_centos
Joined at: 2018-12-11 11:37:15.973392656 +0000 utc
Status:
State: Ready
Availability: Active
Address: 188.131.152.100
Platform:
Operating System: linux
Architecture: x86_64
Resources:
CPUs: 1
Memory: 992.7MiB
Plugins:
Log: awslogs, fluentd, gcplogs, gelf, journald, json-file, local, logentries, splunk, syslog
Network: bridge, host, macvlan, null, overlay
Volume: local
Engine Version: 18.09.0
TLS Info:
TrustRoot:
-----BEGIN CERTIFICATE-----
MIIBajCCARCgAwIBAgIUaoragJW4UwMO+DCs1zkxpt1xPdswCgYIKoZIzj0EAwIw
EzERMA8GA1UEAxMIc3dhcm0tY2EwHhcNMTgxMjExMDc0MTAwWhcNMzgxMjA2MDc0
MTAwWjATMREwDwYDVQQDEwhzd2FybS1jYTBZMBMGByqGSM49AgEGCCqGSM49AwEH
A0IABFLvlDlCVuPyAbqMCKIl4MAdVfvgYLvoAIbkzX0EPPdlB5jiVR2oI6xSmWHg
Yt5mivr+b0eRVg17RneCz/zJjgWjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNVHRMB
Af8EBTADAQH/MB0GA1UdDgQWBBSoVH4AOp4ATVDNzsnA/8aP/Qx2aDAKBggqhkjO
PQQDAgNIADBFAiARza3fA5h4sFguVfiFEE4JYputzRyZ3CdvfUoR2DNK3QIhAM6j
5WCUR5syguW3xhFRpuQqgztsekBAjoUakQD7mSu/
-----END CERTIFICATE-----
Issuer Subject: MBMxETAPBgNVBAMTCHN3YXJtLWNh
Issuer Public Key: MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEUu+UOUJW4/IBuowIoiXgwB1V++Bgu+gAhuTNfQQ892UHmOJVHagjrFKZYeBi3maK+v5vR5FWDXtGd4LP/MmOBQ==
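Once a worker node in the Swarm cluster is set back to Active, it can receive new tasks:
- when a service is scaled up
- when a service's version is updated
- when another node in the Swarm cluster is drained
- when a task fails on another active node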
Reference: garyond, https://www.jianshu.com/p/df744c4e375e