背景:
計劃使用Centos7 Core 作爲 docker host,搭建測試平臺,只有4臺主機,就使用Swarm方案。
環境:
主機名稱 | 系統 | IP地址 | 前置工作 |
docker-host1 | CentOS7 Core | xxx.xxx.xxx.80 |
更新yum源,使用aliyun鏡像,更新到最新狀態 配置docker-machine 使用的用戶,sudo免密碼 |
docker-host2 | CentOS7 Core | xxx.xxx.xxx.81 |
更新yum源,使用aliyun鏡像,更新到最新狀態 配置docker-machine 使用的用戶,sudo免密碼 |
docker-host3 | CentOS7 Core | xxx.xxx.xxx.82 |
更新yum源,使用aliyun鏡像,更新到最新狀態 配置docker-machine 使用的用戶,sudo免密碼 |
docker-host4 | CentOS7 Core | xxx.xxx.xxx.83 |
更新yum源,使用aliyun鏡像,更新到最新狀態 配置docker-machine 使用的用戶,sudo免密碼 |
desktop | ubuntu-18.04lts | 配置免祕鑰登錄到docker-host |
安裝:
在工作desktop(Ubuntu)主機上安裝 docker-machine,可以參考https://docs.docker.com/machine/install-machine/
首先安裝 docker-host1:
docker-machine --debug create --driver generic --generic-ip-address=xxx.xxx.xxx.80 --generic-ssh-key=/home/sleeber/.ssh/id_rsa --generic-ssh-port=22 --generic-ssh-user=wntime docker-host1
安裝最後輸出,顯示docker 服務無法啓動:
sudo systemctl -f start docker
SSH cmd err, output: exit status 1: Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
Error creating machine: Error running provisioning: something went wrong running an SSH command
command : sudo systemctl -f start docker
err : exit status 1
output : Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
notifying bugsnag: [Error creating machine: Error running provisioning: something went wrong running an SSH command
command : sudo systemctl -f start docker
err : exit status 1
output : Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
]
使用docker-machine ls 查看主機狀態:
:~/docker-test-env$ docker-machine ls
NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
docker-host1 - generic Running tcp://xxx.xxx.xxx.80:2376 Unknown Unable to query docker version: Cannot connect to the docker engine endpoint
SSH到docker-host1 主機,手動啓動docker服務
sudo systemctl start docker
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
按照提示查看狀態
systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-machine.conf
Active: failed (Result: start-limit) since Thu 2020-05-28 20:41:39 EDT; 581ms ago
Docs: https://docs.docker.com
Process: 30463 ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver overlay2 --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic (code=exited, status=1/FAILURE)
Main PID: 30463 (code=exited, status=1/FAILURE)
谷歌一下:
https://github.com/moby/moby/issues/33931
https://www.jianshu.com/p/bd395fdf7611
https://www.jianshu.com/p/93518610eea1
按照搜索結果看,應該是10-machine.conf引起;但是刪除了10-machine.conf文件,及文件夾還是無法啓動
$ sudo rm /etc/systemd/system/docker.service.d/10-machine.conf
$ sudo systemctl start docker
Failed to start docker.service: Unit is not loaded properly: Invalid argument.
See system logs and 'systemctl status docker.service' for details.
$ sudo rm -rf /etc/systemd/system/docker.service.d/
$ sudo systemctl start docker
Failed to start docker.service: Unit is not loaded properly: Invalid argument.
See system logs and 'systemctl status docker.service' for details.
使用dockerd 直接能啓動
$ sudo nohup dockerd &
[1] 31419
$ nohup: ignoring input and appending output to ‘nohup.out’
$ ll
total 4
-rw-------. 1 root root 2638 May 28 20:52 nohup.out
$ tail -f nohup.out
tail: cannot open ‘nohup.out’ for reading: Permission denied
tail: no files remaining
$ sudo tail -f nohup.out
time="2020-05-28T20:52:56.409646032-04:00" level=warning msg="Base device already exists and has filesystem xfs on it. User specified filesystem will be ignored." storage-driver=devicemapper
time="2020-05-28T20:52:56.430282474-04:00" level=info msg="[graphdriver] using prior storage driver: devicemapper"
time="2020-05-28T20:52:56.430326847-04:00" level=warning msg="[graphdriver] WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release"
time="2020-05-28T20:52:56.433336530-04:00" level=warning msg="mountpoint for pids not found"
time="2020-05-28T20:52:56.433595847-04:00" level=info msg="Loading containers: start."
time="2020-05-28T20:52:56.529329426-04:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2020-05-28T20:52:56.558933143-04:00" level=info msg="Loading containers: done."
time="2020-05-28T20:52:56.570819898-04:00" level=info msg="Docker daemon" commit=9d988398e7 graphdriver(s)=devicemapper version=19.03.9
time="2020-05-28T20:52:56.570873940-04:00" level=info msg="Daemon has completed initialization"
time="2020-05-28T20:52:56.585131361-04:00" level=info msg="API listen on /var/run/docker.sock"
^C
$ sudo docker version
Client: Docker Engine - Community
Version: 19.03.9
API version: 1.40
Go version: go1.13.10
Git commit: 9d988398e7
Built: Fri May 15 00:25:27 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.9
API version: 1.40 (minimum version 1.12)
Go version: go1.13.10
Git commit: 9d988398e7
Built: Fri May 15 00:24:05 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
使用systemctl 查看docker服務狀態還是顯示失敗
$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-machine.conf
Active: failed (Result: start-limit) since Thu 2020-05-28 20:49:52 EDT; 4min 59s ago
Docs: https://docs.docker.com
Process: 31051 ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver overlay2 --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic (code=exited, status=1/FAILURE)
Main PID: 31051 (code=exited, status=1/FAILURE)
May 28 20:49:50 test-java2 systemd[1]: docker.service failed.
May 28 20:49:52 test-java2 systemd[1]: docker.service holdoff time over, scheduling restart.
May 28 20:49:52 test-java2 systemd[1]: Stopped Docker Application Container Engine.
May 28 20:49:52 test-java2 systemd[1]: start request repeated too quickly for docker.service
May 28 20:49:52 test-java2 systemd[1]: Failed to start Docker Application Container Engine.
May 28 20:49:52 test-java2 systemd[1]: Unit docker.service entered failed state.
May 28 20:49:52 test-java2 systemd[1]: docker.service failed.
May 28 20:50:45 test-java2 systemd[1]: start request repeated too quickly for docker.service
May 28 20:50:45 test-java2 systemd[1]: Failed to start Docker Application Container Engine.
May 28 20:50:45 test-java2 systemd[1]: docker.service failed.
推斷 docker-machine 安裝docker還是有些問題,只能手動安裝
注意:卸載docker後,需要手動刪除/etc/systemd/system/docker.service.d/ 文件夾,否則新安裝的docker也是不能啓動的。
卸載現有docker,並重新安裝
sudo yum remove docker* \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
sudo rm -rf /etc/systemd/system/docker.service.d/
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y docker-ce docker-ce-cli containerd.io
啓動docker,並查看docker服務狀態
$ sudo docker version
Client: Docker Engine - Community
Version: 19.03.10
API version: 1.40
Go version: go1.13.10
Git commit: 9424aeaee9
Built: Thu May 28 22:18:06 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.10
API version: 1.40 (minimum version 1.12)
Go version: go1.13.10
Git commit: 9424aeaee9
Built: Thu May 28 22:16:43 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
可以正常啓動了