CentOS 7: Docker service fails to start when provisioning a Docker host with docker-machine

Background:

I planned to use CentOS 7 Core as the Docker host OS for a test platform. With only four hosts available, I went with Swarm.

Environment:

Hostname       OS                 IP address       Prerequisites
docker-host1   CentOS 7 Core      xxx.xxx.xxx.80   A, B
docker-host2   CentOS 7 Core      xxx.xxx.xxx.81   A, B
docker-host3   CentOS 7 Core      xxx.xxx.xxx.82   A, B
docker-host4   CentOS 7 Core      xxx.xxx.xxx.83   A, B
desktop        Ubuntu 18.04 LTS   (not listed)     C

A: yum repositories switched to the Aliyun mirror and the system updated to the latest state
B: the user that docker-machine logs in as is configured for passwordless sudo
C: passwordless SSH login to the docker hosts is configured
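The two prerequisites above (passwordless sudo on the hosts, passwordless SSH from the desktop) can be sketched roughly as follows. This is only an illustration: `wntime` is the SSH user passed to `docker-machine create` in this post, while the sudoers file name `/etc/sudoers.d/docker-machine` is a hypothetical choice, and the exact sudo policy should follow your own security requirements.

```shell
MACHINE_USER="wntime"   # SSH user taken from the create command in this post

# Print a sudoers rule granting passwordless sudo to the given user.
sudoers_line() {
  printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
}

# On each docker host, as root (review before running; file name is hypothetical):
#   sudoers_line "$MACHINE_USER" > /etc/sudoers.d/docker-machine
#   chmod 0440 /etc/sudoers.d/docker-machine
#   visudo -cf /etc/sudoers.d/docker-machine    # syntax check only
#
# On the desktop, install the public key on each host:
#   ssh-copy-id -i /home/sleeber/.ssh/id_rsa.pub "$MACHINE_USER@xxx.xxx.xxx.80"

# Show the rule that would be written.
sudoers_line "$MACHINE_USER"
```

Only the last line runs as written; the commands that change system state are left as comments so they can be reviewed first.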

Installation:

Install docker-machine on the working desktop (Ubuntu) host; see https://docs.docker.com/machine/install-machine/

Provision docker-host1 first:

docker-machine --debug create --driver generic --generic-ip-address=xxx.xxx.xxx.80   --generic-ssh-key=/home/sleeber/.ssh/id_rsa --generic-ssh-port=22 --generic-ssh-user=wntime docker-host1
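Since the same command has to be repeated for all four hosts, it may help to generate the commands first and review them before running anything. The sketch below only prints the commands; it assumes the hosts are named docker-host1 to docker-host4 with addresses ending in .80 to .83 as in the environment table, and the variable and function names are mine, not part of docker-machine.

```shell
SSH_USER="wntime"                      # from the command in this post
SSH_KEY="/home/sleeber/.ssh/id_rsa"    # from the command in this post
IP_PREFIX="xxx.xxx.xxx"                # placeholder, as in this post

# Print (do not run) the create command for one host.
# $1 = machine name, $2 = IP address
create_cmd() {
  printf 'docker-machine create --driver generic --generic-ip-address=%s --generic-ssh-key=%s --generic-ssh-port=22 --generic-ssh-user=%s %s\n' \
    "$2" "$SSH_KEY" "$SSH_USER" "$1"
}

# docker-host1..4 map to .80..83
for n in 1 2 3 4; do
  create_cmd "docker-host$n" "$IP_PREFIX.$((79 + n))"
done
```

Replace the placeholder prefix with the real network prefix, check the printed lines, then paste or pipe them into a shell.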

The last part of the output shows that the docker service could not be started:

sudo systemctl -f start docker
SSH cmd err, output: exit status 1: Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.

Error creating machine: Error running provisioning: something went wrong running an SSH command
command : sudo systemctl -f start docker
err     : exit status 1
output  : Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.

notifying bugsnag: [Error creating machine: Error running provisioning: something went wrong running an SSH command
command : sudo systemctl -f start docker
err     : exit status 1
output  : Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
]

Check the machine state with docker-machine ls:

:~/docker-test-env$ docker-machine ls
NAME           ACTIVE   DRIVER    STATE     URL                         SWARM   DOCKER    ERRORS
docker-host1   -        generic   Running   tcp://xxx.xxx.xxx.80:2376           Unknown   Unable to query docker version: Cannot connect to the docker engine endpoint

SSH into docker-host1 and start the docker service manually:

sudo systemctl start docker
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.

Check the status as the message suggests:

 systemctl status docker.service
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/docker.service.d
           └─10-machine.conf
   Active: failed (Result: start-limit) since Thu 2020-05-28 20:41:39 EDT; 581ms ago
     Docs: https://docs.docker.com
  Process: 30463 ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver overlay2 --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic (code=exited, status=1/FAILURE)
 Main PID: 30463 (code=exited, status=1/FAILURE)
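The status output above only says that dockerd exited with status 1, not why. The reason is normally in the journal for the unit. A small sketch for pulling out the relevant lines follows; the grep patterns are assumptions about how dockerd words its errors and are not exhaustive, and the function name is mine.

```shell
# Keep only lines that look like dockerd error or fatal messages.
# The patterns are assumptions about dockerd's log wording.
docker_errors() {
  grep -E 'level=(error|fatal)|failed to start daemon|Error starting daemon'
}

# On the affected host (reads the systemd journal for the docker unit):
#   sudo journalctl -u docker.service --no-pager -n 200 | docker_errors
```

If the filter prints nothing, reading the unfiltered `journalctl -u docker.service` output is the fallback.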

A quick Google search turned up:

https://github.com/moby/moby/issues/33931

https://www.jianshu.com/p/bd395fdf7611

https://www.jianshu.com/p/93518610eea1

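According to the search results, 10-machine.conf should be the cause. However, after deleting 10-machine.conf, and then the whole directory, the service still would not start: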

$ sudo rm /etc/systemd/system/docker.service.d/10-machine.conf
$ sudo systemctl start docker
Failed to start docker.service: Unit is not loaded properly: Invalid argument.
See system logs and 'systemctl status docker.service' for details.
$ sudo rm -rf /etc/systemd/system/docker.service.d/
$ sudo systemctl start docker
Failed to start docker.service: Unit is not loaded properly: Invalid argument.
See system logs and 'systemctl status docker.service' for details.
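systemd normally needs `systemctl daemon-reload` after unit files or drop-ins are changed or removed, and the transcript above does not show one. Whether that alone would have fixed the start failure here is not something this post verified, so treat the following as a sketch of what could be tried, not as the confirmed fix. The `run` helper and `DRY_RUN` variable are my own additions; by default the commands are only printed.

```shell
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

reload_docker_unit() {
  run sudo systemctl daemon-reload                  # re-read unit files and drop-ins
  run sudo systemctl reset-failed docker.service    # clear the recorded failed state
  run sudo systemctl start docker.service
  run systemctl status docker.service --no-pager
}

reload_docker_unit
```

Run it with `DRY_RUN=0` on the affected host to execute the commands for real.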

Running dockerd directly does work:

$ sudo nohup dockerd &
[1] 31419
$ nohup: ignoring input and appending output to ‘nohup.out’

$ ll
total 4
-rw-------. 1 root root 2638 May 28 20:52 nohup.out
$ tail -f nohup.out
tail: cannot open ‘nohup.out’ for reading: Permission denied
tail: no files remaining
$ sudo tail -f nohup.out
time="2020-05-28T20:52:56.409646032-04:00" level=warning msg="Base device already exists and has filesystem xfs on it. User specified filesystem  will be ignored." storage-driver=devicemapper
time="2020-05-28T20:52:56.430282474-04:00" level=info msg="[graphdriver] using prior storage driver: devicemapper"
time="2020-05-28T20:52:56.430326847-04:00" level=warning msg="[graphdriver] WARNING: the devicemapper storage-driver is deprecated, and will be removed in a future release"
time="2020-05-28T20:52:56.433336530-04:00" level=warning msg="mountpoint for pids not found"
time="2020-05-28T20:52:56.433595847-04:00" level=info msg="Loading containers: start."
time="2020-05-28T20:52:56.529329426-04:00" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
time="2020-05-28T20:52:56.558933143-04:00" level=info msg="Loading containers: done."
time="2020-05-28T20:52:56.570819898-04:00" level=info msg="Docker daemon" commit=9d988398e7 graphdriver(s)=devicemapper version=19.03.9
time="2020-05-28T20:52:56.570873940-04:00" level=info msg="Daemon has completed initialization"
time="2020-05-28T20:52:56.585131361-04:00" level=info msg="API listen on /var/run/docker.sock"
^C
$ sudo docker version
Client: Docker Engine - Community
 Version:           19.03.9
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        9d988398e7
 Built:             Fri May 15 00:25:27 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.9
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       9d988398e7
  Built:            Fri May 15 00:24:05 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

systemctl still reports the docker service as failed:

$ sudo systemctl status docker
● docker.service - Docker Application Container Engine
   Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/docker.service.d
           └─10-machine.conf
   Active: failed (Result: start-limit) since Thu 2020-05-28 20:49:52 EDT; 4min 59s ago
     Docs: https://docs.docker.com
  Process: 31051 ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver overlay2 --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=generic (code=exited, status=1/FAILURE)
 Main PID: 31051 (code=exited, status=1/FAILURE)

May 28 20:49:50 test-java2 systemd[1]: docker.service failed.
May 28 20:49:52 test-java2 systemd[1]: docker.service holdoff time over, scheduling restart.
May 28 20:49:52 test-java2 systemd[1]: Stopped Docker Application Container Engine.
May 28 20:49:52 test-java2 systemd[1]: start request repeated too quickly for docker.service
May 28 20:49:52 test-java2 systemd[1]: Failed to start Docker Application Container Engine.
May 28 20:49:52 test-java2 systemd[1]: Unit docker.service entered failed state.
May 28 20:49:52 test-java2 systemd[1]: docker.service failed.
May 28 20:50:45 test-java2 systemd[1]: start request repeated too quickly for docker.service
May 28 20:50:45 test-java2 systemd[1]: Failed to start Docker Application Container Engine.
May 28 20:50:45 test-java2 systemd[1]: docker.service failed.
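One detail in the output above is worth a look: the unit's ExecStart passes `--storage-driver overlay2`, while the manually started daemon logged "using prior storage driver: devicemapper". This post does not establish whether that mismatch is what stops the service, but it is cheap to check. The sketch below compares the two values; the function names are mine, and the sample lines are shortened copies of the output shown in this post.

```shell
# Value following --storage-driver in an ExecStart line read from stdin.
requested_driver() {
  sed -n 's/.*--storage-driver[ =]\([^ ]*\).*/\1/p' | head -n 1
}

# Driver named in the "graphdriver(s)=..." field of a dockerd log line on stdin.
running_driver() {
  sed -n 's/.*graphdriver(s)=\([^ ]*\).*/\1/p' | head -n 1
}

# Sample lines copied (shortened) from the output in this post.
unit_line='ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 --storage-driver overlay2 --tlsverify'
log_line='msg="Docker daemon" commit=9d988398e7 graphdriver(s)=devicemapper version=19.03.9'

want=$(printf '%s\n' "$unit_line" | requested_driver)
have=$(printf '%s\n' "$log_line" | running_driver)

if [ "$want" != "$have" ]; then
  echo "mismatch: unit requests $want, daemon is using $have"
fi
```

On a live host the two sample strings could be replaced with the output of `systemctl cat docker.service` and of the daemon log respectively.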

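My conclusion: the Docker installation performed by docker-machine still has problems, so the only option was to install Docker manually.

Note: after uninstalling Docker, you must manually delete the /etc/systemd/system/docker.service.d/ directory; otherwise the newly installed Docker will not start either.

Uninstall the existing Docker and reinstall: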

sudo yum remove 'docker*' \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-engine

sudo rm -rf /etc/systemd/system/docker.service.d/

sudo yum install -y yum-utils device-mapper-persistent-data lvm2

sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

sudo yum install -y docker-ce docker-ce-cli containerd.io

Start docker and check that the daemon responds:

$ sudo docker version
Client: Docker Engine - Community
 Version:           19.03.10
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        9424aeaee9
 Built:             Thu May 28 22:18:06 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.10
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       9424aeaee9
  Built:            Thu May 28 22:16:43 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Docker now starts normally.
