First, pull the official CUDA image provided by NVIDIA:
docker pull nvidia/cuda:10.0-runtime-ubuntu18.04
The available image tags are listed at:
https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md
Modify the image
Start a container from the image and open a shell inside it:
docker run -it nvidia/cuda:10.0-runtime-ubuntu18.04 /bin/bash
Install Python
apt-get update
apt-get install python3.6
Install pip
apt install python3-pip
Installing tensorflow==1.15 failed because the pip version was too old, so upgrade pip:
pip3 install --upgrade pip
After installing tensorflow, a warning reported that the setuptools version was too old, so upgrade it as well:
pip3 install --upgrade setuptools
After installation, opencv could not be used and reported missing system libraries, so install them:
apt install libsm6 libxext6 libxrender-dev
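To see which shared libraries are still missing before importing opencv, a small check along the following lines can be run inside the container. This is only a sketch: the helper name missing_libraries is hypothetical, and the short names SM, Xext and Xrender are assumptions chosen to match the packages above. ctypes.util.find_library depends on tools such as ldconfig being available in the image.

```python
from ctypes.util import find_library


def missing_libraries(names, finder=find_library):
    """Return the library short names that the finder cannot locate."""
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    # Assumed short names for the libraries shipped by libsm6, libxext6
    # and libxrender (illustrative, not taken from the original notes).
    wanted = ["SM", "Xext", "Xrender"]
    missing = missing_libraries(wanted)
    if missing:
        print("Missing shared libraries:", ", ".join(missing))
    else:
        print("All listed shared libraries were found.")
```

The finder argument exists only so the check can be exercised without touching the real system. In normal use the default is enough.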
Change the apt source
First, back up the default sources list:
cd /etc/apt
cp sources.list sources.list.backup
Then replace the contents of sources.list with the Aliyun mirror entries (sources.list can be prepared outside the image and then copied into it):
deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
Note: bionic is the codename of Ubuntu 18.04. For other releases, substitute the corresponding codename:
Release | Codename |
---|---|
Ubuntu 12.04 (LTS) | precise |
Ubuntu 14.04 (LTS) | trusty |
Ubuntu 15.04 | vivid |
Ubuntu 15.10 | wily |
Ubuntu 16.04 (LTS) | xenial |
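Since only the codename changes between releases, the entries can also be generated instead of edited by hand. The following is a minimal sketch under that assumption. The function name build_sources_list and the constants are hypothetical, and the mirror URL, components and suites simply repeat the bionic listing above.

```python
MIRROR = "http://mirrors.aliyun.com/ubuntu/"
COMPONENTS = "main restricted universe multiverse"
# Suites used in the bionic listing, written as suffixes of the codename.
SUFFIXES = ["", "-security", "-updates", "-backports", "-proposed"]


def build_sources_list(codename, mirror=MIRROR):
    """Return sources.list text for the given Ubuntu codename."""
    lines = []
    for suffix in SUFFIXES:
        for kind in ("deb", "deb-src"):
            lines.append(f"{kind} {mirror} {codename}{suffix} {COMPONENTS}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: print the entries for Ubuntu 16.04 (xenial).
    print(build_sources_list("xenial"), end="")
```

Redirecting the output to a file outside the image and copying it to /etc/apt/sources.list follows the same workflow as described above.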
Change the pip source
If ~/.pip/pip.conf does not exist, create it:
mkdir -p ~/.pip
vim ~/.pip/pip.conf
Contents of pip.conf:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
[install]
trusted-host = pypi.tuna.tsinghua.edu.cn
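The same file can be produced without an editor, which is convenient when vim is not installed in the image. The sketch below uses Python's configparser. The helper names render_pip_conf and write_pip_conf are hypothetical. pip documents trusted-host as a host or host:port pair, so the sketch uses the bare host name rather than a full URL.

```python
import configparser
import io
from pathlib import Path

INDEX_URL = "https://pypi.tuna.tsinghua.edu.cn/simple"
# trusted-host is given as a bare host name, without the https:// scheme.
TRUSTED_HOST = "pypi.tuna.tsinghua.edu.cn"


def render_pip_conf(index_url=INDEX_URL, trusted_host=TRUSTED_HOST):
    """Return the text of a pip.conf that points pip at the mirror."""
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    config["install"] = {"trusted-host": trusted_host}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


def write_pip_conf(path):
    """Write the rendered configuration to path, creating parent directories."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_pip_conf())
    return path


if __name__ == "__main__":
    # Only prints the file contents; nothing is written to disk here.
    print(render_pip_conf(), end="")
```

Calling write_pip_conf(Path.home() / ".pip" / "pip.conf") would place the file at the location used in the steps above.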
Clear the pip cache
rm -rf ~/.cache/pip
Install openssh-server and curl so that the web-based SSH terminal on the cluster can open a session in the container:
apt install openssh-server curl
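Point the python command to python3
The container shell already runs as root, so sudo is not needed (and is usually not installed in this image).
Step 1: back up the existing python file, if there is one:
cp /usr/bin/python /usr/bin/python_bak
Step 2: remove the existing file that points to python2:
rm -f /usr/bin/python
Step 3: create a new link that points to python3:
ln -s /usr/bin/python3 /usr/bin/python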
Other issues
- If import tensorflow fails with the following error, install libgomp1:
libgomp.so.1: cannot open shared object file: No such file or directory
apt-get install libgomp1
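- Check whether tensorflow can use the GPU:
import tensorflow as tf
# Prints the GPU device name (for example /device:GPU:0), or an empty string if no GPU is visible
print(tf.test.gpu_device_name())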
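When this check runs as a script rather than in an interactive session, it helps to report import failures and the empty-string case explicitly. The following is a sketch only. The helper name describe_gpu and the message texts are hypothetical, and it assumes that tf.test.gpu_device_name() returns an empty string when no GPU is found.

```python
def describe_gpu(device_name):
    """Turn the string from tf.test.gpu_device_name() into a readable message."""
    if device_name:
        return f"GPU available: {device_name}"
    return "No GPU visible to tensorflow"


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError as error:
        # Raised when tensorflow is not installed or when a shared library
        # such as libgomp.so.1 cannot be loaded.
        print(f"tensorflow could not be imported: {error}")
    else:
        print(describe_gpu(tf.test.gpu_device_name()))
```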