My AI Journey (35): Using the TensorFlow and PyTorch Docker Images

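Before pulling an image from the remote Docker registry, it is best to check the image's version tags to confirm that the version is the one you want. To see the tags, first open this page: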

https://hub.docker.com/r/library/

Then type the name into the search box at the top left to find the image you want, for example tensorflow:

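In the search results, click through to the image's page, open the Tags tab, find the version you want, and click the copy button to its right to copy the complete docker pull command: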

Running that command downloads the image into the local image store. For example, I chose the TensorFlow 2.0 GPU build for Python 3 that ships with Jupyter:

#####################################################################
#Using the TensorFlow 2.0.0-gpu-py3-jupyter image
#####################################################################

docker pull tensorflow/tensorflow:2.0.0-gpu-py3-jupyter  

...
Digest: sha256:613cdca993785f7c41c744942871fc5358bc0110f6f5cb5b00a4b459356d55e4
Status: Downloaded newer image for tensorflow/tensorflow:2.0.0-gpu-py3-jupyter
docker.io/tensorflow/tensorflow:2.0.0-gpu-py3-jupyter
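
The tag encodes the build variant, so it is worth checking it once more after the pull. A minimal sketch, assuming a POSIX shell; the variable names are my own, and the `docker images` call only runs when the docker CLI is installed:

```shell
# Image reference copied from the pull command above.
image="tensorflow/tensorflow:2.0.0-gpu-py3-jupyter"

# Split the reference into repository and tag with POSIX parameter expansion.
repo="${image%:*}"    # drop the trailing ":tag"
tag="${image##*:}"    # keep only the text after the last ':'
echo "repository: $repo"
echo "tag: $tag"

# List local copies of this repository, only when the docker CLI is present.
if command -v docker >/dev/null 2>&1; then
    docker images "$repo" || echo "docker images failed; is the daemon running?" >&2
fi
```

The TAG column of the listing should show the same tag that was copied from the Tags page.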

#Create and run the container. An option such as --env NVIDIA_VISIBLE_DEVICES=0,1 can also be added to choose which GPUs are visible inside it

docker run --runtime=nvidia -d -it -p 8888:8888 tensorflow/tensorflow:2.0.0-gpu-py3-jupyter bash
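
To make the NVIDIA_VISIBLE_DEVICES option concrete, here is a sketch that assembles the same run command with two GPUs exposed. The GPU list `0,1` is the one from the comment above; `RUN` is a hypothetical opt-in variable of mine, so the script only prints the command unless you set it:

```shell
gpus="0,1"    # GPU indices to expose inside the container
image="tensorflow/tensorflow:2.0.0-gpu-py3-jupyter"

# Build the argument list in the positional parameters so quoting stays intact.
set -- run --runtime=nvidia -d -it \
    --env "NVIDIA_VISIBLE_DEVICES=$gpus" \
    -p 8888:8888 "$image" bash

cmd="docker $*"
echo "$cmd"

# RUN is a hypothetical opt-in switch; by default nothing is executed.
if [ "${RUN:-0}" = "1" ]; then
    docker "$@"
fi
```

Inside a container started this way, `nvidia-smi` should list only the selected GPUs.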

#Next, configure Jupyter:

jupyter notebook --generate-config
#this generates the file /home/USERNAME/.jupyter/jupyter_notebook_config.py

As of notebook 5.3, the first time you log-in using a token, the notebook server should give you the opportunity to setup a password from the user interface.
You will be presented with a form asking for the current _token_, as well as your _new_ _password_ ; enter both and click on Login and setup new password.

#Change the Jupyter password
jupyter notebook password
Enter password:  ****
Verify password: ****
[NotebookPasswordApp] Wrote hashed password to /Users/you/.jupyter/jupyter_notebook_config.json

#Look up the container ID:

docker ps |grep tensorflow
CONTAINER ID        IMAGE                                         COMMAND                  CREATED             STATUS                 PORTS                                                                                                                                                                    NAMES
5bb3b5deb320        tensorflow/tensorflow:2.0.0-gpu-py3-jupyter   "bash -c 'source /et…"   About an hour ago   Up 6 seconds           0.0.0.0:8888->8888/tcp

#Restart the container:
docker restart 5bb3b5deb320
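
Because `grep tensorflow` matches any row that contains that word anywhere, an exact match on the IMAGE column can be safer when the restart is scripted. A sketch, assuming the default `docker ps` table layout; `ids_for_image` is a helper name I made up, and the demo rows are an abbreviated copy of the output shown above:

```shell
# Print the CONTAINER ID of every row whose IMAGE column equals $1.
# Reads `docker ps` style output on stdin and skips the header row.
ids_for_image() {
    awk -v img="$1" 'NR > 1 && $2 == img { print $1 }'
}

# Demo on abbreviated sample output (no docker daemon needed).
id=$(ids_for_image "tensorflow/tensorflow:2.0.0-gpu-py3-jupyter" <<'EOF'
CONTAINER ID   IMAGE                                         STATUS
5bb3b5deb320   tensorflow/tensorflow:2.0.0-gpu-py3-jupyter   Up 6 seconds
EOF
)
echo "$id"    # prints 5bb3b5deb320
```

With a live daemon the helper would be fed from `docker ps` directly, for example `docker ps | ids_for_image tensorflow/tensorflow:2.0.0-gpu-py3-jupyter`. The built-in `docker ps -q --filter ancestor=IMAGE` is another option that avoids text parsing.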

#Jupyter is now reachable. Enter the following address in a browser:
https://192.168.1.205:8888

 

#####################################################################
#Using the PyTorch 1.3.1 image
#####################################################################
#At the time of writing, latest is the same as 1.3-cuda10.1-cudnn7-runtime. Pull the runtime or the devel build as needed; I use runtime:
docker pull pytorch/pytorch:latest
#runtime version
docker pull pytorch/pytorch:1.3-cuda10.1-cudnn7-runtime
#devel (development) version
docker pull pytorch/pytorch:1.3-cuda10.1-cudnn7-devel

nvidia-docker run -dit --name pytorch1.3 --env NVIDIA_VISIBLE_DEVICES=2 -p 8888:8888 pytorch/pytorch:1.3-cuda10.1-cudnn7-runtime
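
A quick way to confirm that the container sees the GPU is to ask PyTorch from inside it. A sketch, assuming the container was started under the name `pytorch1.3` as in the command above and that `python` is on the container's PATH; the guard skips the check when docker or the container is not available:

```shell
name="pytorch1.3"
# One-liner run inside the container: version, CUDA availability, GPU count.
check='import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())'

if command -v docker >/dev/null 2>&1 &&
   [ -n "$(docker ps -q --filter "name=$name" 2>/dev/null)" ]; then
    docker exec "$name" python -c "$check"
else
    echo "container $name is not running; skipping the check"
fi
```

With NVIDIA_VISIBLE_DEVICES=2 as above, I would expect the device count to be 1, since only one GPU is exposed to the container.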

 

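Incidentally, there is a Docker image called Deepo that bundles several deep learning frameworks. The frameworks installed in it are not necessarily the latest versions; download and use it as needed: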

### Deepo: a deep learning environment with multiple frameworks ################################
docker pull ufoym/deepo

#Download from mirror registries in mainland China
docker pull registry.docker-cn.com/ufoym/deepo
docker pull hub-mirror.c.163.com/ufoym/deepo
docker pull docker.mirrors.ustc.edu.cn/ufoym/deepo
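
An image pulled through a mirror is stored locally under the mirror-prefixed name, so commands that refer to plain `ufoym/deepo` will not find it until it is retagged. A sketch of trying each mirror in turn and retagging on success; `mirror_ref` and the `RUN` switch are my own names, and whether any of these mirrors is still reachable is an open assumption:

```shell
image="ufoym/deepo"
mirrors="registry.docker-cn.com hub-mirror.c.163.com docker.mirrors.ustc.edu.cn"

# Build the mirror-prefixed reference for an image.
mirror_ref() {
    echo "$1/$2"
}

for m in $mirrors; do
    ref=$(mirror_ref "$m" "$image")
    echo "candidate: $ref"
    # RUN is a hypothetical opt-in switch; by default nothing is pulled.
    if [ "${RUN:-0}" = "1" ] && docker pull "$ref"; then
        docker tag "$ref" "$image"    # make the image available under its usual name
        break
    fi
done
```

Run as is, the script only prints the three candidate references; with `RUN=1` it stops at the first mirror that answers.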

#Test

docker run --gpus all --rm ufoym/deepo nvidia-smi

#Run the container

docker run --gpus all -it ufoym/deepo bash
docker run --gpus all -it -v /<home>/data:/data -v /<home>/config:/config ufoym/deepo bash

#Pull the image for a single framework only:
docker pull ufoym/deepo:tensorflow
#Use Jupyter notebook
#--NotebookApp.token='' sets an empty token, which turns off token authentication
docker run --gpus all -it -p 8888:8888 --ipc=host ufoym/deepo jupyter notebook --no-browser --ip=0.0.0.0 --allow-root --NotebookApp.token=''

#How to fix the shared-memory shortage that can occur while training models with PyTorch:
####Please note that some frameworks (e.g. PyTorch) use shared memory to share data between processes, 
####so if multiprocessing is used the default shared memory segment size that container runs with is not enough, 
####and you should increase shared memory size either with --ipc=host or --shm-size command line options to docker run:
docker run --gpus all -it --ipc=host ufoym/deepo bash
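
The note above also names `--shm-size` as an alternative to `--ipc=host`. To see which size is actually in effect, the small helper below can be run inside the container. It is a sketch assuming a POSIX `df` and `awk`; `shm_mib` is a name I made up:

```shell
# Report the size of /dev/shm in MiB, or "unknown" when it cannot be read.
shm_mib() {
    size=$(df -Pk /dev/shm 2>/dev/null | awk 'NR == 2 { print int($2 / 1024) }')
    echo "${size:-unknown}"
}

echo "/dev/shm size (MiB): $(shm_mib)"
```

Docker's default is 64 MB, so a value of 64 means neither option took effect. A run such as `docker run --gpus all -it --shm-size=8g ufoym/deepo bash` raises it; the `8g` figure is only an illustrative value to be sized to your own data-loading workload.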
