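Preface
Setting up a deep learning environment often runs into the same problem: different open-source projects expect different environments, and the CUDA version in particular varies (CUDA 9.0, CUDA 10.0, CUDA 10.1, and so on), each with its own matching cuDNN release. This article adds CUDA 9.0 + cuDNN 7.3.1 to a system that already has CUDA 10.0 + cuDNN 7.6.4 installed.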
1. Downgrading gcc
CUDA 9.0 supports only gcc 6 and earlier, while Ubuntu 18.04 ships with GCC 7.3, so gcc has to be downgraded manually. Version 4.8 is used here.
sudo apt-get install gcc-4.8
sudo apt-get install g++-4.8
After installation, change to the /usr/bin directory and list the gcc entries:
ls -l gcc*
The listing shows /usr/bin/gcc -> gcc-7, meaning gcc currently links to gcc-7. It needs to link to gcc-4.8 instead:
sudo mv gcc gcc.bak # back up the existing link
sudo ln -s gcc-4.8 gcc # relink gcc to gcc-4.8
Do the same for g++, relinking it to g++-4.8:
sudo mv g++ g++.bak
sudo ln -s g++-4.8 g++
Then check the gcc and g++ versions:
gcc -v
g++ -v
Both report gcc version 4.8, which confirms that gcc 4.8 is now the active compiler.
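The version check above can also be scripted. The following is a minimal sketch, assuming the gcc 6 limit stated earlier; the function name gcc_ok_for_cuda9 is made up for illustration and is not part of any CUDA tooling.

```shell
# Hypothetical helper: succeeds when the given gcc version string has a
# major version of 6 or lower, the limit for CUDA 9.0 stated above.
gcc_ok_for_cuda9() {
    local version="$1"
    local major="${version%%.*}"   # "4.8.5" -> "4", "7" -> "7"
    [ "$major" -le 6 ]
}

# Report on whichever gcc is first on PATH, if there is one.
if command -v gcc >/dev/null 2>&1; then
    current="$(gcc -dumpversion)"
    if gcc_ok_for_cuda9 "$current"; then
        echo "gcc $current is usable with CUDA 9.0"
    else
        echo "gcc $current is too new for CUDA 9.0"
    fi
fi
```

Running it before and after the relinking step should show the message flip from "too new" to "usable".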
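2. Installing CUDA 9.0
Download the runfile from the CUDA 9.0 download page on the NVIDIA website, selecting Ubuntu 16.04. (Note: the CUDA build for 16.04 can be installed on an 18.04 system.)
Once the file is downloaded, install it with the following commands: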
sudo chmod a+x cuda_9.0.176_384.81_linux.run
sudo ./cuda_9.0.176_384.81_linux.run --no-opengl-libs
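Because the graphics driver was already installed before CUDA 10.0, answer no when asked whether to install the driver. For the other prompts, accept the default path or answer yes, except for the symbolic link prompt, which is explained in the annotated transcript below.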
----------------------------------------------------------------------------------
Do you accept the previously read EULA?
accept/decline/quit: accept # accept the CUDA license agreement
You are attempting to install on an unsupported configuration. Do you wish to continue?
(y)es/(n)o [ default is no ]: y
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
(y)es/(n)o/(q)uit: n # the graphics driver is already installed, so choose n
Install the CUDA 9.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-9.0 ]: # toolkit install location; press Enter to accept the default
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: n # Pay attention to this symlink prompt: CUDA 10.0 is already installed, so no is recommended here, because answering yes would point /usr/local/cuda at the new version
Install the CUDA 9.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /home/ubuntu ]:
Installing the CUDA Toolkit in /usr/local/cuda-9.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Missing recommended library: libGL.so
Installing the CUDA Samples in /home/ubuntu ...
Copying samples to /home/ubuntu/NVIDIA_CUDA-9.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-9.0
Samples: Installed in /home/ubuntu, but missing recommended libraries
Please make sure that
- PATH includes /usr/local/cuda-9.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-9.0/lib64, or, add /usr/local/cuda-9.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver
Logfile is /tmp/cuda_install_14444.log
# ***Installation is complete at this point***
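After the installer finishes, both toolkits should sit side by side under /usr/local. As a quick check, the sketch below lists every cuda-* directory that contains an executable bin/nvcc. The function name list_cuda_toolkits is hypothetical, and the /usr/local default is taken from the install paths in the transcript.

```shell
# Hypothetical helper: print the version suffix of every cuda-* directory
# under the given root that contains an executable bin/nvcc.
list_cuda_toolkits() {
    local root="${1:-/usr/local}"
    local dir
    for dir in "$root"/cuda-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        if [ -x "$dir/bin/nvcc" ]; then
            echo "${dir##*/cuda-}"   # ".../cuda-9.0" -> "9.0"
        fi
    done
}

# List the toolkits installed in the default location.
list_cuda_toolkits
```

Directories without bin/nvcc are skipped, so a partially removed toolkit does not show up in the list.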
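After installation, update the environment variables. Edit ~/.bashrc (or /etc/profile with sudo for a system-wide setting):
vim ~/.bashrc
Add the following lines, then run source ~/.bashrc: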
export PATH=/usr/local/cuda-9.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-9.0
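Alternatively (recommended), point the variables at the /usr/local/cuda symlink. Switching CUDA versions later then only requires recreating the symlink, with no path changes: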
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
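One side effect of plain export lines is that every source ~/.bashrc prepends the same directory again. The sketch below is one possible way to keep the entries unique. The helper name prepend_unique is an assumption made for this example, and CUDA_HOME is set to a single directory rather than a colon-separated list.

```shell
# Hypothetical helper: print the colon-separated list "$2" with directory "$1"
# prepended, unless "$1" is already one of its entries.
prepend_unique() {
    local dir="$1"
    local current="$2"
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;                # already present
        *) printf '%s\n' "$dir${current:+:$current}" ;;        # add at the front
    esac
}

# Same variables as above, using the symlinked path.
export PATH="$(prepend_unique /usr/local/cuda/bin "$PATH")"
export LD_LIBRARY_PATH="$(prepend_unique /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}")"
export CUDA_HOME=/usr/local/cuda
```

Sourcing the file repeatedly then leaves PATH and LD_LIBRARY_PATH unchanged after the first run.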
3. Installing cuDNN
Download the cuDNN archive that matches the CUDA version, for example cudnn-9.0-linux-x64-v7.3.1.20.tgz.
Extract it:
tar -zxvf cudnn-9.0-linux-x64-v7.3.1.20.tgz
Copy the header and libraries into the CUDA 9.0 directory and make them readable:
sudo cp cuda/include/cudnn.h /usr/local/cuda-9.0/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-9.0/lib64/
sudo chmod a+r /usr/local/cuda-9.0/include/cudnn.h
sudo chmod a+r /usr/local/cuda-9.0/lib64/libcudnn*
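Once the header is in place, the installed cuDNN version can be read back from it. The sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL on separate #define lines, as the cuDNN 7.x cudnn.h does; the function name cudnn_version is made up for illustration.

```shell
# Hypothetical helper: print "major.minor.patch" from a cuDNN 7.x header.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# Report the version of the copy installed into the CUDA 9.0 tree, if present.
header=/usr/local/cuda-9.0/include/cudnn.h
if [ -f "$header" ]; then
    echo "cuDNN $(cudnn_version "$header")"
fi
```

For the archive used in this article the expected result is 7.3.1.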
4. Selecting the CUDA Version
(1) Point the cuda symlink at the newly installed CUDA 9.0:
cd /usr/local
sudo rm -rf cuda # remove the existing symlink
sudo ln -s cuda-9.0 cuda # recreate the symlink
Check the current CUDA version:
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
Check the cuDNN version:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
(2) Switch back to CUDA 10.0:
sudo rm -rf /usr/local/cuda # remove the existing symlink
sudo ln -s /usr/local/cuda-10.0/ /usr/local/cuda
nvcc -V # check the current CUDA version
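The two switching procedures differ only in the version number, so they can be folded into one function. The following is a rough sketch, not a drop-in tool: the name switch_cuda and its arguments are assumptions for illustration. It uses ln -sfn to replace the link in one step instead of rm followed by ln, and it refuses to touch a cuda entry that is a real directory.

```shell
# Hypothetical helper: repoint <root>/cuda at <root>/cuda-<version>.
# Usage: switch_cuda 9.0 [root]   (root defaults to /usr/local)
switch_cuda() {
    local version="$1"
    local root="${2:-/usr/local}"

    if [ ! -d "$root/cuda-$version" ]; then
        echo "no toolkit found at $root/cuda-$version" >&2
        return 1
    fi
    # Only ever replace a symlink, never a real directory.
    if [ -e "$root/cuda" ] && [ ! -L "$root/cuda" ]; then
        echo "$root/cuda exists and is not a symlink" >&2
        return 1
    fi
    # -s: symbolic, -f: replace an existing link, -n: do not follow it.
    ln -sfn "cuda-$version" "$root/cuda"
}
```

Under /usr/local the ln call needs root privileges, so in practice the function would be run from a root shell or from a script started with sudo. After switch_cuda 9.0 or switch_cuda 10.0, nvcc -V should report the selected release.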
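5. Afterword
(1) CUDA 9.0 also comes with four patch packages, which can be installed as needed. They are mainly updates to cuBLAS. Since they are official patches, installing them is the safest option.
(2) The cuDNN download page for CUDA 9.0 also lists three additional packages, which can be installed as needed: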
cuDNN v7.3.1 Runtime Library for Ubuntu16.04 (Deb)
cuDNN v7.3.1 Developer Library for Ubuntu16.04 (Deb)
cuDNN v7.3.1 Code Samples and User Guide for Ubuntu16.04 (Deb)
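(3) Importing TensorFlow fails with ImportError: /usr/local/cuda-9.0/lib64/libcudnn.so.7: file too short
This means the cuDNN shared-library link is broken.
First, change to /usr/local/cuda/lib64 (assuming the cuda symlink currently points at cuda-9.0) and run
sudo rm libcudnn.so.7 libcudnn.so.7.3.1
Then change to the lib64 directory inside the extracted cuDNN 7.3.1.20 archive (it extracts to a directory named cuda) and run
sudo cp libcudnn.so.7.3.1 /usr/local/cuda/lib64
Finally, return to /usr/local/cuda/lib64 and run
sudo ln -s libcudnn.so.7.3.1 libcudnn.so.7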
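The three repair steps can be sketched as a single function. This is only an illustration under the assumptions above: the name relink_cudnn and its two arguments are hypothetical, and on a real system the function would have to run with root privileges because the target directory sits under /usr/local.

```shell
# Hypothetical helper: replace a broken libcudnn.so.7 in a library directory
# with a fresh copy of the real library plus a symlink to it.
# Usage: relink_cudnn cuda/lib64/libcudnn.so.7.3.1 /usr/local/cuda/lib64
relink_cudnn() {
    local src="$1"       # real library file from the extracted archive
    local libdir="$2"    # directory that holds the broken files
    local name
    name="$(basename "$src")"

    # Step 1: remove the broken link and the old library copy.
    rm -f "$libdir/libcudnn.so.7" "$libdir/$name"
    # Step 2: copy the real library back in.
    cp "$src" "$libdir/$name"
    # Step 3: recreate the versioned link next to it.
    ln -s "$name" "$libdir/libcudnn.so.7"
}
```

Afterwards ls -l in the library directory should show libcudnn.so.7 as a symlink to libcudnn.so.7.3.1 rather than as a regular file.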