[Deep Learning] Environment Setup: Configuring an NVIDIA 2080Ti GPU (20190417)

1. Installing the GPU driver

1.1 Check whether a GPU driver is already installed

nvidia-smi

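nvidia-settings is the GPU settings tool.
If a driver is already installed, remove it first: sudo apt-get remove --purge 'nvidia*'
If none is installed, there is nothing to remove.
1.2 Download the GPU driver installer (.run format)
1.3 Disable Secure Boot
Set Secure Boot to disabled in the firmware settings. If the Secure Boot option is greyed out and cannot be disabled, see the blog post below and retry a few times. This step is a real pitfall and takes some time.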

https://blog.csdn.net/qq_20492405/article/details/79034430
1.4 Disable nouveau
(a) Open the configuration file for editing: sudo gedit /etc/modprobe.d/blacklist.conf
(b) Add blacklist nouveau as the last line
(c) Apply the change: sudo update-initramfs -u
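Steps (a) and (b) can also be done without an editor. The following is a minimal sketch, not part of the original steps: blacklist_nouveau is a hypothetical helper name, and it appends the line only when the file does not already contain it, so running it twice is harmless.

```shell
# Hypothetical helper: append "blacklist nouveau" to the given modprobe
# config file unless an identical line is already present.
blacklist_nouveau() {
    conf_file="$1"
    if ! grep -qx 'blacklist nouveau' "$conf_file" 2>/dev/null; then
        echo 'blacklist nouveau' >> "$conf_file"
    fi
}

# Try it on a scratch file first; the real target from step (a) would be
# /etc/modprobe.d/blacklist.conf, which needs root to modify.
scratch_file=$(mktemp)
blacklist_nouveau "$scratch_file"
blacklist_nouveau "$scratch_file"   # second call adds nothing
cat "$scratch_file"
```

On the real file the helper has to run as root, and the change still needs sudo update-initramfs -u before it takes effect.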
1.5 Reboot
reboot
1.6 Check whether nouveau is still loaded

lsmod | grep nouveau        # no output means nouveau has been disabled
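The same check can be wrapped so that it prints an explicit answer instead of relying on empty output. This is only a sketch: nouveau_loaded is a hypothetical name, and it assumes the usual lsmod layout in which each line starts with the module name.

```shell
# Hypothetical helper: succeeds (exit status 0) when lsmod-style text on
# stdin contains a line that starts with the module name "nouveau".
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Only query the running kernel when lsmod is available on this machine.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | nouveau_loaded; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```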

1.7 Stop the graphical desktop

sudo telinit 3              # switch to runlevel 3

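Once the screen goes black, press Ctrl+Alt+F1 to switch to a text console.
1.8 Install the driver
cd to the directory that contains the driver file.
(a) Make the driver file executable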

sudo chmod a+x NVIDIA-Linux-x86_64-418.56.run

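(b) Run the installer

sudo sh ./NVIDIA-Linux-x86_64-418.56.run --no-opengl-files

--no-opengl-files: install only the driver and skip the OpenGL files, which prevents the login loop.
PS: if gcc is missing during installation, install it as the prompt instructs.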
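After the installer finishes, nvidia-smi from step 1.1 can confirm that the driver is visible. The sketch below is an illustration rather than part of the original procedure: report_driver is a hypothetical name, and it falls back to a message when nvidia-smi is not on the PATH.

```shell
# Hypothetical helper: print GPU name and driver version when nvidia-smi
# is available, otherwise say that the tool was not found.
report_driver() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
    else
        echo "nvidia-smi not found: the driver does not appear to be installed"
    fi
}

report_driver
```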

2. Installing CUDA

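Download the installer from the official NVIDIA site.
Once the download finishes, copy the file to a directory of your choice.
2.1 Stop the display manager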

sudo service lightdm stop                     # then press Ctrl+Alt+F1

2.2 Make the installer executable and run it

sudo chmod a+x cuda_10.1.105_418.39_linux.run
sudo sh cuda_10.1.105_418.39_linux.run

When the license agreement appears, type accept.
Then answer yes to the prompts to install.
2.3 Restart the display manager

sudo service lightdm start

2.4 Test

cd /usr/local/cuda-10.1/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery

If the output ends with Result = PASS, the installation succeeded.
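Because deviceQuery prints a lot of text, it can help to test for the final line directly. A minimal sketch, assuming the output contains the line Result = PASS on success; device_query_passed is a hypothetical name.

```shell
# Hypothetical helper: succeeds when the text on stdin contains the
# "Result = PASS" line that deviceQuery prints on success.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Example with a captured line; on a real machine pipe ./deviceQuery into
# the helper instead of printf.
if printf 'Result = PASS\n' | device_query_passed; then
    echo "CUDA install looks good"
fi
```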
2.5 Environment variables

sudo gedit /etc/profile

Append the following at the end:

export PATH=/usr/local/cuda-10.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH

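2.6 Edit the shell configuration file

sudo gedit ~/.bashrc

Append the following at the end (the paths must match the installed version, here cuda-10.1):

export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-10.1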

Then run:

source ~/.bashrc
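The ${PATH:+:${PATH}} form used in the export lines expands to a colon plus the old value only when the variable is already set and non-empty, so an empty LD_LIBRARY_PATH does not leave a stray trailing colon. A small sketch of that behaviour, with prepend_path as a hypothetical helper name:

```shell
# Hypothetical helper: print "<dir>" when the existing value is empty,
# or "<dir>:<existing>" otherwise, using the same ${var:+...} expansion.
prepend_path() {
    new_dir="$1"
    existing="$2"
    printf '%s\n' "${new_dir}${existing:+:${existing}}"
}

prepend_path /usr/local/cuda-10.1/lib64 ""               # prints /usr/local/cuda-10.1/lib64
prepend_path /usr/local/cuda-10.1/bin "/usr/bin:/bin"    # prints /usr/local/cuda-10.1/bin:/usr/bin:/bin
```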


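The original blog post has a step that creates symbolic links, but I could not get it to work. Checking the version at the end still worked.
2.7 Check the version
nvcc --version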

(base) li@li-System-Product-Name:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
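If a script needs only the release number from this output, it can be extracted with sed. This is a sketch that assumes the "release X.Y," wording shown in the output above; cuda_release is a hypothetical name.

```shell
# Hypothetical helper: read `nvcc --version` text on stdin and print only
# the release number from the "Cuda compilation tools" line.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Example with the line from the output above; on a real machine use
# `nvcc --version | cuda_release`.
printf 'Cuda compilation tools, release 10.1, V10.1.105\n' | cuda_release    # prints 10.1
```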

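Reboot the computer.
PS: I later found the official installation guide. It is an excellent reference, and I recommend reading it directly (registration required):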
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html

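3. Installing cuDNN (optional)

cuDNN is a numerical computing library optimized for convolutional neural network models. It can speed up convolutional neural network computation by 2 to 3 times, so I strongly recommend installing it.
For the installation itself, do not follow the various Chinese-language tutorials. They contain too many pitfalls (I tried them). Follow the official guide (registration required).
cuDNN download link:
Installation guide: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#install-linux
The relevant part of the official guide is copied below: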

2.2. Downloading cuDNN
In order to download cuDNN, ensure you are registered for the NVIDIA Developer Program.
// You must register before downloading.
Go to: NVIDIA cuDNN home page. Click Download. Complete the short survey and click Submit. Accept the Terms and Conditions. A list of available download versions of cuDNN displays.
Select the cuDNN version you want to install. A list of available resources displays.
// Choose a version to download. I picked "cuDNN Library for Linux" directly, because that package has the .tgz suffix and works on any Linux distribution.
2.3. Installing cuDNN on Linux
The following steps describe how to build a cuDNN dependent program.
Choose the installation method that meets your environment needs. // pick the method that suits your system
For example, the tar file installation applies to all Linux platforms, and the debian installation package applies to Ubuntu 14.04 and 16.04. // the tarball works on every Linux system, the .deb packages are for Ubuntu 14.04 and 16.04

In the following sections:
your CUDA directory path is referred to as /usr/local/cuda/
your cuDNN download path is referred to as <cudnnpath>
// The CUDA path is /usr… ; the cuDNN download path depends on where you saved the file, and <cudnnpath> stands in for it.
2.3.1. Installing from a Tar File
Navigate to your directory containing the cuDNN Tar file.
Unzip the cuDNN package. // cd to your download location and extract the archive
$ tar -xzvf cudnn-9.0-linux-x64-v7.tgz
Copy the following files into the CUDA Toolkit directory, and change the file permissions.
// Copy the files below into the CUDA Toolkit directory and change their access permissions.
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include    # copy cudnn.h into the CUDA Toolkit directory
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64    # copy all libcudnn files into CUDA's lib64
$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*    # make the files readable by all users
2.3.2. Installing from a Debian File
Navigate to your directory containing the cuDNN Debian file.
// Download the other three .deb packages, then install them.
Install the runtime library, for example:
sudo dpkg -i libcudnn7_7.0.3.11-1+cuda9.0_amd64.deb
Install the developer library, for example:
sudo dpkg -i libcudnn7-dev_7.0.3.11-1+cuda9.0_amd64.deb
Install the code samples and the cuDNN Library User Guide, for example:
sudo dpkg -i libcudnn7-doc_7.0.3.11-1+cuda9.0_amd64.deb
2.4. Verifying
To verify that cuDNN is installed and is running properly, compile the mnistCUDNN sample located in the /usr/src/cudnn_samples_v7 directory in the debian file.
Copy the cuDNN sample to a writable path.
$ cp -r /usr/src/cudnn_samples_v7/ $HOME
Go to the writable path.
$ cd $HOME/cudnn_samples_v7/mnistCUDNN
Compile the mnistCUDNN sample.
$ make clean && make
Run the mnistCUDNN sample.
$ ./mnistCUDNN
If cuDNN is properly installed and running on your Linux system, you will see a message similar to the following:
Test passed! // installation succeeded
2.5. Upgrading from v6 to v7
cuDNN v7 can coexist with previous versions of cuDNN, such as v5 or v6.
2.6. Troubleshooting
Join the NVIDIA Developer Forum to post questions and follow discussions.


My actual installation steps:

3.1 Find where CUDA is installed

(base) li@li-System-Product-Name:~$ which nvcc
/usr/local/cuda-10.1/bin/nvcc
(base) li@li-System-Product-Name:~$ cd /usr/local
(base) li@li-System-Product-Name:/usr/local$ ls
bin  cuda  cuda-10.1  etc  games  include  lib  man  sbin  share  src
(base) li@li-System-Product-Name:/usr/local/cuda-10.1$ ls
bin       include    libnvvp               nvml     src
doc       jre        NsightCompute-2019.1  nvvm     targets
EULA.txt  lib64      nsightee_plugins      samples  tools
extras    libnsight  NsightSystems-2018.3  share    version.txt
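The CUDA root directory can also be derived from the nvcc path instead of browsing /usr/local by hand. A minimal sketch, assuming nvcc lives in a bin directory directly under the CUDA root as in the output above; cuda_root_from_nvcc is a hypothetical name.

```shell
# Hypothetical helper: strip the trailing /bin/nvcc from an nvcc path to
# obtain the CUDA installation directory.
cuda_root_from_nvcc() {
    nvcc_path="$1"
    dirname "$(dirname "$nvcc_path")"
}

cuda_root_from_nvcc /usr/local/cuda-10.1/bin/nvcc    # prints /usr/local/cuda-10.1
```

On a machine with CUDA installed, cuda_root_from_nvcc "$(which nvcc)" gives the directory to copy the cuDNN files into.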

3.2 Download cuDNN

(base) li@li-System-Product-Name:/$ cd ~/Downloads
Extract: tar -xzvf cudnn-10.1-linux-x64-v7.tgz
(base) li@li-System-Product-Name:~/Downloads$ ls
Anaconda3-2019.03-Linux-x86_64.sh  cuda_10.1.105_418.39_linux.run      flash_player_ppapi_linux.x86_64.tar.gz
**cuda**  cudnn-10.1-linux-x64-v7.5.0.56.deb  NVIDIA-Linux-x86_64-418.56.run
// the extracted directory is named cuda

3.3 Copy the files

(base) li@li-System-Product-Name:~/Downloads$ sudo cp cuda/include/cudnn.h /usr/local/cuda-10.1/include
(base) li@li-System-Product-Name:~/Downloads$ sudo chmod a+r /usr/local/cuda-10.1/include/cudnn.h /usr/local/cuda-10.1/lib64/libcudnn*
(base) li@li-System-Product-Name:~/Downloads$ cd /usr/local/cuda-10.1/lib64/
Set permissions: `$ sudo chmod a+r /usr/local/cuda-10.1/include/cudnn.h /usr/local/cuda-10.1/lib64/libcudnn*` (pay attention to whether you copied into cuda or into cuda-10.1)
(base) li@li-System-Product-Name:/usr/local/cuda-10.1/lib64$ ls


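3.4 Create the symbolic links

(Note: I do not know whether this step is required. The Chinese tutorial I read first included it, so I did it, but the official guide does not have it.)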

cd /usr/local/cuda-10.1/lib64/
sudo ln -sf libcudnn.so.7.5.0 libcudnn.so.7
sudo ln -sf libcudnn.so.7 libcudnn.so
Make the links take effect:
sudo ldconfig
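To see what the two ln -sf commands produce without touching the real CUDA directory, the same chain can be built in a scratch directory. This is only a sketch: link_cudnn is a hypothetical name, and the empty file stands in for the real library.

```shell
# Hypothetical helper: inside lib_dir, create
#   libcudnn.so.<major> -> libcudnn.so.<full_version>
#   libcudnn.so         -> libcudnn.so.<major>
link_cudnn() {
    lib_dir="$1"
    full_version="$2"                 # e.g. 7.5.0
    major="${full_version%%.*}"       # e.g. 7
    (
        cd "$lib_dir" || exit 1
        ln -sf "libcudnn.so.${full_version}" "libcudnn.so.${major}"
        ln -sf "libcudnn.so.${major}" libcudnn.so
    )
}

demo_dir=$(mktemp -d)
touch "$demo_dir/libcudnn.so.7.5.0"   # empty stand-in for the real library
link_cudnn "$demo_dir" 7.5.0
ls -l "$demo_dir"
```

The listing should show libcudnn.so pointing at libcudnn.so.7, which in turn points at libcudnn.so.7.5.0.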

3.5 Verification

A pitfall:

(base) li@li-System-Product-Name:/usr/local/cuda-10.1/lib64$ cp -r /usr/src/cudnn_samples_v7/ $HOME
cp: cannot stat '/usr/src/cudnn_samples_v7/': No such file or directory
(base) li@li-System-Product-Name:/usr/local/cuda-10.1/lib64$ cp -r /usr/src/cudnn_samples_v7 $HOME
cp: cannot stat '/usr/src/cudnn_samples_v7': No such file or directory

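Skip this for now and install the downloaded .deb files first, since the samples directory /usr/src/cudnn_samples_v7 comes from the .deb packages: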

(base) li@li-System-Product-Name:~/Downloads$ sudo dpkg -i libcudnn7_7.5.0.56-1+cuda10.1_amd64.deb
Selecting previously unselected package libcudnn7.
(Reading database ... 213798 files and directories currently installed.)
Preparing to unpack libcudnn7_7.5.0.56-1+cuda10.1_amd64.deb ...
Unpacking libcudnn7 (7.5.0.56-1+cuda10.1) ...
Setting up libcudnn7 (7.5.0.56-1+cuda10.1) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
(base) li@li-System-Product-Name:~/Downloads$ sudo dpkg -i libcudnn7-doc_7.5.0.56-1+cuda10.1_amd64.deb
Selecting previously unselected package libcudnn7-doc.
(Reading database ... 213810 files and directories currently installed.)
Preparing to unpack libcudnn7-doc_7.5.0.56-1+cuda10.1_amd64.deb ...
Unpacking libcudnn7-doc (7.5.0.56-1+cuda10.1) ...
Setting up libcudnn7-doc (7.5.0.56-1+cuda10.1) ...
(base) li@li-System-Product-Name:~/Downloads$ sudo dpkg -i libcudnn7-dev_7.5.0.56-1+cuda10.1_amd64.deb
Selecting previously unselected package libcudnn7-dev.
(Reading database ... 213804 files and directories currently installed.)
Preparing to unpack libcudnn7-dev_7.5.0.56-1+cuda10.1_amd64.deb ...
Unpacking libcudnn7-dev (7.5.0.56-1+cuda10.1) ...
Setting up libcudnn7-dev (7.5.0.56-1+cuda10.1) ...

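Verification

A small pitfall: the first cd below fails because the directory name is misspelled (cudnn_sample_v7 instead of cudnn_samples_v7).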
(base) li@li-System-Product-Name:~/Downloads$ sudo cp -r /usr/src/cudnn_samples_v7/ /home/wdong/
(base) li@li-System-Product-Name:~/Downloads$ cd /home/wdong/cudnn_sample_v7/mnistCUDNN
bash: cd: /home/wdong/cudnn_sample_v7/mnistCUDNN: No such file or directory
(base) li@li-System-Product-Name:/home/wdong/mnistCUDNN$ cp -r /usr/src/cudnn_samples_v7/ $HOME
(base) li@li-System-Product-Name:/home/wdong/mnistCUDNN$ cd $HOME/cudnn_samples_v7/mnistCUDNN
(base) li@li-System-Product-Name:~/cudnn_samples_v7/mnistCUDNN$ make clean && make
rm -rf *o
rm -rf mnistCUDNN
Linking agains cublasLt = true
CUDA VERSION: 10010
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 30 35 50 53 60 61 62 70 72 75
/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/include -IFreeImage/include  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o fp16_dev.o -c fp16_dev.cu
g++ -I/usr/local/cuda/include -IFreeImage/include   -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -IFreeImage/include   -o mnistCUDNN.o -c mnistCUDNN.cpp
/usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_75,code=compute_75 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -IFreeImage/include  -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
**(base) li@li-System-Product-Name:~/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN**
cudnnGetVersion() : 7500 , CUDNN_VERSION from cudnn.h : 7500 (7.5.0)
Host compiler version : GCC 7.3.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 68  Capabilities 7.5, SmClock 1635.0 Mhz, MemSize (Mb) 10986, MemClock 7000.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 0
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.015712 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.037600 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.041728 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.043904 time requiring 203008 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.061536 time requiring 2057744 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!
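Besides running the sample, the installed version can be read straight from the header, which is where the "CUDNN_VERSION from cudnn.h" value in the output above comes from. A sketch under the assumption that the cuDNN 7.x cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL on separate #define lines; cudnn_header_version is a hypothetical name.

```shell
# Hypothetical helper: print major.minor.patch from the version macros
# in a cudnn.h-style header file.
cudnn_header_version() {
    header="$1"
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" { print $3 }' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" { print $3 }' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { print $3 }' "$header")
    echo "${major}.${minor}.${patch}"
}

# Example on a stand-in header; the real file here would be
# /usr/local/cuda-10.1/include/cudnn.h.
sample_header=$(mktemp)
printf '#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 5\n#define CUDNN_PATCHLEVEL 0\n' > "$sample_header"
cudnn_header_version "$sample_header"    # prints 7.5.0
```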


References

https://zhuanlan.zhihu.com/p/35509593
