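1. Install the graphics driver
My graphics card is a GTX 1080. Go to the official site, http://www.geforce.cn/drivers, select the driver that matches your own card model, and download it. The download is a .run file.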
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/418.56/NVIDIA-Linux-x86_64-418.56.run
- Install the build environment:
# Install epel-release first: dkms comes from the EPEL repository
yum -y install epel-release
yum -y install 'gcc*' kernel-devel dkms
- Edit the grub file:
vim /etc/default/grub
Append the following to the value of GRUB_CMDLINE_LINUX:
rd.driver.blacklist=nouveau nouveau.modeset=0
- Regenerate the grub configuration:
grub2-mkconfig -o /boot/grub2/grub.cfg
- Create the blacklist:
vim /etc/modprobe.d/blacklist.conf
Add:
blacklist nouveau
- Back up the current initramfs and rebuild it so the change takes effect:
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img
dracut /boot/initramfs-$(uname -r).img $(uname -r)
- Reboot:
reboot
- Confirm that nouveau is disabled (the command should print nothing):
lsmod | grep nouveau
- Install the graphics driver:
# Note: change the kernel version to the one installed on your system
sh NVIDIA-Linux-x86_64-418.56.run --kernel-source-path=/usr/src/kernels/3.10.0-957.10.1.el7.x86_64
- Verify:
[root@t8t software]# nvidia-smi
Wed Apr 24 16:07:20 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0 Off |                  N/A |
| 23%   46C    P5    25W / 198W |      0MiB /  8119MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
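The scriptable parts of the driver steps above (adding the kernel arguments to GRUB_CMDLINE_LINUX, checking lsmod for nouveau, and building the --kernel-source-path value from the running kernel) can be sketched in Python. This is only an illustration under stated assumptions: the helper names add_kernel_args, nouveau_loaded, and kernel_source_path are hypothetical, the functions work on text you pass in rather than editing system files, and the source path assumes the CentOS layout /usr/src/kernels/<release>.

```python
import platform
import re

# Kernel arguments from the grub step above.
NOUVEAU_ARGS = ["rd.driver.blacklist=nouveau", "nouveau.modeset=0"]


def add_kernel_args(grub_text, args=NOUVEAU_ARGS):
    """Return grub_text with missing args appended to GRUB_CMDLINE_LINUX."""
    out = []
    for line in grub_text.splitlines():
        # Matches GRUB_CMDLINE_LINUX="..." but not GRUB_CMDLINE_LINUX_DEFAULT.
        m = re.match(r'^(GRUB_CMDLINE_LINUX=")(.*)"\s*$', line)
        if m:
            current = m.group(2).split()
            merged = current + [a for a in args if a not in current]
            line = m.group(1) + " ".join(merged) + '"'
        out.append(line)
    return "\n".join(out) + "\n"


def nouveau_loaded(lsmod_text):
    """True if a module named exactly 'nouveau' appears in lsmod output."""
    return any(
        line.split()[0] == "nouveau"
        for line in lsmod_text.splitlines()
        if line.strip()
    )


def kernel_source_path(release=None):
    """Build the --kernel-source-path value (assumed CentOS layout)."""
    release = release or platform.release()
    return "/usr/src/kernels/" + release
```

Calling kernel_source_path() with no argument uses the running kernel's release string, which avoids typing the version by hand; the directory only exists if the matching kernel-devel package is installed.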
2. Install tensorflow-gpu
pip install tensorflow-gpu
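I installed version 1.13.1 here; to reproduce this setup, pin it with pip install tensorflow-gpu==1.13.1.
Choose the Bazel, CUDA, and cuDNN versions that match the installed TensorFlow version. Bazel is only needed if you build TensorFlow from source.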
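Matching versions is easier with a small lookup table. The sketch below records two rows as I recall them from the "tested build configurations" table on the TensorFlow site; treat the values as assumptions and verify them there before relying on them. TESTED_CONFIGS and required_versions are hypothetical names used only for illustration.

```python
# Rows recalled from TensorFlow's tested build configurations (GPU, Linux).
# Assumption: verify against the official table before relying on them.
TESTED_CONFIGS = {
    "1.13.1": {"bazel": "0.19.2", "cudnn": "7.4", "cuda": "10.0"},
    "1.12.0": {"bazel": "0.15.0", "cudnn": "7", "cuda": "9"},
}


def required_versions(tf_version):
    """Return the Bazel/cuDNN/CUDA versions recorded for a tensorflow-gpu release."""
    if tf_version not in TESTED_CONFIGS:
        raise KeyError(
            "no entry for tensorflow-gpu %s; check the tested build "
            "configurations table" % tf_version
        )
    return TESTED_CONFIGS[tf_version]


if __name__ == "__main__":
    print(required_versions("1.13.1"))
```

Note that the "CUDA Version" shown by nvidia-smi is the highest version the driver supports, not the toolkit version TensorFlow was built against, so the two numbers do not have to be equal.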
3. Install Bazel
See the official documentation: https://docs.bazel.build/versions/master/install-redhat.html#installing-menu
cd /etc/yum.repos.d/
wget https://copr.fedorainfracloud.org/coprs/vbatts/bazel/repo/epel-7/vbatts-bazel-epel-7.repo
yum install bazel
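4. Install CUDA
First pick the matching version: https://developer.nvidia.com/cuda-toolkit-archive
Select the options for your system to get the download link (you can find it in the browser's developer tools, or copy it directly from the Download button).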
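# The URL has no .rpm suffix, so name the output file explicitly for the rpm command below
wget -O cuda-repo-rhel7-10-0-local-10.0.130-410.48-1.0-1.x86_64.rpm https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-rhel7-10-0-local-10.0.130-410.48-1.0-1.x86_64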
sudo rpm -i cuda-repo-rhel7-10-0-local-10.0.130-410.48-1.0-1.x86_64.rpm
sudo yum clean all
sudo yum install cuda
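5. Install cuDNN
Downloading cuDNN requires logging in, so register an account if you do not have one. On the official site (https://developer.nvidia.com/rdp/cudnn-archive), find the matching file, download it to your local machine, and then copy it to the CentOS system with a file transfer tool.
Then extract it and copy the files to the expected paths: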
tar -xzvf cudnn-10.0-linux-x64-v7.4.2.24.tgz
cp cuda/include/cudnn.h /usr/local/cuda/include/
cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
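To confirm which cuDNN version ended up under /usr/local/cuda, the version macros in the copied header can be read back. A minimal sketch, assuming the cuDNN 7.x layout in which cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL; cudnn_version is a hypothetical helper name.

```python
import os
import re


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn.h text, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        m = re.search(
            r"^#define\s+%s\s+(\d+)" % name, header_text, flags=re.MULTILINE
        )
        if m is None:
            return None  # macro missing: not a cuDNN 7.x style header
        parts.append(int(m.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    header = "/usr/local/cuda/include/cudnn.h"  # path used in the cp step above
    if os.path.exists(header):
        with open(header) as f:
            print(cudnn_version(f.read()))
```

The printed version should agree with the version in the archive file name; if it does not, an older header is probably still in place.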
6. Add environment variables
vim /etc/profile
# Add the environment variables
export PATH=$PATH:/usr/local/anaconda3/bin:/usr/local/cuda/bin
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
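After logging in again (or sourcing /etc/profile), it is worth checking that the variables actually reached the environment. A minimal sketch, assuming the /usr/local/cuda prefix used above; missing_cuda_paths is a hypothetical helper name, and it takes a dict so it can be run against os.environ or against test data.

```python
import os


def missing_cuda_paths(env, cuda_home="/usr/local/cuda"):
    """Return the names of variables that lack the expected CUDA entry."""
    expected = {
        "PATH": cuda_home + "/bin",
        "LD_LIBRARY_PATH": cuda_home + "/lib64",
    }
    missing = []
    for name, entry in expected.items():
        # Both variables are colon-separated lists on Linux.
        if entry not in env.get(name, "").split(":"):
            missing.append(name)
    if env.get("CUDA_HOME") != cuda_home:
        missing.append("CUDA_HOME")
    return missing


if __name__ == "__main__":
    print(missing_cuda_paths(dict(os.environ)))
```

An empty list means all three variables are set as in the exports above.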
7. Verify
(gpu) [root@t8t ~]# python
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.is_built_with_cuda()
True
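True means this TensorFlow build was compiled with CUDA support. To confirm that TensorFlow can actually use the GPU at runtime, also call tf.test.is_gpu_available(), which should return True as well.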
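A minimal sketch that runs the build-time check and a runtime check in one script, assuming the TensorFlow 1.x API (tf.test.is_built_with_cuda and tf.test.is_gpu_available); gpu_status is a hypothetical helper name, and it returns None when TensorFlow is not installed in the active environment.

```python
def gpu_status():
    """Report CUDA build support and runtime GPU visibility, or None."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this environment
    return {
        # True if the installed binary was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # True if TensorFlow can see a usable GPU device at runtime.
        "gpu_available": tf.test.is_gpu_available(),
    }


if __name__ == "__main__":
    print(gpu_status())
```

If built_with_cuda is True but gpu_available is False, the usual suspects are the driver, the CUDA/cuDNN versions, or LD_LIBRARY_PATH.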