There are two ways to do this.
Before either one, first run:
sudo apt-get install libc6-dev build-essential
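In my own testing, skipping this step makes a manual NVIDIA driver install fail with an error.
Method 1: using apt (more convenient and simpler)
This follows the installation instructions on the TensorFlow website: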
Ubuntu 18.04 (CUDA 10.1)
# Add NVIDIA package repositories
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
# Install NVIDIA driver
sudo apt-get install --no-install-recommends nvidia-driver-418
# Reboot. Check that GPUs are visible using the command: nvidia-smi
# Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
cuda-10-1 \
libcudnn7=7.6.4.38-1+cuda10.1 \
libcudnn7-dev=7.6.4.38-1+cuda10.1
# Install TensorRT. Requires that libcudnn7 is installed above.
sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
libnvinfer-dev=6.0.1-1+cuda10.1 \
libnvinfer-plugin6=6.0.1-1+cuda10.1
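However, following these steps ran into several problems:
1. When installing the NVIDIA driver, version 430 ended up installed even though version 418 was requested; 430 corresponds to CUDA 10.2. (This may not directly affect the later steps.)
2. Installing CUDA failed with either an "unmet dependencies" error or a "package is damaged" error, at which point the NVIDIA driver has to be removed.
3. nvcc did not work.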
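To narrow down problems like the three above, it can help to record which driver package apt actually installed and whether `nvidia-smi` and `nvcc` are reachable at all. The following is only a sketch: the helper name `check_tool` is made up for illustration, and the `/usr/local/cuda*` location is an assumption based on the paths used later in this post.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports where a command lives on PATH,
# or says that it is missing and returns a non-zero status.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: %s\n' "$1" "$(command -v "$1")"
  else
    printf '%s: not found on PATH\n' "$1"
    return 1
  fi
}

check_tool nvidia-smi || true
check_tool nvcc || true

# nvcc may be installed but simply not on PATH (assumed install prefix).
ls /usr/local/cuda*/bin/nvcc 2>/dev/null || echo "no nvcc under /usr/local/cuda*"

# Show which driver packages apt really installed (e.g. 418 vs 430).
if command -v dpkg >/dev/null 2>&1; then
  dpkg -l | grep -i 'nvidia-driver' || echo "no nvidia-driver package installed"
fi
```

If `nvcc` turns up under `/usr/local/cuda*/bin` but is reported as not found on PATH, the PATH export in the complete procedure is the likely fix.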
Solution: see
https://www.pugetsystems.com/labs/hpc/How-To-Install-CUDA-10-together-with-9-2-on-Ubuntu-18-04-with-support-for-NVIDIA-20XX-Turing-GPUs-1236/
The complete installation procedure:
sudo apt-get install libc6-dev build-essential
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
sudo apt-get install --no-install-recommends \
cuda-10-0 \
libcudnn7=7.6.4.38-1+cuda10.0 \
libcudnn7-dev=7.6.4.38-1+cuda10.0
vi ~/.bashrc
# Append the following lines to ~/.bashrc:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
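(Swap in a different deb package for the CUDA version you need; this example uses CUDA 10.0.)
The procedure above installs the graphics driver for you automatically.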
If you would rather install the graphics driver yourself, remember to reboot after installing 3.
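The `${VAR:+:${VAR}}` expansion in the export lines above adds the `:` separator only when the variable already has a value. There should be no literal colon in front of it, otherwise an empty entry ends up in the list, and an empty PATH entry is treated as the current directory. A minimal sketch of the same idea, where the function name `prepend_dir` is made up for illustration:

```shell
#!/bin/sh
# prepend_dir is a hypothetical helper.
# $1 = directory to put first, $2 = current value of the variable (may be empty).
# The ":" is emitted only when $2 is non-empty, so no empty entry is created.
prepend_dir() {
  printf '%s\n' "$1${2:+:$2}"
}

prepend_dir /usr/local/cuda/bin ""               # prints /usr/local/cuda/bin
prepend_dir /usr/local/cuda/bin "/usr/bin:/bin"  # prints /usr/local/cuda/bin:/usr/bin:/bin
```

After editing ~/.bashrc, open a new terminal or run `source ~/.bashrc` so the exports take effect; `nvcc --version` should then report the installed CUDA release.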
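Method 2
Download and install the NVIDIA driver, CUDA, and cuDNN manually.
Note: it is best to install in the order CUDA -> cuDNN -> driver, because sometimes CUDA fails to install no matter how new the driver is.
Driver download:
NVIDIA driver download page: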
https://www.nvidia.com/Download/index.aspx
or
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-418
Check the CUDA version:
cat /usr/local/cuda/version.txt
Check the cuDNN version:
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
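The grep above prints the raw `#define` lines. As a sketch, the three numbers can be joined into a single version string. The helper name `cudnn_version` is made up for illustration, and the extra header locations are assumptions: an apt install may place the header under /usr/include, and newer cuDNN releases may keep these macros in `cudnn_version.h` instead.

```shell
#!/bin/sh
# cudnn_version is a hypothetical helper: it prints MAJOR.MINOR.PATCHLEVEL
# taken from the #define lines of the header file passed as $1.
# It returns non-zero if the header does not define CUDNN_MAJOR.
cudnn_version() {
  awk '
    $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
    $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
    $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
    END {
      if (major == "") exit 1
      print major "." minor "." patch
    }
  ' "$1"
}

# Try the path used in this post first, then the assumed alternatives.
for header in /usr/local/cuda/include/cudnn.h \
              /usr/local/cuda/include/cudnn_version.h \
              /usr/include/cudnn.h; do
  if [ -f "$header" ] && cudnn_version "$header"; then
    break
  fi
done
```

For the cuDNN package pinned earlier in this post (7.6.4.38), the output would be expected to start with 7.6.4.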
Remove the NVIDIA driver:
dpkg -l | grep -i nvidia
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get purge 'nvidia*'
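After purging, the `dpkg -l` listing can be run again to confirm that nothing is left. The sketch below filters that listing down to packages that are still fully installed (status `ii`); the function name `installed_nvidia_packages` is made up for illustration. Because it matches any package name containing "nvidia", packages such as libnvidia-* would also show up if they are still present.

```shell
#!/bin/sh
# installed_nvidia_packages is a hypothetical helper: it reads `dpkg -l` output
# on stdin and prints the names of packages whose status is "ii" (installed)
# and whose name contains "nvidia".
installed_nvidia_packages() {
  awk '$1 == "ii" && tolower($2) ~ /nvidia/ { print $2 }'
}

if command -v dpkg >/dev/null 2>&1; then
  remaining=$(dpkg -l | installed_nvidia_packages)
  if [ -n "$remaining" ]; then
    printf 'still installed:\n%s\n' "$remaining"
  else
    echo "no NVIDIA packages remain installed"
  fi
fi
```

Entries with status `rc` in the raw listing are packages that were removed but still have configuration files; purging them clears those as well.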