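I have been using PaddleOCR recently and need it to recognize a few special symbols. I only have two machines: one with a single 1080 Ti (Windows 11) and one with dual 1080 Ti cards (Ubuntu 22.04). Out of habit I chased the newest release and went with CUDA 11.7, the highest version PaddlePaddle supports. In practice CUDA 10 is enough for a 1080 Ti, and the later versions bring no noticeable performance improvement.
Installation on Windows is a no-brainer. On Linux it is more involved, so I am recording the process here.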
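CUDA and cuDNN depend on the NVIDIA driver and on the kernel. The minimum driver version for CUDA 11.7 is 450.80 (listed as 450.80.02 for CUDA 11.x on Linux). For details see https://docs.nvidia.com/deeplearning/cudnn/support-matrix/index.html#cudnn-versions-linux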
Note: for an offline install, you need to register an NVIDIA developer account before you can download the corresponding packages.
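The driver requirement mentioned above can be checked with a small version comparison. This is a minimal sketch and not part of the original steps: the function names `version_ge` and `check_driver` are made up for illustration, the minimum of 450.80.02 is an assumption taken from the support matrix, and `sort -V` assumes GNU coreutils.

```shell
# version_ge A B: succeed (exit 0) when dotted version A >= B.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum Linux driver for CUDA 11.x (check the support matrix).
MIN_DRIVER="450.80.02"

# check_driver VERSION: print "ok" if VERSION meets MIN_DRIVER, else "too old".
check_driver() {
    if version_ge "$1" "$MIN_DRIVER"; then
        echo "ok"
    else
        echo "too old"
    fi
}
```

Once a driver is installed, something like `check_driver "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"` should print `ok`.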
-
Clean up any leftover NVIDIA drivers
sudo apt autoremove -y 'nvidia*' --purge
sudo rm -f /etc/apt/sources.list.d/cuda*
sudo apt-get autoremove && sudo apt-get autoclean
sudo rm -rf /usr/local/cuda*
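To confirm that the cleanup actually removed everything, the output of `dpkg -l` can be filtered for leftovers. The helper below is only a sketch: `list_leftovers` is a hypothetical name, and the pattern (nvidia, cuda, cudnn) is an assumption about which package names matter here.

```shell
# list_leftovers: read `dpkg -l` output on stdin and print the status and name
# of every package that is still installed (ii) or has leftover config (rc)
# and whose name mentions nvidia, cuda or cudnn.
list_leftovers() {
    awk '$1 ~ /^(ii|rc)$/ && $2 ~ /(nvidia|cuda|cudnn)/ { print $1, $2 }'
}
```

Running `dpkg -l | list_leftovers` after the cleanup should ideally print nothing. Entries marked `rc` can be purged with `sudo apt purge <package>`.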
-
Update the GPU driver
ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
sudo apt install -y nvidia-driver-525
sudo reboot
After rebooting, run
nvidia-smi
to check that the driver installed correctly.
-
Install CUDA 11.7.1: https://developer.nvidia.com/cuda-toolkit-archive https://developer.nvidia.com/cuda-11-7-1-download-archive
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.1/local_installers/cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-7-local_11.7.1-515.65.01-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt -y install cuda-11-7
-
Install cuDNN 8.9.3 for CUDA 11: https://developer.nvidia.com/rdp/cudnn-download
# The download requires an NVIDIA developer login; if wget is refused, fetch the file in a browser instead.
wget https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.3/local_installers/11.x/cudnn-local-repo-ubuntu2204-8.9.3.28_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2204-8.9.3.28_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-8.9.3.28/cudnn-local-7F7A158C-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt -y install libcudnn8=8.9.3.28-1+cuda11.8 libcudnn8-dev=8.9.3.28-1+cuda11.8
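One way to confirm which cuDNN version ended up installed is to read the version macros from its header. The sketch below assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` and that the dev package places it at `/usr/include/cudnn_version.h`; the function name `cudnn_version` is hypothetical, and the path may differ on your system.

```shell
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL parsed from the
# "#define CUDNN_..." lines of the given cudnn_version.h file.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}
```

With the packages above installed, `cudnn_version /usr/include/cudnn_version.h` would be expected to print `8.9.3`.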
-
Install NCCL 2.18.3 for CUDA 11: https://developer.nvidia.com/nccl/nccl-download
wget https://developer.nvidia.com/downloads/compute/machine-learning/nccl/secure/2.18.3/agnostic/x64/nccl_2.18.3-1+cuda11.0_x86_64.txz
tar xvf nccl_2.18.3-1+cuda11.0_x86_64.txz
sudo mv nccl_2.18.3-1+cuda11.0_x86_64 /usr/local/nccl_2.18.3
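Because NCCL is installed here by unpacking an archive rather than through apt, it can help to check that the target directory has the expected layout. This is a rough sketch: `check_nccl_dir` is a hypothetical helper, and the `include/nccl.h` plus `lib/libnccl.so*` layout is an assumption about what the archive contains.

```shell
# check_nccl_dir DIR: verify that DIR looks like an unpacked NCCL tree.
# Prints "ok" on success, or names the missing piece and returns non-zero.
check_nccl_dir() {
    [ -f "$1/include/nccl.h" ] || { echo "missing header"; return 1; }
    ls "$1"/lib/libnccl.so* >/dev/null 2>&1 || { echo "missing library"; return 1; }
    echo "ok"
}
```

For the install location used above, the call would be `check_nccl_dir /usr/local/nccl_2.18.3`.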
-
Install TensorRT 8.6.1 for CUDA 11: https://developer.nvidia.com/nvidia-tensorrt-8x-download
wget https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.1/local_repos/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-11.8_1.0-1_amd64.deb
sudo dpkg -i nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-11.8_1.0-1_amd64.deb
sudo cp /var/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-11.8/nv-tensorrt-local-0628887B-keyring.gpg /usr/share/keyrings/
sudo apt update
sudo apt -y install tensorrt=8.6.1.6-1+cuda11.8
-
Add the paths to your environment variables, or to .bashrc:
export PATH=/usr/local/cuda-11.7/bin:~/.local/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64:/usr/local/nccl_2.18.3/lib:$LD_LIBRARY_PATH
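If .bashrc gets sourced more than once, plain `export` lines like these keep prepending the same directories. A possible alternative is a small helper that only adds a directory when it is not already listed. This is a sketch: `prepend_path` is a made-up name, and it uses `eval` to stay portable across POSIX shells, so it should only be called with trusted variable names.

```shell
# prepend_path VAR DIR: put DIR at the front of the colon-separated
# variable VAR unless DIR is already one of its entries.
# Note: uses plain (non-local) helper variables for POSIX sh portability.
prepend_path() {
    _pp_var="$1"
    _pp_dir="$2"
    eval "_pp_cur=\${$_pp_var-}"          # read the current value of VAR
    case ":$_pp_cur:" in
        *":$_pp_dir:"*) ;;                # already present, nothing to do
        *)
            _pp_new="$_pp_dir${_pp_cur:+:$_pp_cur}"
            eval "export $_pp_var=\"\$_pp_new\""
            ;;
    esac
}
```

In .bashrc this could then be used as `prepend_path PATH /usr/local/cuda-11.7/bin` and `prepend_path LD_LIBRARY_PATH /usr/local/cuda-11.7/lib64`.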
Run
nvcc --version
to check the CUDA version.
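For scripted checks, the release number can be pulled out of the `nvcc --version` output. The sketch below assumes the output contains a line of the form `Cuda compilation tools, release 11.7, V...`; the function name `nvcc_release` is hypothetical.

```shell
# nvcc_release: read `nvcc --version` output on stdin and print the
# MAJOR.MINOR release number found after the word "release".
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}
```

A check such as `[ "$(nvcc --version | nvcc_release)" = "11.7" ] && echo "CUDA 11.7 is active"` then confirms that the PATH points at the intended toolkit.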