Running CUDA 10.2 + TensorFlow 1.15.0 on CentOS 8

  1. The Python version matters: as of 2019-12-04, do not use Python 3.8.0, because TensorFlow cannot be installed on it.
  2. Python 3.6.8 works fine.

# Software installation steps

  1. Install Python 3.6.8: use the version shipped with CentOS 8, or build it from source:
    mkdir -p /var/server/
    cd /var/server
    wget https://www.python.org/ftp/python/3.6.8/Python-3.6.8.tgz
    tar -zxvf Python-3.6.8.tgz
    cd Python-3.6.8
    sudo ./configure --enable-optimizations
    sudo make altinstall
    pip3 install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple

Verify the installation: python3
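A quick sanity check (assuming make altinstall installed the interpreter as python3.6):

python3.6 --version        # should print Python 3.6.8
python3.6 -m pip --version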

  2. Install the NVIDIA graphics driver on CentOS 8; download it from:
    https://www.nvidia.com/Download/index.aspx?lang=en-us
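    A sketch of the usual runfile installation (the exact .run file name depends on the driver version you download; 440.33.01 matches the nvidia-smi output below):

    sudo dnf install -y kernel-devel kernel-headers gcc make
    sudo sh ./NVIDIA-Linux-x86_64-440.33.01.run

    The installer offers to blacklist the nouveau driver; a reboot may be needed afterwards.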

Verify the installation:
nvidia-smi



Wed Dec  4 04:35:16 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105...  Off  | 00000000:02:00.0  On |                  N/A |
| 62%   66C    P0    N/A /  95W |   4024MiB /  4039MiB |     88%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1470      G   /usr/libexec/Xorg                             39MiB |
|    0      1840      G   /usr/bin/gnome-shell                          42MiB |
|    0      6996      C   .../tensorflow-gpu-1.15.0/venv/bin/python3  3925MiB |
+-----------------------------------------------------------------------------+
  3. Install CUDA 10.2 (a runfile install sketch follows the deviceQuery output below).
    Verify the installation by building and running the deviceQuery sample:
    cd /usr/local/cuda/samples/1_Utilities/deviceQuery
    make
    ls
    ./deviceQuery
  deviceQuery  deviceQuery.cpp  deviceQuery.o  Makefile  NsightEclipse.xml  readme.txt
(venv) [root@localhost deviceQuery]# ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1050 Ti"
 CUDA Driver Version / Runtime Version          10.2 / 10.2
 CUDA Capability Major/Minor version number:    6.1
 Total amount of global memory:                 4040 MBytes (4235919360 bytes)
 ( 6) Multiprocessors, (128) CUDA Cores/MP:     768 CUDA Cores
 GPU Max Clock rate:                            1493 MHz (1.49 GHz)
 Memory Clock rate:                             3504 Mhz
 Memory Bus Width:                              128-bit
 L2 Cache Size:                                 1048576 bytes
 Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
 Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
 Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
 Total amount of constant memory:               65536 bytes
 Total amount of shared memory per block:       49152 bytes
 Total number of registers available per block: 65536
 Warp size:                                     32
 Maximum number of threads per multiprocessor:  2048
 Maximum number of threads per block:           1024
 Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
 Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
 Maximum memory pitch:                          2147483647 bytes
 Texture alignment:                             512 bytes
 Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
 Run time limit on kernels:                     Yes
 Integrated GPU sharing Host Memory:            No
 Support host page-locked memory mapping:       Yes
 Alignment requirement for Surfaces:            Yes
 Device has ECC support:                        Disabled
 Device supports Unified Addressing (UVA):      Yes
 Device supports Compute Preemption:            Yes
 Supports Cooperative Kernel Launch:            Yes
 Supports MultiDevice Co-op Kernel Launch:      Yes
 Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
 Compute Mode:
    < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
(venv) [root@localhost deviceQuery]# pwd
/usr/local/cuda/samples/1_Utilities/deviceQuery
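For reference, CUDA 10.2 itself can be installed with NVIDIA's runfile installer (a sketch; take the exact file name and URL from https://developer.nvidia.com/cuda-downloads):

wget http://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
sudo sh cuda_10.2.89_440.33.01_linux.run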
  4. Install TensorFlow
    mkdir -p /var/server/tensorflow-gpu-1.15.0
    cd /var/server/tensorflow-gpu-1.15.0
    python3 -m venv venv
    source venv/bin/activate

pip install tensorflow-gpu -i https://pypi.tuna.tsinghua.edu.cn/simple
Or, if you want to pin a specific (older) version:
pip install tensorflow-gpu==1.15.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
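To confirm which version was installed (inside the venv):

python -c "import tensorflow as tf; print(tf.__version__)"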

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
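To make these settings persist across shells, they can be appended to your shell profile (assuming bash):

echo 'export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc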

Verify that tensorflow-gpu works:

source /var/server/tensorflow-gpu-1.15.0/venv/bin/activate
python3
# To test whether TensorFlow can use the GPU:
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

You will then see output like this:

2019-12-04 09:47:43.083342: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-04 09:47:43.083399: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:
2019-12-04 09:47:43.083440: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:
2019-12-04 09:47:43.083478: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:
2019-12-04 09:47:43.083516: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:
2019-12-04 09:47:43.083553: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:
2019-12-04 09:47:43.083593: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda/lib64:/usr/local/cuda/lib64:
2019-12-04 09:47:43.083599: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

Many of the libraries cannot be found.

Search for each missing lib in turn:
find / -name "libcublas.so*"
find / -name "libcufft.so*"

Then create symlinks for them in /usr/local/cuda/lib64/:

ln -s libcudart.so.10.2 libcudart.so.10.0
ln -s libcublas.so.10.2 libcublas.so.10.0
ln -s libcufft.so.10.2 libcufft.so.10.0
ln -s libcurand.so.10.2 libcurand.so.10.0
ln -s libcusolver.so.10.2 libcusolver.so.10.0
ln -s libcusparse.so.10.2 libcusparse.so.10.0

Note in particular that libcublas.so.10 is not under the /usr/local/cuda directory, so it has to be linked from wherever find locates it (see the example below). Likewise, libcudnn.so.7 only exists after cuDNN has been installed, which is covered next.
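For example, if find reports the cuBLAS libraries under /usr/lib64 (a hypothetical location; use whatever path find prints on your machine), link the versioned library into place:

find / -name "libcublas.so*" 2>/dev/null
sudo ln -s /usr/lib64/libcublas.so.10 /usr/local/cuda/lib64/libcublas.so.10.0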

cuDNN needs to be installed manually:
https://developer.nvidia.com/rdp/cudnn-download

Installation guide: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
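A sketch of the tar-file installation described in that guide (the archive name below is for cuDNN 7.6.5 for CUDA 10.2 and is an assumption; download whichever build matches your CUDA version):

tar -xzvf cudnn-10.2-linux-x64-v7.6.5.32.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*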

Then test again whether tensorflow-gpu runs correctly:
source venv/bin/activate
python3
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
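Alternatively, a minimal end-to-end check can be run from the shell inside the venv (a sketch; with log_device_placement=True the MatMul op should be reported on /device:GPU:0):

python3 - <<'EOF'
import tensorflow as tf
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(tf.matmul(a, b)))
EOF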

If it still fails:
find / -name "libcudnn.so*"
Once you have the actual path, make a symlink to it in /usr/local/cuda/lib64 in the same way.
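For example, if find reports /usr/lib64/libcudnn.so.7.6.5 (a hypothetical path), the link would be:

sudo ln -s /usr/lib64/libcudnn.so.7.6.5 /usr/local/cuda/lib64/libcudnn.so.7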

With that, it finally runs.

I trained a Show and Tell (image captioning) model. On the CPU (3.7 GHz, 8 cores) each step took about 1.45 seconds; with the GPU it took about 0.30 seconds per step, roughly a 5x speedup.

