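After building and installing TensorFlow from source, you can download the sample code and run it. When I executed run_all.sh, however, the following error appeared. It means that the CuDNN version loaded at runtime (7.0.5) differs from the version specified at compile time (7.1.3).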
2018-05-08 09:00:18.042137: E tensorflow/stream_executor/cuda/cuda_dnn.cc:448] Loaded runtime CuDNN library: 7.0.5 but source was compiled with: 7.1.3. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2018-05-08 09:00:18.042768: F tensorflow/core/kernels/conv_ops.cc:713] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo<T>(), &algorithms)
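The compatibility rule quoted in the log message can be sketched as a small shell check. This is only an illustration of my reading of that message, not TensorFlow's actual implementation; the function name `cudnn_compatible` and its two arguments (runtime version, compile-time version) are hypothetical.

```shell
# Hypothetical helper: succeeds if a runtime CuDNN version satisfies the rule
# quoted in the log message for a given compile-time version.
# Usage: cudnn_compatible <runtime_version> <compiled_version>
cudnn_compatible() {
    rt_major=${1%%.*}          # "7.0.5" -> "7"
    rt_rest=${1#*.}            # "7.0.5" -> "0.5"
    rt_minor=${rt_rest%%.*}    # "0.5"   -> "0"
    cc_major=${2%%.*}
    cc_rest=${2#*.}
    cc_minor=${cc_rest%%.*}

    # The major version must always match.
    [ "$rt_major" -eq "$cc_major" ] || return 1

    if [ "$rt_major" -ge 7 ]; then
        # CuDNN 7.0 or later: an equal or higher runtime minor is accepted.
        [ "$rt_minor" -ge "$cc_minor" ]
    else
        # Earlier releases (assumed from the message): minor must match exactly.
        [ "$rt_minor" -eq "$cc_minor" ]
    fi
}
```

With the versions from the log, `cudnn_compatible 7.0.5 7.1.3` returns a non-zero status, while `cudnn_compatible 7.1.3 7.1.3` succeeds.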
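In my case, the cause was that the symlinks created when CuDNN 7.0.5 was installed had never been updated to point to 7.1.3. You can inspect the symlinks with the following command.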
$ ll /usr/local/cuda/lib64/
...
lrwxrwxrwx 1 root root 13 1月 27 12:25 libcudnn.so -> libcudnn.so.7*
lrwxrwxrwx 1 root root 17 1月 27 12:25 libcudnn.so.7 -> libcudnn.so.7.0.5*
-rwxr-xr-x 1 root root 287624224 4月 28 08:55 libcudnn.so.7.0.5*
-rwxr-xr-x 1 root root 331455744 4月 28 08:55 libcudnn.so.7.1.3*
...
As the listing shows, libcudnn.so.7 points to libcudnn.so.7.0.5 rather than libcudnn.so.7.1.3, so one fix is to re-create the symlinks:
$ cd /usr/local/cuda/lib64
$ sudo rm libcudnn.so.7.0.5
$ sudo chmod +r libcudnn.so.7.1.3
$ sudo ln -sf libcudnn.so.7.1.3 libcudnn.so.7
$ sudo ln -sf libcudnn.so.7 libcudnn.so
$ sudo ldconfig
$ ll
...
lrwxrwxrwx 1 root root 13 5月 8 09:13 libcudnn.so -> libcudnn.so.7*
lrwxrwxrwx 1 root root 17 5月 8 09:13 libcudnn.so.7 -> libcudnn.so.7.1.3*
-rwxr-xr-x 1 root root 331455744 4月 28 08:55 libcudnn.so.7.1.3*
...
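The relinking steps above could also be wrapped in a small function. This is a minimal sketch under a few assumptions: the function name `relink_cudnn` and its arguments (library directory, full version string) are hypothetical, it leaves the old library file in place instead of removing it, and it does not run `ldconfig`, so that step from the commands above is still needed. Run it with sufficient privileges if the directory is owned by root.

```shell
# Hypothetical helper: repoint the libcudnn symlink chain in a directory
# at a given full version, then print the file the chain resolves to.
# Usage: relink_cudnn <lib_dir> <version>   e.g. relink_cudnn /usr/local/cuda/lib64 7.1.3
relink_cudnn() {
    lib_dir=$1
    version=$2
    major=${version%%.*}                 # "7.1.3" -> "7"
    target="libcudnn.so.$version"

    # Refuse to create a dangling link if the target library is absent.
    if [ ! -f "$lib_dir/$target" ]; then
        echo "missing $lib_dir/$target" >&2
        return 1
    fi

    # Use relative link targets, matching the layout shown in the listing.
    ( cd "$lib_dir" &&
      ln -sfn "$target" "libcudnn.so.$major" &&
      ln -sfn "libcudnn.so.$major" libcudnn.so ) || return 1

    # Follow the whole chain and report the final file name.
    basename "$(readlink -f "$lib_dir/libcudnn.so")"
}
```

If the links were updated correctly for 7.1.3, the printed name should be `libcudnn.so.7.1.3`, which matches what the final listing above shows.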