TensorFlow GPU Support and Building from Source

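In deep learning, a server's GPUs can greatly speed up training and inference. Each TensorFlow release is built against particular CUDA and cuDNN versions by default, and these may not match what is installed on the server. When they are incompatible, TensorFlow has to be rebuilt from source against the server's own CUDA and cuDNN versions.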

Feel free to follow me on GitHub: https://github.com/SpikeKing

Checking the GPU

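First inspect the server's GPU setup, so that the matching versions can be selected during the build. CUDA is NVIDIA's parallel computing platform and programming model for GPUs, and most GPU runtime environments depend on it.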

Export the CUDA environment variables. Look in /usr/local to see which CUDA version is installed.

export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
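If it is not obvious which toolkit directories exist, a small helper can list them. This is only a sketch: it assumes versioned installs follow the usual `cuda-<version>` naming under /usr/local, and `find_cuda_installs` is a name I made up for illustration.

```python
import glob
import os


def find_cuda_installs(root="/usr/local"):
    """Return CUDA toolkit directories under `root`, sorted by name."""
    # Assumption: versioned installs are named cuda-<major>.<minor>.
    candidates = glob.glob(os.path.join(root, "cuda-*"))
    return sorted(path for path in candidates if os.path.isdir(path))


if __name__ == "__main__":
    for path in find_cuda_installs():
        print(path)
```

Whatever directory it prints is the one to use in the two export lines above.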

Check the CUDA version with the nvcc command. Here it is 8.0.61:

nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

Alternatively, read the version file. Again, the CUDA version is 8.0.61:

cat /usr/local/cuda/version.txt

CUDA Version 8.0.61
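When the check has to be scripted, both outputs can be parsed with one regular expression. The sketch below assumes the text looks like the two samples shown here; `parse_cuda_version` is a hypothetical helper, not part of any CUDA tooling.

```python
import re


def parse_cuda_version(text):
    """Extract (major, minor, patch) from nvcc or version.txt output."""
    # Matches "V8.0.61" (nvcc --version) and "CUDA Version 8.0.61" (version.txt).
    match = re.search(r"(?:\bV|CUDA Version )(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no CUDA version found in input")
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    print(parse_cuda_version("Cuda compilation tools, release 8.0, V8.0.61"))
```

The first two numbers (8.0 here) are what the TensorFlow configure step asks for later.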

Check the cuDNN version. Here it is 6.0.21:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

#define CUDNN_MAJOR      6
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 21
--
#define CUDNN_VERSION    (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#include "driver_types.h"
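The same information can be read programmatically from the header text. A minimal sketch, assuming the header uses the three `#define` lines shown above; the function names are my own.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) read from the #define lines of cudnn.h."""
    fields = {}
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(r"#define\s+CUDNN_%s\s+(\d+)" % name, header_text)
        if match is None:
            raise ValueError("CUDNN_%s not found" % name)
        fields[name] = int(match.group(1))
    return fields["MAJOR"], fields["MINOR"], fields["PATCHLEVEL"]


def cudnn_version_number(major, minor, patch):
    """Combine the parts the same way the CUDNN_VERSION macro does."""
    return major * 1000 + minor * 100 + patch
```

For the header above this gives (6, 0, 21), which the macro formula turns into 6021.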

Check the number and model of the GPUs. This server has 4 GPUs:

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 0000:02:00.0     Off |                  N/A |
| 23%   20C    P0    54W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN X (Pascal)    Off  | 0000:03:00.0     Off |                  N/A |
| 23%   21C    P0    54W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  TITAN X (Pascal)    Off  | 0000:83:00.0     Off |                  N/A |
| 23%   21C    P0    55W / 250W |      0MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  TITAN X (Pascal)    Off  | 0000:84:00.0     Off |                  N/A |
|  0%   21C    P0    51W / 250W |      0MiB / 12189MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
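The table is meant for people; for scripts, nvidia-smi also has a CSV mode through its `--query-gpu` and `--format` options. The sketch below assumes that mode is available on your driver version and that GPU names contain no commas. Both function names are hypothetical.

```python
import subprocess


def parse_gpu_csv(text):
    """Parse `index, name, memory.total` rows into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        index, name, memory = [field.strip() for field in line.split(",")]
        gpus.append({"index": int(index), "name": name, "memory": memory})
    return gpus


def query_gpus():
    """Ask nvidia-smi for a machine-readable GPU list (needs the NVIDIA driver)."""
    output = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,name,memory.total",
        "--format=csv,noheader",
    ])
    return parse_gpu_csv(output.decode("utf-8"))
```

On a machine with the driver installed, `len(query_gpus())` gives the GPU count without reading the table by eye.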

Building TensorFlow

See the official TensorFlow documentation.

Bazel

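Bazel is a tool for building and testing software. On an Ubuntu server it can be installed with apt; see the official Bazel installation instructions.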

Install JDK 8:

sudo apt-get install openjdk-8-jdk

Add the Bazel distribution URI as a package source:

echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -

Install Bazel:

sudo apt-get install bazel

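This differs slightly from the official instructions: I skipped the apt-get update step because storage.googleapis.com may be unreachable. Note that apt only sees a newly added source after its index has been refreshed, so if the bazel package cannot be found, run sudo apt-get update after all.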

Check the Bazel version:

bazel version

Output like the following means Bazel was installed successfully:

Build label: 0.15.0
Build target: bazel-out/k8-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Tue Jun 26 12:10:19 2018 (1530015019)
Build timestamp: 1530015019
Build timestamp as int: 1530015019
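A build script may want to check this version before starting a long build. The sketch below parses the `Build label` line from the output above. The minimum version in the usage line is a made-up value, not a documented TensorFlow requirement, and the function names are my own.

```python
import re


def parse_bazel_label(version_output):
    """Return the Build label from `bazel version` output as a tuple of ints."""
    match = re.search(r"^Build label:\s*(\d+(?:\.\d+)*)", version_output, re.MULTILINE)
    if match is None:
        raise ValueError("no Build label line found")
    return tuple(int(part) for part in match.group(1).split("."))


def is_at_least(version, minimum):
    """Compare two version tuples lexicographically, component by component."""
    return version >= minimum
```

For example, `is_at_least(parse_bazel_label(output), (0, 10, 0))` checks against a hypothetical minimum of 0.10.0.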

Setting Up CUDA

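Unlike the official documentation, there is no need to install libcupti and cuda-command-line-tools separately. They already ship inside the CUDA directory, so exporting the CUDA paths is enough.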

export PATH=/usr/local/cuda-8.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH

Building the Source

Download the TensorFlow source:

git clone https://github.com/tensorflow/tensorflow

From the root of the cloned repository, run the configuration script and enable GPU support:

./configure

Please specify the location of python. [Default is /data2/wcl1/tensorflow/venv/bin/python] ## choose the Python interpreter (Python 2 or 3)

## answer N (or n) to the prompts in between

Do you wish to build TensorFlow with CUDA support? [y/N]: y  # build the GPU version: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]: 8.0  # CUDA version, must match the server: 8.0

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 6.0  # cuDNN version, must match the server: 6.0

## answer N or accept the default for the prompts in between

Do you want to use clang as CUDA compiler? [y/N]: N  ## use nvcc
nvcc will be used as CUDA compiler.

## answer N or accept the default for the remaining prompts

Configuration finished
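Answering the prompts by hand gets tedious on repeated builds. As far as I know, the configure script of the TensorFlow 1.x series also reads its answers from environment variables such as `TF_NEED_CUDA` and `TF_CUDA_VERSION`. Treat the variable names below as an assumption and check them against configure.py in your checkout. `configure_env` is a hypothetical helper.

```python
import os


def configure_env(cuda_version, cudnn_version, cuda_path, base_env=None):
    """Build an environment that pre-answers the CUDA-related configure prompts."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumed variable names; verify them in configure.py before relying on this.
    env.update({
        "TF_NEED_CUDA": "1",                # build with CUDA support
        "TF_CUDA_VERSION": cuda_version,    # e.g. "8.0"
        "TF_CUDNN_VERSION": cudnn_version,  # e.g. "6.0"
        "CUDA_TOOLKIT_PATH": cuda_path,     # e.g. "/usr/local/cuda-8.0"
        "TF_CUDA_CLANG": "0",               # use nvcc rather than clang
    })
    return env
```

It could then be used as `subprocess.call(["./configure"], env=configure_env("8.0", "6.0", "/usr/local/cuda-8.0"))`. Any prompt not covered by a variable is still asked interactively.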

Build the GPU pip package:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

If the build fails with a libdevice error like this:

Cannot find libdevice.10.bc under /usr/local/cuda-8.0

then copy libdevice.compute_50.10.bc in /usr/local/cuda-8.0/nvvm/libdevice/ to the name libdevice.10.bc, and also copy it into /usr/local/cuda-8.0/:

cd /usr/local/cuda-8.0/nvvm/libdevice/
sudo cp libdevice.compute_50.10.bc libdevice.10.bc
sudo cp libdevice.compute_50.10.bc /usr/local/cuda-8.0/libdevice.10.bc
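The same workaround can be written as an idempotent script step that never overwrites a file that is already there. A sketch under the assumptions of this post: CUDA 8.0 layout, and `libdevice.compute_50.10.bc` as the source file. `ensure_libdevice` is a name I made up, and writing under /usr/local normally needs root.

```python
import os
import shutil


def ensure_libdevice(cuda_root, source_name="libdevice.compute_50.10.bc"):
    """Copy the versioned libdevice file to the two locations used above."""
    libdevice_dir = os.path.join(cuda_root, "nvvm", "libdevice")
    source = os.path.join(libdevice_dir, source_name)
    targets = [
        os.path.join(libdevice_dir, "libdevice.10.bc"),
        os.path.join(cuda_root, "libdevice.10.bc"),
    ]
    created = []
    for target in targets:
        if not os.path.exists(target):  # never overwrite an existing file
            shutil.copyfile(source, target)
            created.append(target)
    return created
```

Calling `ensure_libdevice("/usr/local/cuda-8.0")` returns the list of files it created, which is empty on a second run.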

The Bazel build takes a long time, so be patient. In this run it executed 9945 steps in total, one after another.

Package the build output as a pip-installable wheel (.whl), written here to the /tmp/tensorflow_pkg directory:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Install the package with pip. This build is TensorFlow 1.9 with GPU support:

pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc0-cp27-cp27mu-linux_x86_64.whl -i https://pypi.douban.com/simple
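The wheel file name depends on the TensorFlow version, the Python version, and the platform, so it changes from build to build. Rather than typing it out, a script can look it up. A minimal sketch; `find_wheels` is a hypothetical helper.

```python
import glob
import os


def find_wheels(package_dir, project="tensorflow"):
    """Return wheel files for `project` in `package_dir`, newest first."""
    pattern = os.path.join(package_dir, project + "-*.whl")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    for wheel in find_wheels("/tmp/tensorflow_pkg"):
        print(wheel)
```

The first entry is the most recently built wheel, which is usually the one to pass to pip install.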

To verify the installation, **leave** the TensorFlow source directory, start a Python shell, and run:

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

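If you get the following platform error, Python was started inside the tensorflow source directory, so import picks up the source tree instead of the installed package: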

No module named tensorflow.python.platform

In that case, leave the tensorflow directory, start the Python shell again, and import tensorflow.
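A quick guard can catch this before the import is attempted. The sketch assumes the problem is simply a `tensorflow/` folder in the working directory, as in a source checkout; `shadows_tensorflow` is a made-up name.

```python
import os


def shadows_tensorflow(directory):
    """True if `directory` contains a tensorflow/ folder that import would find first."""
    return os.path.isdir(os.path.join(directory, "tensorflow"))


if __name__ == "__main__":
    if shadows_tensorflow(os.getcwd()):
        print("Run Python from another directory before importing tensorflow.")
```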

Check whether the GPUs are available:

from tensorflow.python.client import device_lib
local_device_protos = device_lib.list_local_devices()
print("all: %s" % [x.name for x in local_device_protos])

## Output
all: [u'/device:CPU:0', u'/device:GPU:0', u'/device:GPU:1', u'/device:GPU:2', u'/device:GPU:3']
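To turn that list into a simple pass/fail check, filter the names for GPU entries. This sketch only does string matching on names in the format shown above and does not call TensorFlow itself. `gpu_device_names` is a hypothetical helper.

```python
def gpu_device_names(device_names):
    """Keep only names that identify a GPU, e.g. '/device:GPU:0'."""
    return [name for name in device_names if ":GPU:" in name]


if __name__ == "__main__":
    names = ["/device:CPU:0", "/device:GPU:0", "/device:GPU:1"]
    print(len(gpu_device_names(names)))
```

On this server the count should equal the 4 GPUs reported by nvidia-smi.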

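Note: after building TensorFlow, I found that nvidia-smi hung (stuck) and only worked again after the server was rebooted. The cause is unclear. Possibly the GPU resources had not been fully released before being loaded a second time.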

OK, that’s all! Enjoy it!
