GPU Allocation in TensorFlow

By default, TensorFlow claims every GPU available to it, which wastes resources when a program needs only some of them.

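Listing the GPUs and CPUs on the current machine

First, tf.config.experimental.list_physical_devices returns the list of devices of a given type (such as GPU or CPU) on the current host. For example, run the following code on a workstation with four GPUs and one CPU: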

import tensorflow as tf

# Query the physical devices of each type on this host.
gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
cpus = tf.config.experimental.list_physical_devices(device_type='CPU')
print(gpus, cpus)

The output is:

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), 
PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')] 
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

As shown, the workstation has four GPUs (GPU:0, GPU:1, GPU:2, GPU:3) and one CPU (CPU:0).
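The short names used above (GPU:0, CPU:0) are the tail of each device's full name. As a minimal sketch, the helper below groups a device list by type and keeps only those short names. The function name summarize_devices is hypothetical, and it assumes only that each device object exposes the name and device_type attributes seen in the printed output.

```python
from collections import defaultdict


def summarize_devices(devices):
    """Group devices by type, keeping only the short 'GPU:0'-style name."""
    summary = defaultdict(list)
    for dev in devices:
        # '/physical_device:GPU:0' -> 'GPU:0'
        short_name = dev.name.split(":", 1)[1]
        summary[dev.device_type].append(short_name)
    return dict(summary)


# With TensorFlow available, the lists from the snippet above can be passed in:
#   print(summarize_devices(gpus + cpus))
```

Because the helper only reads two attributes, it can be tried out with any stand-in objects, without a GPU present.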

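Controlling which GPUs a program can use

Next, tf.config.experimental.set_visible_devices sets which devices are visible to the current program (the program uses only the devices visible to it and never touches the others). For example, to restrict the program to the two cards at indices 0 and 1 (GPU:0 and GPU:1) on the four-GPU machine above, use the following code: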

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
# Expose only the first two GPUs to this program.
tf.config.experimental.set_visible_devices(devices=gpus[0:2], device_type='GPU')

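The environment variable CUDA_VISIBLE_DEVICES can also control which GPUs a program uses. Suppose cards 0 and 1 on a four-GPU machine are busy while cards 2 and 3 are idle. In a Linux terminal, enter: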

export CUDA_VISIBLE_DEVICES=2,3

Or prefix the command directly:

CUDA_VISIBLE_DEVICES=2,3 python do_something.py

Or add the following near the top of the code, before TensorFlow is imported:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = "2,3"

This restricts the program to cards 2 and 3.
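Two details are easy to miss here. The variable is read when the CUDA runtime starts, so the safe pattern is to set it before TensorFlow is imported. The visible cards are also renumbered from zero inside the process, so physical cards 2 and 3 show up as GPU:0 and GPU:1. A minimal sketch, using a hypothetical helper named expose_gpus:

```python
import os


def expose_gpus(indices):
    """Set CUDA_VISIBLE_DEVICES from a list of physical GPU indices.

    Hypothetical helper; call it before TensorFlow is imported.
    """
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


expose_gpus([2, 3])
# import tensorflow as tf   # import only after the variable has been set
```

Building the string from a list keeps the choice of cards in one place, for example when it comes from a command-line argument.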

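Controlling GPU memory

By default, TensorFlow claims nearly all available GPU memory to avoid the performance cost of memory fragmentation. TensorFlow also offers two memory strategies that give finer control over how a program uses GPU memory:

Allocate GPU memory only when needed (the program uses very little memory at startup and requests more dynamically as it runs);
Cap usage at a fixed amount of GPU memory (the program never exceeds the limit and raises an error if it tries to).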

Allocating GPU memory on demand

tf.config.experimental.set_memory_growth sets a GPU's memory strategy to "allocate only when needed". The following code applies that setting to every GPU:

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
for gpu in gpus:
    # Both arguments are passed by keyword; a positional argument cannot follow a keyword one.
    tf.config.experimental.set_memory_growth(device=gpu, enable=True)
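Memory growth has to be configured before the GPUs are initialized; otherwise TensorFlow raises a RuntimeError. The sketch below wraps the loop above with that error handling. The function name enable_memory_growth and its tf_config parameter are hypothetical: the caller is assumed to pass in the tf.config module, which keeps the function importable on machines without TensorFlow.

```python
def enable_memory_growth(tf_config):
    """Turn on on-demand allocation for every GPU; return how many were set.

    `tf_config` is expected to be the `tf.config` module (an assumption of
    this sketch, so that TensorFlow is not imported here).
    """
    gpus = tf_config.experimental.list_physical_devices('GPU')
    configured = 0
    for gpu in gpus:
        try:
            tf_config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU has already been initialized.
            print(f"Could not set memory growth for {gpu}: {err}")
    return configured


# Typical use, at the very top of the program:
#   import tensorflow as tf
#   enable_memory_growth(tf.config)
```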

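Capping GPU memory usage

The following code passes a tf.config.experimental.VirtualDeviceConfiguration instance to tf.config.experimental.set_virtual_device_configuration, limiting TensorFlow to 1 GB of memory on GPU:0 (memory_limit is given in MB; in effect, this creates a "virtual GPU" with 1 GB of memory):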

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
tf.config.experimental.set_virtual_device_configuration(
    gpus[0],
    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])

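Settings for TensorFlow 1.x

Under Graph Execution in TensorFlow 1.x, you can pass a tf.compat.v1.ConfigProto instance when creating a new session to set TensorFlow's GPU memory strategy. Concretely, instantiate tf.compat.v1.ConfigProto, set its options, and pass it as the config argument when creating tf.compat.v1.Session. The following code uses the allow_growth option so that TensorFlow allocates GPU memory only when needed: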

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)

The following code uses the per_process_gpu_memory_fraction option so that TensorFlow uses a fixed 40% of the GPU's memory:

config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
# Keep a reference to the session so it can be used afterwards.
sess = tf.compat.v1.Session(config=config)
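allow_growth and per_process_gpu_memory_fraction are separate fields on gpu_options, so a single config object can carry either or both. As a rough sketch, the two snippets above can be folded into one helper. The name build_gpu_config and its tf_v1 parameter are hypothetical: the caller is assumed to pass in tf.compat.v1.

```python
def build_gpu_config(tf_v1, allow_growth=True, memory_fraction=None):
    """Return a ConfigProto with the requested GPU memory options.

    `tf_v1` is expected to be `tf.compat.v1` (an assumption of this sketch).
    """
    config = tf_v1.ConfigProto()
    config.gpu_options.allow_growth = allow_growth
    if memory_fraction is not None:
        # Only touch the fraction when the caller asked for one.
        config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config


# Typical use:
#   import tensorflow as tf
#   config = build_gpu_config(tf.compat.v1, allow_growth=False, memory_fraction=0.4)
#   sess = tf.compat.v1.Session(config=config)
```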

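Simulating a multi-GPU environment on a single GPU

When the local development environment has only one GPU but we need to write multi-GPU programs for training on a workstation, TensorFlow offers a handy feature: it can create several simulated GPUs locally, which makes debugging multi-GPU programs much easier. The following code creates two virtual GPUs, each with 2 GB of memory, on top of the physical GPU:0.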

gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_virtual_device_configuration(
    gpus[0],
    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048),
     tf.config.experimental.VirtualDeviceConfiguration(memory_limit=2048)])

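Adding the code above before single-machine multi-GPU training code lets code designed for multiple GPUs run in a single-GPU environment. When the training code prints the number of devices, it outputs: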

Number of devices: 2
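The per-device limits do not have to be typed out by hand. The sketch below splits a memory budget evenly across a chosen number of virtual GPUs. The helper name even_memory_limits is hypothetical, and the even split is just one simple policy; the commented lines show how the result could feed the configuration call from the text.

```python
def even_memory_limits(total_mb, num_virtual):
    """Split a memory budget (in MB) evenly across virtual GPUs."""
    if num_virtual < 1:
        raise ValueError("num_virtual must be at least 1")
    return [total_mb // num_virtual] * num_virtual


limits = even_memory_limits(4096, 2)
print(limits)  # [2048, 2048]

# With TensorFlow available, the limits could be applied as in the text:
#   gpus = tf.config.experimental.list_physical_devices('GPU')
#   tf.config.experimental.set_virtual_device_configuration(
#       gpus[0],
#       [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=mb)
#        for mb in limits])
#   print(len(tf.config.experimental.list_logical_devices('GPU')))
```

Listing the logical GPU devices afterwards is a quick way to confirm that the virtual devices were created.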

Reference: https://tf.wiki/zh/basic/tools.html

dupei
