First, the model needs to be placed on the GPU, for example:
device = torch.device("cuda" if use_cuda else "cpu")
model = LSTM(args.timestep, args.batch_size, args.audio_window).to(device)
Then wrap it with nn.DataParallel:
model = nn.DataParallel(model, device_ids=[0,1,2,3])
Because the LSTM model defines an initialization function like the following:
def init_hidden(self, batch_size, use_gpu=True):
    if use_gpu:
        return torch.zeros(1, batch_size, 256).cuda()
    else:
        return torch.zeros(1, batch_size, 256)
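A fixed `batch_size` is the root of the problem: nn.DataParallel splits each batch along dim 0, so every replica sees only `batch_size / n_gpus` samples, and a hidden state built from the full `batch_size` no longer matches. A minimal sketch of one common workaround (a hypothetical toy model, not the author's exact code) is to derive the hidden state's batch dimension from the input inside `forward`:

```python
import torch
import torch.nn as nn

class ToyRNN(nn.Module):
    """Hypothetical minimal model: the hidden state's batch dimension
    is taken from the input tensor, so each DataParallel replica
    (which sees only its slice of the batch) builds a matching state."""

    def __init__(self, hidden_dim=256):
        super().__init__()
        self.rnn = nn.GRU(input_size=40, hidden_size=hidden_dim, batch_first=True)

    def forward(self, x):
        # x: (batch, seq_len, features); the per-replica batch may differ
        hidden = torch.zeros(1, x.size(0), 256, device=x.device, dtype=x.dtype)
        out, _ = self.rnn(x, hidden)
        return out

model = ToyRNN()
out = model(torch.randn(2, 5, 40))
print(out.shape)  # torch.Size([2, 5, 256])
```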
During multi-GPU training, the workarounds found online all ran into various problems, so instead generate the hidden variable directly in the dataloader, for example:
class RawDataset(data.Dataset):
    def __init__(self, raw_file, list_file, audio_window):
        """ raw_file: train-clean-100.h5
            list_file: list/training.txt
            audio_window: 20480
        """
        self.raw_file = raw_file
        self.audio_window = audio_window
        self.ut
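The idea can be sketched with a toy dataset (hypothetical names, assuming the default collate_fn): each sample carries its own slice of the initial hidden state, so after collation the hidden tensor has the batch as dim 0 and DataParallel scatters it across GPUs together with the input.

```python
import torch
from torch.utils import data

class ToyDataset(data.Dataset):
    """Hypothetical dataset: returns each sample together with a
    per-sample initial hidden state. The default collate_fn stacks
    the hidden states into (batch, 256), which DataParallel splits
    along dim 0 exactly like the audio input."""

    def __init__(self, n=8, audio_window=20480):
        self.n = n
        self.audio_window = audio_window

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        audio = torch.randn(self.audio_window)
        hidden = torch.zeros(256)  # one sample's slice of the initial state
        return audio, hidden

loader = data.DataLoader(ToyDataset(), batch_size=4)
audio, hidden = next(iter(loader))
# inside the model, restore the (num_layers, batch, hidden) layout:
h0 = hidden.unsqueeze(0)  # (1, 4, 256)
print(audio.shape, h0.shape)
```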