First, the model needs to be placed on the GPU, for example:
device = torch.device("cuda" if use_cuda else "cpu")
model = LSTM(args.timestep, args.batch_size, args.audio_window).to(device)
Then wrap it with nn.DataParallel:
model = nn.DataParallel(model, device_ids=[0,1,2,3])
Because the LSTM model defines an initialization function similar to the following:
def init_hidden(self, batch_size, use_gpu=True):
    if use_gpu:
        return torch.zeros(1, batch_size, 256).cuda()
    else:
        return torch.zeros(1, batch_size, 256)
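The hard-coded .cuda() call is what causes trouble under nn.DataParallel: it always allocates on the default GPU (cuda:0), while DataParallel scatters each replica's inputs to a different device. A commonly used alternative fix (not the dataloader approach described below) is to create the hidden state on a caller-supplied device; this is a minimal sketch, and the standalone function signature here is illustrative:

```python
import torch

def init_hidden(batch_size, device, hidden_size=256):
    # Allocate the hidden state directly on the given device instead of
    # hard-coding .cuda(). Inside forward() each DataParallel replica can
    # call this with x.device, so the hidden state always matches its input.
    return torch.zeros(1, batch_size, hidden_size, device=device)

h = init_hidden(4, torch.device("cpu"))
```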
During multi-GPU training, the workarounds found online run into various problems, so instead generate the hidden state directly in the dataloader, for example:
class RawDataset(data.Dataset):
    def __init__(self, raw_file, list_file, audio_window):
        """ raw_file: train-clean-100.h5
            list_file: list/training.txt
            audio_window: 20480
        """
        self.raw_file = raw_file
        self.audio_window = audio_window
        self.ut
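To illustrate the idea behind this approach, here is a minimal self-contained sketch (class and field names are hypothetical, and random tensors stand in for the h5 reads): the dataset returns a per-sample initial hidden state alongside the audio window, so the default collate and DataParallel scatter machinery moves the hidden state to the same GPU as its batch slice.

```python
import torch
from torch.utils import data

class HiddenStateDataset(data.Dataset):
    """Sketch: yield (audio, hidden) pairs so the initial hidden state
    travels with the batch instead of being created inside the model."""

    def __init__(self, num_samples, audio_window=20480, hidden_size=256):
        self.num_samples = num_samples
        self.audio_window = audio_window
        self.hidden_size = hidden_size

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        audio = torch.randn(self.audio_window)       # stand-in for the h5 read
        hidden = torch.zeros(1, self.hidden_size)    # per-sample initial hidden
        return audio, hidden

loader = data.DataLoader(HiddenStateDataset(8), batch_size=4)
audio, hidden = next(iter(loader))
# audio: (4, 20480); hidden: (4, 1, 256)
```

Inside forward(), the collated hidden of shape (batch, 1, hidden_size) can be permuted to the (num_layers, batch, hidden_size) layout the LSTM expects; since it arrives with the scattered inputs, each replica sees a hidden state on its own device.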