A Summary of PyTorch Model-Building Tips (1)

1. Saving and loading model parameters

(A) Saving parameters

# 2 ways to save the net
torch.save(net1, 'net.pkl')  # save entire net
torch.save(net1.state_dict(), 'net_params.pkl')   # save only the parameters

(B) Loading parameters

# copy net1's parameters into net3
net3.load_state_dict(torch.load('net_params.pkl'))
prediction = net3(x)

Both net1 and net3 above are instances of nn.Module. Note that net3 must be built with the same architecture as net1 before load_state_dict can restore the parameters into it.
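
The first save call stores the whole module, so it can be restored in one line; a minimal sketch (net1's class definition must still be importable when loading):

net2 = torch.load('net.pkl')  # rebuilds the entire module saved with torch.save(net1, 'net.pkl')
prediction = net2(x)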

2. Clamping model parameters

# Clip weights of discriminator
for p in discriminator.parameters():
    p.data.clamp_(-opt.clip_value, opt.clip_value)

Here p iterates over the parameters of the discriminator (an nn.Module). This snippet comes from a WGAN implementation. Clamping is not only needed for WGAN: it can also suppress NaNs that appear during training, but the size of the clamp range is critical.
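
A minimal sketch of where the clamp sits in a WGAN critic update; clip_value = 0.01 is the value used in the original WGAN paper, and generator, discriminator, optimizer_D, dataloader and latent_dim are assumed to exist elsewhere:

clip_value = 0.01  # critic weight clipping range from the WGAN paper

for real_imgs, _ in dataloader:
    z = torch.randn(real_imgs.size(0), latent_dim)
    optimizer_D.zero_grad()
    # WGAN critic loss: maximize D(real) - D(fake)
    d_loss = -(discriminator(real_imgs).mean() - discriminator(generator(z).detach()).mean())
    d_loss.backward()
    optimizer_D.step()
    # clamp the critic weights immediately after each optimizer step
    for p in discriminator.parameters():
        p.data.clamp_(-clip_value, clip_value)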

3. Moving the model to CUDA

When training on a CUDA device, both the model and the data must be moved onto that device. PyTorch tensors come in two variants; taking Float as an example, torch.FloatTensor lives on the CPU and torch.cuda.FloatTensor on the GPU. The full list is:

 n | CPU                | CUDA                    | Desc.
---+--------------------+-------------------------+--------------------------
 1 | torch.FloatTensor  | torch.cuda.FloatTensor  | 32-bit floating point
 2 | torch.DoubleTensor | torch.cuda.DoubleTensor | 64-bit floating point
 3 | N/A                | torch.cuda.HalfTensor   | 16-bit floating point
 4 | torch.ByteTensor   | torch.cuda.ByteTensor   | 8-bit integer (unsigned)
 5 | torch.CharTensor   | torch.cuda.CharTensor   | 8-bit integer (signed)
 6 | torch.ShortTensor  | torch.cuda.ShortTensor  | 16-bit integer (signed)
 7 | torch.IntTensor    | torch.cuda.IntTensor    | 32-bit integer (signed)
 8 | torch.LongTensor   | torch.cuda.LongTensor   | 64-bit integer (signed)
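
A quick way to check which variant a tensor is (the second call requires a CUDA-capable machine):

x = torch.randn(2, 3)
print(x.type())         # torch.FloatTensor
print(x.cuda().type())  # torch.cuda.FloatTensor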

Tensors frequently need to be moved between the CPU and the CUDA device; two ways of doing so are shown below.
Method 1: route everything through a torch.device object and .to()
MODEL_NAME = 'VanillaGAN'
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

'''2...a helper that moves a tensor or model to DEVICE'''
def to_cuda(x):
    return x.to(DEVICE)

'''3...move the models to CUDA'''
D = to_cuda(Discriminator())
G = to_cuda(Generator())

'''4...move the input data to CUDA'''
x = to_cuda(images)

'''5...a CUDA-resident model applied to CUDA-resident data'''
x_outputs = D(x)

"""6...從CUDA回到CPU"""
def get_sample_image(G, n_noise=100):
    """
        save sample 100 images
    """
    for num in range(10):
        for i in range(10):
            z = to_cuda(torch.randn(1, n_noise))
            y_hat = G(z)
            line_img = torch.cat((line_img, y_hat.view(28, 28)), dim=1) if i > 0 else y_hat.view(28, 28)
        all_img = torch.cat((all_img, line_img), dim=0) if num > 0 else line_img
    img = all_img.cpu().data.numpy()
    return img
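
A possible way to use it, assuming G produces 28x28 MNIST-style images and matplotlib is installed:

import matplotlib.pyplot as plt

grid = get_sample_image(G)                    # numpy array, already on the CPU
plt.imsave('samples.png', grid, cmap='gray')  # write the 10x10 grid to disk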

Method 2: use .cuda() and .cpu()

# setup input tensors
x = torch.FloatTensor(opt.batch_size, nc, opt.image_size, opt.image_size)
z = torch.FloatTensor(opt.batch_size, nz, 1, 1)
noise = torch.FloatTensor(opt.batch_size, 1, 1, 1)

if opt.cuda:
    netGx.cuda(), netGz.cuda()
    netDx.cuda(), netDz.cuda(), netDxz.cuda()
    x, z, noise = x.cuda(), z.cuda(), noise.cuda()

Coming back from CUDA:

def test(dataloader, epoch):
    real_cpu_first, _ = next(iter(dataloader))  # grab the first batch from the dataloader
    real_cpu_first = real_cpu_first.mul(0.5).add(0.5)  # denormalize

    if opt.cuda:
        real_cpu_first = real_cpu_first.cuda()

    netGx.eval(), netGz.eval()  # switch to test mode
    latent = netGz(Variable(real_cpu_first, volatile=True))

    # removes last sigmoid activation to visualize reconstruction correctly
    mu, sigma = latent[:, :opt.nz], latent[:, opt.nz:].exp()
    recon = netGx(mu + sigma)

    vutils.save_image(recon.data, '{0}/reconstruction.png'.format(opt.experiment))
    vutils.save_image(real_cpu_first, '{0}/real_samples.png'.format(opt.experiment))
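
Note that Variable(..., volatile=True) is the old (pre-0.4) way of disabling gradient tracking; on current PyTorch the same test routine would simply wrap the forward passes in torch.no_grad(), roughly:

with torch.no_grad():
    latent = netGz(real_cpu_first)
    mu, sigma = latent[:, :opt.nz], latent[:, opt.nz:].exp()
    recon = netGx(mu + sigma)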

4. Iterating the dataloader one batch at a time

real_cpu_first, _ = next(iter(dataloader))
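
For example, a minimal unit-test-style check of a dataloader's output (the shape assumptions are illustrative):

images, labels = next(iter(dataloader))
assert images.dim() == 4                 # (batch, channels, height, width)
assert images.size(0) == labels.size(0)  # one label per image
print(images.shape, labels.shape)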

Grabbing a single batch like this is handy in unit tests for inspecting what the dataloader produces. The following goes further and displays one batch of images from a dataloader:

import numpy as np
import matplotlib.pyplot as plt
import torchvision

def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0))  # CHW -> HWC for matplotlib
    # undo the ImageNet-style normalization applied in the dataloader transforms
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.imshow(inp)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # pause a bit so that plots are updated


# Get a batch of training data
inputs, classes = next(iter(dataloaders['train']))

# Make a grid from batch
out = torchvision.utils.make_grid(inputs)

imshow(out, title=[class_names[x] for x in classes])

5. An assembly pattern for building a model

class CNN(nn.Module):
    def __init__(self, nc, input_size, hparams, ngpu=1, leaky_slope=0.01, std=0.01):
        super(CNN, self).__init__()
        self.ngpu = ngpu  # num of gpu's to use
        self.leaky_slope = leaky_slope  # slope for leaky_relu activation
        self.std = std  # standard deviation for weights initialization
        self.input_size = input_size  # expected input size

        main = nn.Sequential()
        in_feat, num = nc, 0
        for op, k, s, out_feat, b, bn, dp, h in hparams:
            # add operation: conv2d or convTranspose2d
            if op == 'conv2d':
                main.add_module(
                    '{0}.pyramid.{1}-{2}.conv'.format(num, in_feat, out_feat),
                    nn.Conv2d(in_feat, out_feat, k, s, 0, bias=b))
            elif op == 'convt2d':
                main.add_module(
                    '{0}.pyramid.{1}-{2}.convt'.format(num, in_feat, out_feat),
                    nn.ConvTranspose2d(in_feat, out_feat, k, s, 0, bias=b))
            else:
                raise Exception('Not supported operation: {0}'.format(op))
            num += 1
            # add batch normalization layer
            if bn:
                main.add_module(
                    '{0}.pyramid.{1}-{2}.batchnorm'.format(num, in_feat, out_feat),
                    nn.BatchNorm2d(out_feat))
                num += 1
            # add dropout layer
            main.add_module(
                '{0}.pyramid.{1}-{2}.dropout'.format(num, in_feat, out_feat),
                nn.Dropout2d(p=dp))
            num += 1
            # add activation
            if h == 'leaky_relu':
                main.add_module(
                    '{0}.pyramid.{1}-{2}.leaky_relu'.format(num, in_feat, out_feat),
                    nn.LeakyReLU(self.leaky_slope, inplace=True))
            elif h == 'sigmoid':
                main.add_module(
                    '{0}.pyramid.{1}-{2}.sigmoid'.format(num, in_feat, out_feat),
                    nn.Sigmoid())
            elif h == 'maxout':
                # TODO: implement me
                # https://github.com/IshmaelBelghazi/ALI/blob/master/ali/bricks.py#L338-L380
                raise NotImplementedError('Maxout is not implemented.')
            elif h == 'relu':
                main.add_module(
                    '{0}.pyramid.{1}-{2}.relu'.format(num, in_feat, out_feat),
                    nn.ReLU(inplace=True))
            elif h == 'tanh':
                main.add_module(
                    '{0}.pyramid.{1}-{2}.tanh'.format(num, in_feat, out_feat),
                    nn.Tanh())
            elif h == 'linear':
                num -= 1  # 'linear' adds no layer, so cancel the counter increment below
            else:
                raise Exception('Not supported activation: {0}'.format(h))
            num += 1
            in_feat = out_feat
        self.main = main

        # initialize weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
                m.weight.data.normal_(0.0, self.std)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.normal_(1.0, self.std)
                m.bias.data.zero_()

    def forward(self, input):
        assert input.size(2) == self.input_size,\
            'Wrong input size: {0}. Expected {1}'.format(input.size(2),
                                                         self.input_size)
        if self.ngpu > 1 and isinstance(input.data, torch.cuda.FloatTensor):
            gpu_ids = range(self.ngpu)
            output = nn.parallel.data_parallel(self.main, input, gpu_ids)
        else:
            output = self.main(input)
        return output

Its invocation pattern:

def create_svhn_gz(nz=256, ngpu=1):
    hparams = [
        # op // kernel // strides // fmaps // conv. bias // batch_norm // dropout // nonlinearity
        ['conv2d', 5, 1,   32, False, True, 0.0, 'leaky_relu'],
        ['conv2d', 4, 2,   64, False, True, 0.0, 'leaky_relu'],
        ['conv2d', 4, 1,  128, False, True, 0.0, 'leaky_relu'],
        ['conv2d', 4, 2,  256, False, True, 0.0, 'leaky_relu'],
        ['conv2d', 4, 1,  512, False, True, 0.0, 'leaky_relu'],
        ['conv2d', 1, 1,  512, False, True, 0.0, 'leaky_relu'],
        ['conv2d', 1, 1, 2*nz, True, False, 0.0, 'linear'],
    ]
    return CNN(3, 32, hparams, ngpu)

The assembly pattern handles model construction, hyperparameter-driven layer setup, and weight initialization in one place.
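
A quick sanity check of the assembled network (a sketch for PyTorch 0.4+; with the hparams above, each 3x32x32 input should reduce to a 2*nz-channel 1x1 feature map):

net = create_svhn_gz(nz=256)
out = net(torch.randn(4, 3, 32, 32))
print(out.size())  # expected: torch.Size([4, 512, 1, 1])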

6. Optimizing different parameters of a model differently

In a GAN, the generator's and the discriminator's parameters are not optimized at the same time; the two updates alternate. With torch.optim we implemented something akin to the cross-cutting (AOP-style) functionality found in Spring:
A) Define two optimizers, each managing a different set of model parameters

from itertools import chain

# setup optimizers
dis_params = chain(netDx.parameters(), netDz.parameters(), netDxz.parameters())
gen_params = chain(netGx.parameters(), netGz.parameters())

kwargs_adam = {'lr': opt.lr, 'betas': (opt.beta1, opt.beta2)}
optimizerD = optim.Adam(dis_params, **kwargs_adam)
optimizerG = optim.Adam(gen_params, **kwargs_adam)

B) Call the two optimizers alternately

D_loss = compute_loss(batch_size, d_loss=True)
G_loss = compute_loss(batch_size, d_loss=False)

# discriminator step: freeze the generator, unfreeze the discriminator
for p in netGx.parameters():
    p.requires_grad = False  # frozen, to avoid unnecessary gradient computation
for p in netGz.parameters():
    p.requires_grad = False  # frozen, to avoid unnecessary gradient computation
for p in netDx.parameters():
    p.requires_grad = True   # these parameters will be updated
for p in netDz.parameters():
    p.requires_grad = True   # these parameters will be updated
for p in netDxz.parameters():
    p.requires_grad = True   # these parameters will be updated

optimizerD.zero_grad()
D_loss.backward()
optimizerD.step()  # apply the discriminator optimization step

# generator step: unfreeze the generator, freeze the discriminator
for p in netGx.parameters():
    p.requires_grad = True   # these parameters will be updated
for p in netGz.parameters():
    p.requires_grad = True   # these parameters will be updated
for p in netDx.parameters():
    p.requires_grad = False  # frozen, to avoid unnecessary gradient computation
for p in netDz.parameters():
    p.requires_grad = False  # frozen, to avoid unnecessary gradient computation
for p in netDxz.parameters():
    p.requires_grad = False  # frozen, to avoid unnecessary gradient computation

optimizerG.zero_grad()
G_loss.backward()
optimizerG.step()  # apply the generator optimization step
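
The repeated loops above can be factored into a small helper; set_requires_grad is a hypothetical name, not part of the original code:

def set_requires_grad(module, flag):
    """Enable or disable gradient computation for every parameter of a module."""
    for p in module.parameters():
        p.requires_grad = flag

# discriminator step
for net in (netGx, netGz):
    set_requires_grad(net, False)
for net in (netDx, netDz, netDxz):
    set_requires_grad(net, True)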