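PyTorch is arguably the first-choice framework for getting started with deep learning: its syntax closely resembles NumPy's and it is easy to pick up. Any popular framework is popular for a reason, and in PyTorch's case I think it comes down to a few distinctive features. What follows comes from my own experience as a beginner, written up as a summary.
I recently worked through a short introduction to deep learning and gained a reasonable grasp of PyTorch along the way. It is particularly friendly to newcomers: it shares NumPy-like syntax but adds features designed specifically for deep learning. Based on what I learned, two features stand out: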
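- Tensors can be declared so that their gradients are computed automatically, which saves a lot of hand-written gradient code;
- Neural networks can be assembled quickly through simple, readable model and algorithm interfaces.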
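Automatic gradient computation
In PyTorch, matrix variables are declared as tensors. When declaring a tensor, we can decide whether its gradient should be computed automatically by setting requires_grad.
A simple example follows.
Simple linear model - 1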
import torch
x = torch.tensor(1., requires_grad = True)
w = torch.tensor(2., requires_grad = True)
b = torch.tensor(3., requires_grad = True)
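print(x, w, b)
# Build the computation graph: y = w * x + b
y = w * x + b
print(y)
# No backward pass has run yet, so all three gradients are None
print(x.grad)
print(w.grad)
print(b.grad)
# Backpropagate from y to compute the gradients
y.backward()
print(x.grad)  # dy/dx = w
print(w.grad)  # dy/dw = x
print(b.grad)  # dy/db = 1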
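The output is as follows: before y.backward() is called, all three gradients print as None; afterwards x.grad, w.grad, and b.grad print as tensor(2.), tensor(1.), and tensor(1.) respectively.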
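As a sanity check on the example above, the following minimal sketch compares the autograd results with the analytic derivatives of y = w * x + b. It makes one change that is my assumption rather than part of the original example: `b` is created without `requires_grad`, to show that untracked tensors receive no gradient. It also shows that gradients accumulate across repeated `backward()` calls.

```python
import torch

# Leaf tensors: x and w are tracked by autograd, b is deliberately not.
x = torch.tensor(1.0, requires_grad=True)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0)  # requires_grad defaults to False

y = w * x + b
y.backward()

# For y = w * x + b the analytic derivatives are dy/dx = w and dy/dw = x.
first_x_grad = x.grad.clone()
first_w_grad = w.grad.clone()
print(first_x_grad == w.detach())  # tensor(True)
print(first_w_grad == x.detach())  # tensor(True)
print(b.grad)                      # None: b was created without requires_grad

# Gradients accumulate across backward() calls until they are reset.
y2 = w * x + b
y2.backward()
accumulated_x_grad = x.grad.clone()
print(accumulated_x_grad)          # tensor(4.)

# Reset the accumulated gradient by hand.
x.grad.zero_()
```

The accumulation behaviour is the reason training loops clear the gradients before every backward pass.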
Simple linear model - 2
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
# Hyper-parameters
input_size = 1
output_size = 1
num_epochs = 60
learning_rate = 0.001
# Toy dataset
x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168],
                    [9.779], [6.182], [7.59], [2.167], [7.042],
                    [10.791], [5.313], [7.997], [3.1]], dtype=np.float32)
y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573],
                    [3.366], [2.596], [2.53], [1.221], [2.827],
                    [3.465], [1.65], [2.904], [1.3]], dtype=np.float32)
# Linear regression model
model = nn.Linear(input_size, output_size)
# Define the loss function
criterion = nn.MSELoss()
# Define the optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
# Train the model
for epoch in range(num_epochs):
    # Convert numpy arrays to torch tensors
    inputs = torch.from_numpy(x_train)
    targets = torch.from_numpy(y_train)
    # Forward pass
    outputs = model(inputs)
    # Calculate the loss function value
    loss = criterion(outputs, targets)
    # Backward and optimize
    optimizer.zero_grad()
    loss.backward()
    # Update the parameters
    optimizer.step()
    if (epoch+1) % 5 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, loss.item()))
# Plot the graph
predicted = model(torch.from_numpy(x_train)).detach().numpy()
plt.plot(x_train, y_train, 'ro', label='Original data')
plt.plot(x_train, predicted, label='Fitted line')
plt.legend()
plt.show()
# Save the model checkpoint
torch.save(model.state_dict(), 'model.ckpt')
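To connect this training loop back to automatic gradients, here is a hedged sketch of what a single `optimizer.step()` amounts to for plain SGD (no momentum, no weight decay): each parameter moves by `-lr * grad`. The tiny dataset and the names `w_manual` and `b_manual` are hypothetical and used for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny hypothetical dataset following y = 2x + 1
inputs = torch.tensor([[1.0], [2.0], [3.0]])
targets = 2 * inputs + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
lr = 0.01

# Snapshot the initial parameters so both updates start from the same point.
w0 = model.weight.detach().clone()
b0 = model.bias.detach().clone()

# One step with the built-in optimizer.
optimizer = torch.optim.SGD(model.parameters(), lr=lr)
loss = criterion(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
grad_w = model.weight.grad.clone()
grad_b = model.bias.grad.clone()
optimizer.step()

# The same step written by hand: p <- p - lr * grad
w_manual = w0 - lr * grad_w
b_manual = b0 - lr * grad_b

print(torch.allclose(model.weight, w_manual))  # True
print(torch.allclose(model.bias, b_manual))    # True
```

Other optimizers such as Adam apply a more involved update rule, so this equivalence holds only for the plain SGD configuration shown here.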
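Building a neural network quickly
The nn module in PyTorch lets us define a network structure quickly. The main steps are:
- Define the model, the loss function, and the optimizer.
- Start training: compute the predictions, evaluate the loss function, and backpropagate to update the parameters.
- Repeat until the iterations finish, at which point training is complete.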
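The steps above can be sketched as a small reusable loop. This is a minimal illustration under my own assumptions, not a fixed recipe: the helper name `train`, the toy data, and the hyper-parameters are all hypothetical.

```python
import torch
import torch.nn as nn


def train(model, loss_fn, optimizer, x, y, num_steps):
    """Run the forward / loss / backward / update cycle num_steps times."""
    history = []
    for _ in range(num_steps):
        y_pred = model(x)            # forward pass: predictions
        loss = loss_fn(y_pred, y)    # loss function value
        optimizer.zero_grad()        # clear gradients from the previous step
        loss.backward()              # backpropagate
        optimizer.step()             # update the parameters
        history.append(loss.item())  # store a plain float, not a tensor
    return history


torch.manual_seed(0)
x = torch.randn(32, 4)
y = x.sum(dim=1, keepdim=True)       # hypothetical target: sum of the features

# Step 1: define the model, the loss function, and the optimizer.
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# Steps 2 and 3: train for a fixed number of iterations.
history = train(model, loss_fn, optimizer, x, y, num_steps=100)
print(history[0], history[-1])
```

Keeping the loop in one function makes it easy to swap the model, loss, or optimizer without touching the training logic.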
Below is a simple three-layer neural network (input layer, one hidden layer, and output layer).
import torch
import matplotlib.pyplot as plt
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random Tensors to hold inputs and outputs
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
# Use the nn package to define our model and loss function.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)
loss_fn = torch.nn.MSELoss(reduction='sum')
# Use the optim package to define an Optimizer that will update the weights of
# the model for us. Here we will use Adam; the optim package contains many other
# optimization algorithms. The first argument to the Adam constructor tells the
# optimizer which Tensors it should update.
learn_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr = learn_rate)
t_list = []
loss_list = []
for t in range(500):
    # Compute the predicted values
    y_pred = model(x)
    # Calculate the loss function value
    loss = loss_fn(y_pred, y)
    # Record a plain float for visualization (a tensor that tracks gradients cannot be plotted directly)
    t_list.append(t)
    loss_list.append(loss.item())
    # Clear the gradients left over from the previous step
    optimizer.zero_grad()
    # Calculate the gradients and update the parameters
    loss.backward()
    optimizer.step()
plt.plot(t_list, loss_list, label = 'loss')
plt.show()
As the number of iterations increases, the loss value gradually decreases.
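The same Linear, ReLU, Linear architecture can also be written as an `nn.Module` subclass, which is a common alternative to `nn.Sequential`. The sketch below is an illustration under assumptions of my own: the class name `TwoLayerNet`, the smaller layer sizes, and the larger learning rate are chosen only to keep the run short. Instead of plotting, it compares the first and last loss values.

```python
import torch
import torch.nn as nn


class TwoLayerNet(nn.Module):
    """Linear -> ReLU -> Linear, written as an nn.Module subclass."""

    def __init__(self, d_in, hidden, d_out):
        super().__init__()
        self.fc1 = nn.Linear(d_in, hidden)
        self.fc2 = nn.Linear(hidden, d_out)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


torch.manual_seed(0)
# Smaller sizes than in the example above, chosen only to keep the run short.
N, D_in, H, D_out = 64, 100, 32, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = TwoLayerNet(D_in, H, D_out)
loss_fn = nn.MSELoss(reduction='sum')
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
for t in range(200):
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.2f}, last loss: {losses[-1]:.2f}")
```

Subclassing `nn.Module` becomes more useful once the forward pass needs branching or shared layers that `nn.Sequential` cannot express.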
PyTorch has far more to offer than this; I will write another summary once I am more proficient.