PyTorch: Neural-Network Digit Recognition (MNIST Dataset)

Background

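I have recently been learning PyTorch and deep learning, so I decided to start with the MNIST dataset and build a simple digit recognizer with a neural network.
The reference code comes from:
https://github.com/udacity/deep-learning-v2-pytorch/tree/master/intro-to-pytorch
Note that the network in this post uses only fully connected layers. It is a very simple experiment whose only purpose is to understand PyTorch.
The code below covers train.py and predict.py; I will paste it and explain it as I go.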

Code link: https://github.com/Yannnnnnnnnnnn/learnPyTorch/tree/master/trainMNIST/fulljion

1. Training

train_CPU.py

# Load the modules needed for training
import torch
from torch import nn
from torch import optim
from torchvision import datasets, transforms

# Transform pipeline: convert each image to a tensor in [0, 1], then normalize it to [-1, 1]
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                              ])

# MNIST training set, batch size 128
trainset = datasets.MNIST('./MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True)

# MNIST test set, batch size 128
testset = datasets.MNIST('./MNIST_data/', download=True, train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=True)


# The model used in this experiment: fully connected layers only, a very simple structure with nothing special
model = nn.Sequential(nn.Linear(784, 392),
                      nn.ReLU(),
                      nn.Linear(392, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))

# Define the loss function and the optimizer
criterion = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.003)

# Number of epochs
epochs = 30

# Record the loss of every epoch
train_losses, test_losses = [], []

# Training
for e in range(epochs):
    running_loss = 0
    # Iterate over all the training data
    # images has shape (128, 1, 28, 28) and labels has shape (128,); the final batch may be smaller
    for images, labels in trainloader:

        # Flatten each image into a vector of 784 values
        images = images.view(images.shape[0], -1)

        # Clear the gradients left over from the previous step
        optimizer.zero_grad()

        # Forward pass
        output = model(images)

        # Compute the loss
        loss = criterion(output, labels)

        # Backward pass: autograd computes the gradients
        loss.backward()

        # Update the parameters
        optimizer.step()

        running_loss += loss.item()
    else:
        # for/else: this block runs once the batch loop above finishes, i.e. at the end of every epoch
        test_loss = 0
        accuracy = 0

        # Evaluate on the test data
        # Disable gradient tracking to speed up inference
        with torch.no_grad():
            for images, labels in testloader:

                # Flatten the images
                images = images.view(images.shape[0], -1)

                # Predict
                log_ps = model(images)

                # Accumulate the loss as a Python float
                test_loss += criterion(log_ps, labels).item()

                # The model ends in LogSoftmax, so its output is log-probabilities;
                # take the exponential to recover probabilities
                ps = torch.exp(log_ps)

                # Take the most probable class
                top_p, top_class = ps.topk(1, dim=1)

                # Fraction of correct predictions in this batch
                equals = top_class == labels.view(*top_class.shape)
                accuracy += equals.float().mean().item()

        train_losses.append(running_loss/len(trainloader))
        test_losses.append(test_loss/len(testloader))

        # Save the model at the end of every epoch
        torch.save(model.state_dict(), str(e) +'.pth')

        print("Epoch: {}/{}.. ".format( e+1, epochs),
              "Training Loss: {:.3f}.. ".format(running_loss/len(trainloader)),
              "Test Loss: {:.3f}.. ".format(test_loss/len(testloader)),
              "Test Accuracy: {:.3f}".format(accuracy/len(testloader)))
      
# Plot the training and test loss curves
import matplotlib.pyplot as plt
plt.plot(train_losses, label='Training loss')
plt.plot(test_losses, label='Validation loss')
plt.legend(frameon=False)
plt.show()
All of the code above runs on the CPU. Switching to the GPU is straightforward; only the parts that need to change are shown below.


# The model used in this experiment: fully connected layers only, a very simple structure with nothing special
model = nn.Sequential(nn.Linear(784, 392),
                      nn.ReLU(),
                      nn.Linear(392, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))
model.cuda() # GPU

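# Training (inside the batch loop)
        # Flatten each image into a vector of 784 values
        images = images.view(images.shape[0], -1)

        # Move the batch to the GPU
        images = images.cuda()
        labels = labels.cuda()

    # (the rest of the training loop is unchanged)
    else:

        # Evaluate on the test data
        # Disable gradient tracking to speed up inference
        with torch.no_grad():
            for images, labels in testloader:

                # Flatten the images
                images = images.view(images.shape[0], -1)

                # Move the batch to the GPU
                images = images.cuda()
                labels = labels.cuda()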
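
Calling `.cuda()` raises an error on a machine without a CUDA device. A common alternative is to choose the device once and move the model and every batch to it with `.to(device)`, so the same script runs on either kind of machine. The sketch below illustrates the pattern with a small stand-in model and a random batch, both of which are hypothetical and only there to keep the example self-contained.

```python
import torch
from torch import nn

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in model (hypothetical, smaller than the one in this post)
model = nn.Sequential(nn.Linear(784, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1)).to(device)

# Random tensors standing in for one batch from the DataLoader
images = torch.randn(4, 1, 28, 28)
labels = torch.randint(0, 10, (4,))

# Flatten, then move the batch to the same device as the model
images = images.view(images.shape[0], -1).to(device)
labels = labels.to(device)

with torch.no_grad():
    log_ps = model(images)

print(log_ps.shape, log_ps.device)
```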

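The code above is all that is needed for training. After 30 epochs the loss curves look as shown in the figure below, and the final validation accuracy is 92.8%.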
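
One caveat about the accuracy figure: the training script averages the per-batch accuracies, and the 10,000 test images do not divide evenly by 128, so the last batch holds only 16 images yet counts as much as a full one. The sketch below uses made-up per-batch counts to contrast that average with the exact accuracy. The gap is usually small, but it is not zero.

```python
# Each tuple is (number_correct, batch_size); the counts are hypothetical.
# 78 full batches of 128 plus a final batch of 16 add up to 10,000 images.
batches = [(119, 128)] * 78 + [(10, 16)]

# What the training script reports: the mean of the per-batch accuracies
mean_of_batch_means = sum(c / n for c, n in batches) / len(batches)

# Exact accuracy: total correct predictions over total images
total_correct = sum(c for c, _ in batches)
total_images = sum(n for _, n in batches)
exact_accuracy = total_correct / total_images

print(f"{mean_of_batch_means:.4f} vs {exact_accuracy:.4f}")
```

To report the exact figure in the training script, one option is to accumulate `equals.sum().item()` over the batches and divide by `len(testset)` at the end of the epoch.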

2. Prediction

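After training, I tested the model on a few digits from MNIST and on a few digits I drew myself in a paint program.
The MNIST samples:

The digits I drew myself:

The code used for prediction is below.
predict.py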

import torch
from torch import nn
from torchvision import transforms

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                              ])


model = nn.Sequential(nn.Linear(784, 392),
                      nn.ReLU(),
                      nn.Linear(392, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))



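# Load the weights saved after the last epoch
# map_location='cpu' lets a checkpoint trained on the GPU load on a CPU-only machine
state_dict = torch.load('29.pth', map_location='cpu')
model.load_state_dict(state_dict)


from PIL import Image
import matplotlib.pyplot as plt

# Open the image and convert it to grayscale
image = Image.open('1.png')
gray = image.convert('L')

# Show the image
plt.figure("predict")
plt.imshow(gray)
plt.show()


# Convert to a tensor; the image must be 28x28 so that it flattens to 784 values
tensor = transform(gray)
tensor = tensor.view(1, 784)

# Inference only, so disable gradient tracking (torch.autograd.Variable is no longer needed)
with torch.no_grad():
    outputdata = model(tensor)
ps = torch.exp(outputdata)

top_p, top_class = ps.topk(1, dim=1)

# Print the predicted digit and its probability
print(top_class.item(), top_p.item())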
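
When feeding your own drawings to the model, the input should look like MNIST: 28x28 pixels, with a bright digit on a dark background. ToTensor scales 8-bit pixel values from [0, 255] to [0, 1], and Normalize((0.5,), (0.5,)) then maps them to [-1, 1]. The pure-Python sketch below walks through that arithmetic and shows why a dark-on-light drawing would need to be inverted first. The helper names are hypothetical, and whether inversion is needed depends on how the image was drawn.

```python
def to_model_input(pixel):
    """Mimic ToTensor followed by Normalize((0.5,), (0.5,)) for one 8-bit pixel."""
    scaled = pixel / 255.0          # ToTensor: [0, 255] -> [0, 1]
    return (scaled - 0.5) / 0.5     # Normalize: [0, 1] -> [-1, 1]

def invert(pixel):
    """Swap dark and light so a black-on-white drawing matches MNIST's polarity."""
    return 255 - pixel

# In MNIST the background is black (0) and the strokes are white (255)
print(to_model_input(0), to_model_input(255))      # -1.0 1.0

# A white background in a hand-drawn image lands at +1 unless it is inverted
white_background = 255
print(to_model_input(white_background),
      to_model_input(invert(white_background)))    # 1.0 -1.0
```

With Pillow, the corresponding steps on the grayscale image would be `gray.resize((28, 28))` for the size and `PIL.ImageOps.invert(gray)` for the polarity.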

The results on the MNIST samples are as follows (predicted digit, with its probability in parentheses):

0(0.9987) 1(0.9855) 2(0.9785) 5(0.7746) 9(0.9116)

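The results on my own drawings are below. The fourth digit, a 7, was misclassified as a 2. A purely fully connected network is probably too simple for this.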

1(0.5416) 3(0.4719) 4(0.6279) 2(0.5658) 8(0.7286)

3. Summary

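This post was only a quick first attempt using nothing but fully connected layers, so the results are not very good. I plan to try adding convolutional layers next, which should work better.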
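
As a starting point for that follow-up, here is a rough sketch of what a small convolutional model for 28x28 MNIST images could look like. The layer sizes are illustrative assumptions, not tuned values, and this model has not been trained or evaluated for this post. It ends in LogSoftmax, so it would plug into the same NLLLoss used by the training script.

```python
import torch
from torch import nn

# Hypothetical small CNN; the layer sizes are illustrative, not tuned
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28 -> 16x28x28
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 16x28x28 -> 16x14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # 16x14x14 -> 32x14x14
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 32x14x14 -> 32x7x7
    nn.Flatten(),                                 # 32x7x7 -> 1568
    nn.Linear(32 * 7 * 7, 10),
    nn.LogSoftmax(dim=1),
)

# Convolutions take the image as a 1x28x28 tensor, so the batch is not flattened first
batch = torch.randn(4, 1, 28, 28)
log_ps = cnn(batch)
print(log_ps.shape)
```

Unlike the fully connected model, the images are passed in without the `images.view(images.shape[0], -1)` step, because the convolutional layers need the 2D layout.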
