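Today we walk through a project every beginner should try: CIFAR10 classification with PyTorch, that is, training a simple image classifier on the CIFAR10 dataset.
First, a quick look at the CIFAR10 dataset: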
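Dataset: CIFAR-10 and CIFAR-100 are labeled subsets of the 80 Million Tiny Images dataset.
Collected by: Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
Format: 60,000 color images of 32x32 pixels in 10 classes, 6,000 images per class, split into 50,000 training images and 10,000 test images.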
A visual look at the data:
Our goal today is to train a neural network that takes a CIFAR10 image as input and outputs its predicted class (one of the 10 classes).
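I. Overall steps:
Step 1: Load and normalize the CIFAR10 training and test sets with torchvision
Step 2: Define a convolutional neural network (CNN) with PyTorch
Step 3: Define a loss function
Step 4: Train the network on the training set
Step 5: Test the network on the test set
Step 6: Test the network on each class separately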
II. Key points:
1. How to download the data:
Use torchvision.datasets.CIFAR10 to download the data and torch.utils.data.DataLoader to load it in batches.

# download=True fetches the data into root if it is not already there
train_data = torchvision.datasets.CIFAR10(root='./CIFAR10data', train=True,
                                          download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=4,
                                           shuffle=True, num_workers=2)
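2. Define the neural network
The required inheritance:

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

Between the convolutional layers and the fully connected layers, the feature maps must be flattened into a vector.
Each layer is defined first (in __init__) and then used (in forward), in the order conv -> relu -> pool.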
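The flattening step needs to know how many values come out of the last pooling layer. As a rough sketch in plain Python (no PyTorch needed), the arithmetic below follows the layer sizes of the network used in this post: two 5x5 convolutions without padding, each followed by 2x2 max pooling. The helper name conv_out is made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window of the given kernel."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                  # CIFAR10 images are 32x32
size = conv_out(size, kernel=5)            # conv1: nn.Conv2d(3, 6, 5)  -> 28
size = conv_out(size, kernel=2, stride=2)  # pool:  nn.MaxPool2d(2, 2)  -> 14
size = conv_out(size, kernel=5)            # conv2: nn.Conv2d(6, 16, 5) -> 10
size = conv_out(size, kernel=2, stride=2)  # pool again                 -> 5

channels = 16                              # output channels of conv2
flat_features = channels * size * size
print(size, flat_features)                 # 5 400
```

This is where the 16*5*5 in the first fully connected layer and in the view call comes from.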
3. Define the loss function and optimizer:

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
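To make these two objects less of a black box, here is a minimal plain-Python sketch of what they compute for a single sample and a single scalar parameter. It is an illustration, not PyTorch's implementation: the function names cross_entropy and sgd_momentum_step are made up here, and it assumes the default settings (no weight decay, no dampening, no Nesterov momentum). nn.CrossEntropyLoss takes raw scores (logits), applies log-softmax internally, and by default averages the per-sample losses over the batch.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class, for one sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def sgd_momentum_step(param, grad, buf, lr=0.001, momentum=0.9):
    """One update: buf <- momentum * buf + grad, then param <- param - lr * buf."""
    buf = momentum * buf + grad
    param = param - lr * buf
    return param, buf


# A network that scores all 10 classes equally has loss ln(10).
uniform_loss = cross_entropy([0.0] * 10, target=3)
print(round(uniform_loss, 3))  # 2.303

# Two momentum steps on a scalar parameter with a constant gradient of 2.0.
p, buf = 1.0, 0.0
p, buf = sgd_momentum_step(p, grad=2.0, buf=buf)
p, buf = sgd_momentum_step(p, grad=2.0, buf=buf)
print(p, buf)
```

The ln(10) value of about 2.303 is a handy baseline: a 10-class training loss near that number means the network is still close to guessing uniformly.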
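4. Train the network
inputs -> net -> loss -> backward pass -> optimizer step (since PyTorch 0.4, inputs no longer need to be wrapped in Variable)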
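5. Predict and test the network
Feed in the test set and run the forward pass as in training, then count the correct predictions:

correct += (pred == labels).sum().item()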
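6. Per-class test

_, pred = torch.max(outputs, 1)  # index of the highest score = predicted class
c = (pred == labels).squeeze()   # boolean tensor of shape (batch_size,): True where the prediction is correct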
III. Full code:
(1) Import the required packages

# -*- coding: utf-8 -*-
import torch
import torchvision
import torchvision.transforms as transforms
from torch.autograd import Variable  # deprecated since PyTorch 0.4; plain tensors work directly
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
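(2) Load the data and normalize it into the required format
ToTensor: the data are loaded as PIL images and must be converted to tensors (pixel values are scaled to [0, 1])
Normalize: maps the image data from the initial [0, 1] range to [-1, 1]

transform = transforms.Compose([
    transforms.ToTensor(),
    # per-channel (mean, std) for the three RGB channels
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])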
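Normalize subtracts the mean and divides by the standard deviation, channel by channel. A small plain-Python sketch of that arithmetic for a single pixel value, with mean 0.5 and std 0.5 as above (the function names normalize and unnormalize are only for this illustration):

```python
def normalize(x, mean=0.5, std=0.5):
    """Map a pixel value from [0, 1] to [-1, 1]."""
    return (x - mean) / std


def unnormalize(y, mean=0.5, std=0.5):
    """Inverse mapping, back to [0, 1]."""
    return y * std + mean


for pixel in (0.0, 0.25, 0.5, 1.0):
    y = normalize(pixel)
    print(pixel, '->', y, '->', unnormalize(y))
```

The inverse, y * 0.5 + 0.5, is the same as the img / 2 + 0.5 used by the imshow helper in step (4) to bring images back to a displayable range.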
(3) Download the data

# download=True fetches the data into root if it is not already there
train_data = torchvision.datasets.CIFAR10(root='./CIFAR10data', train=True,
                                          download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=4,
                                           shuffle=True, num_workers=2)
test_data = torchvision.datasets.CIFAR10(root='./CIFAR10data', train=False,
                                         download=True, transform=transform)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=4,
                                          shuffle=False, num_workers=2)
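Question: why is shuffle=False for test_loader but shuffle=True for train_loader?
Answer: shuffle randomizes the order of the samples. During training this makes every mini-batch a random draw from the data, which is what stochastic gradient descent relies on. Testing simply runs over all the test data once, so the order does not matter and there is no need to shuffle.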
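The effect can be mimicked without PyTorch. The sketch below is not how DataLoader is implemented; the batches helper is a hypothetical stand-in that only shows the difference in ordering: with shuffling the batch contents change, but every sample is still visited exactly once.

```python
import random


def batches(indices, batch_size, shuffle, seed=None):
    """Split sample indices into mini-batches, optionally in random order."""
    indices = list(indices)
    if shuffle:
        random.Random(seed).shuffle(indices)
    return [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]


data = range(8)
test_batches = batches(data, 4, shuffle=False)
train_batches = batches(data, 4, shuffle=True, seed=0)

print(test_batches)   # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(train_batches)  # same samples, random order

# Shuffling changes the order only: every sample still appears exactly once.
covered = sorted(i for batch in train_batches for i in batch)
print(covered == list(data))  # True
```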
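(4) Display images

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

def imshow(img):
    img = img / 2 + 0.5  # unnormalize from [-1, 1] back to [0, 1]
    npimg = img.numpy()
    # np.transpose: reorder (channels, height, width) to (height, width, channels) for matplotlib
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()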
(5) Define the convolutional neural network model

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 input channels, 6 output channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # 2x2 max pooling, stride 2
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)       # one output per class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)  # flatten into a vector
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        # No ReLU on the last layer: CrossEntropyLoss expects raw scores (logits)
        x = self.fc3(x)
        return x

net = Net()
(6) Define the loss function and optimizer

criterion = nn.CrossEntropyLoss()
# SGD(parameters to optimize, learning rate, momentum)
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

(7) Train the network

for epoch in range(1):
    running_loss = 0.0
    # the second argument 0 sets the starting value of the index i
    for i, data in enumerate(train_loader, 0):
        inputs, targets = data
        optimizer.zero_grad()               # clear gradients from the previous step
        outputs = net(inputs)
        loss = criterion(outputs, targets)  # cross-entropy loss between outputs and targets
        loss.backward()
        optimizer.step()
        # loss is a 0-dim tensor; .item() extracts its value as a Python float.
        # (PyTorch < 0.4 used loss.data[0], which fails on 0-dim tensors in later versions.)
        running_loss += loss.item()
        if i % 2000 == 1999:
            # print every 2000 mini-batches; 1999 because the index starts at 0
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
print('Finished Training')
Output:

'''
[1,  2000] loss: 2.252
[1,  4000] loss: 1.894
[1,  6000] loss: 1.677
[1,  8000] loss: 1.597
'''
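As a quick sanity check on the logging condition i % 2000 == 1999, the following plain-Python sketch counts how many mini-batches one epoch contains with 50,000 training images and batch_size=4, and at which iterations a loss line is printed. The variable names are made up for this illustration.

```python
num_train_images = 50000
batch_size = 4
report_every = 2000

num_batches = num_train_images // batch_size
# i is the 0-based batch index; the printed counter is i + 1
report_steps = [i + 1 for i in range(num_batches)
                if i % report_every == report_every - 1]

print(num_batches)   # 12500
print(report_steps)  # [2000, 4000, 6000, 8000, 10000, 12000]
```

Each printed value is the average loss over the preceding 2000 mini-batches; the final 500 mini-batches of the epoch do not trigger a report.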
(8) Test the network

# Look at one batch first
dataiter = iter(test_loader)
images, labels = next(dataiter)
imshow(torchvision.utils.make_grid(images))
print('GroundTruth:', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

outputs = net(images)
_, pred = torch.max(outputs, 1)  # index of the highest score per image
print('Predicted: ', ' '.join('%5s' % classes[pred[j]] for j in range(4)))

# Then measure accuracy over the whole test set
correct = 0
total = 0
with torch.no_grad():  # no gradients needed for evaluation
    for images, labels in test_loader:
        outputs = net(images)
        _, pred = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (pred == labels).sum().item()
print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / total))
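The prediction and accuracy logic can be traced by hand on a tiny made-up batch. The sketch below uses plain Python lists instead of tensors; the scores and labels are invented for illustration, and argmax is a hypothetical helper standing in for the index part of torch.max(outputs, 1).

```python
def argmax(scores):
    """Index of the largest score, i.e. the predicted class."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Four samples, three classes (invented numbers)
outputs = [
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 0.9],
    [0.4, 0.3, 0.2],
]
labels = [1, 0, 1, 0]

pred = [argmax(row) for row in outputs]
correct = sum(p == t for p, t in zip(pred, labels))
accuracy = 100 * correct / len(labels)

print(pred)      # [1, 0, 2, 0]
print(accuracy)  # 75.0
```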
(9) Analyze the results: which classes are classified well and which are not

class_correct = [0] * 10
class_total = [0] * 10
with torch.no_grad():
    for images, labels in test_loader:
        outputs = net(images)
        _, pred = torch.max(outputs, 1)
        c = (pred == labels).squeeze()  # shape (batch_size,): True where the prediction is correct
        for i in range(labels.size(0)):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (classes[i], 100 * class_correct[i] / class_total[i]))
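The per-class bookkeeping is easier to follow on a toy example. This plain-Python sketch uses two invented classes and five invented predictions; it mirrors the tallying above but is not tied to the CIFAR10 data.

```python
classes = ('cat', 'dog')
labels = [0, 0, 1, 1, 1]  # true classes
pred = [0, 1, 1, 1, 0]    # predicted classes

class_correct = [0] * len(classes)
class_total = [0] * len(classes)
for t, p in zip(labels, pred):
    class_correct[t] += int(p == t)  # count a hit under the true class
    class_total[t] += 1

per_class = {name: 100 * class_correct[k] / class_total[k]
             for k, name in enumerate(classes)}
for name, acc in per_class.items():
    print('Accuracy of %5s : %2d %%' % (name, acc))
```

Hits are counted under the true label, so each number answers the question "of all real cats, how many were recognized as cats?".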
That wraps up this small project. After reading, be sure to run it yourself and see what results you get!