Training a Classifier

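In early 2019, ApacheCN organized volunteers to translate the PyTorch 1.0 documentation into Chinese (GitHub repository), with official authorization from PyTorch; many readers have probably already seen it on the Chinese documentation site. Proofreading is still short-handed, and contributions are welcome. For some time we have been corresponding by email with Bruce Lin, the relevant contact at PyTorch. When the time is right, we will organize volunteers for other PyTorch-related projects; you are welcome to join and follow us. We hope this series of work proves helpful.

Translator: bat67

Proofreader: FontTian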

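So far, we have seen how to define a network, compute the loss, and update the network's weights. So now you might be wondering,

What about the data?

Generally, when you have to deal with image, text, audio, or video data, you can use standard Python packages to load the data into a NumPy array. You can then convert this array into a torch.*Tensor.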
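As a minimal sketch of that NumPy-to-tensor step, the snippet below uses a random array as a stand-in for an image loaded with Pillow or OpenCV (the array contents and the name image_hwc are assumptions for illustration only). It converts the array with torch.from_numpy and rearranges it into the channels-first float layout used in the rest of this tutorial.

```python
import numpy as np
import torch

# Stand-in for an image loaded with Pillow or OpenCV (assumed data):
# height x width x channels, 8-bit values in [0, 255].
image_hwc = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

# torch.from_numpy wraps the array without copying it.
tensor_hwc = torch.from_numpy(image_hwc)

# Move channels first and scale to floats in [0, 1].
tensor_chw = tensor_hwc.permute(2, 0, 1).float() / 255.0

print(tensor_chw.shape, tensor_chw.dtype)
```

Because torch.from_numpy shares memory with the source array, in-place changes to one are visible in the other; the .float() call here produces a separate copy.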

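For images, packages such as Pillow and OpenCV are useful
For audio, packages such as scipy and librosa are useful
For text, either raw Python or Cython based loading, or NLTK and SpaCy, are useful

Specifically for vision, we have created a package called torchvision. It provides data loaders for common datasets such as Imagenet, CIFAR10, and MNIST, as well as data transformers for images, namely torchvision.datasets and torch.utils.data.DataLoader.

This is a great convenience and avoids writing boilerplate code.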
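To illustrate what a data loader does without downloading anything, here is a small sketch that wraps random tensors in a TensorDataset and batches them with DataLoader. The fake_images and fake_labels tensors are hypothetical stand-ins, not CIFAR10 data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 10 random 3x32x32 "images" with labels 0..9.
fake_images = torch.randn(10, 3, 32, 32)
fake_labels = torch.arange(10)
dataset = TensorDataset(fake_images, fake_labels)

# batch_size=4 over 10 samples gives batches of 4, 4, and 2.
loader = DataLoader(dataset, batch_size=4, shuffle=False)

batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in loader]
for x_shape, y_shape in batch_shapes:
    print(x_shape, y_shape)
```

The last batch is smaller because 10 is not a multiple of 4; passing drop_last=True would discard it instead.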

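In this tutorial, we will use the CIFAR10 dataset. It has ten classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', and 'truck'. The images in CIFAR-10 are of size 3x32x32, i.e. 3-channel color images of 32x32 pixels.

Training an image classifier

We will do the following steps in order:

Load and normalize the CIFAR10 training and test datasets using torchvision
Define a convolutional neural network
Define a loss function
Train the network on the training data
Test the network on the test data

1. Load and normalize CIFAR10

Using torchvision, loading CIFAR10 is extremely easy.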

import torch
import torchvision
import torchvision.transforms as transforms

The output of torchvision datasets are PILImage images in the range [0, 1]. We transform them to tensors normalized to the range [-1, 1].

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

Output:

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
Files already downloaded and verified
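The normalization used above can be checked by hand. The sketch below applies the same arithmetic that Normalize performs per channel, (x - mean) / std with mean 0.5 and std 0.5, to three sample pixel values, and then applies the inverse mapping. The sample values are chosen only for illustration.

```python
import torch

# Three pixel values as ToTensor would produce them: floats in [0, 1].
x = torch.tensor([0.0, 0.5, 1.0])

mean, std = 0.5, 0.5
normalized = (x - mean) / std      # maps [0, 1] onto [-1, 1]
restored = normalized / 2 + 0.5    # inverse mapping back to [0, 1]

print(normalized)
print(restored)
```

So 0 maps to -1, 0.5 maps to 0, and 1 maps to 1, and dividing by 2 and adding 0.5 undoes the transform.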

For fun, let's visualize some of the training images.

import matplotlib.pyplot as plt
import numpy as np

# Function to show an image
def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


# Get some random training images
dataiter = iter(trainloader)
# Use the built-in next(); recent PyTorch releases no longer provide dataiter.next()
images, labels = next(dataiter)

# Show the images
imshow(torchvision.utils.make_grid(images))
# Print the labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

Output:

horse horse horse   car

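2. Define a convolutional neural network

Copy the neural network from the earlier Neural Networks section and modify it to take 3-channel images (instead of the 1-channel images it was originally defined for).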

import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
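The 16 * 5 * 5 input size of fc1 follows from the layer shapes. As a rough check, the sketch below pushes one dummy 3x32x32 input through stand-alone copies of the same convolution and pooling layers and records the shape after each step. The shapes list is a helper introduced here for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)   # one dummy CIFAR10-sized image
conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

shapes = []
x = conv1(x)
shapes.append(tuple(x.shape))   # 5x5 kernel, no padding: 32 - 5 + 1 = 28
x = pool(x)
shapes.append(tuple(x.shape))   # 2x2 pooling halves 28 to 14
x = conv2(x)
shapes.append(tuple(x.shape))   # 14 - 5 + 1 = 10
x = pool(x)
shapes.append(tuple(x.shape))   # 10 halves to 5
x = x.view(-1, 16 * 5 * 5)
shapes.append(tuple(x.shape))   # 16 channels * 5 * 5 = 400 features

for s in shapes:
    print(s)
```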

3. Define a loss function and optimizer

We use a classification cross-entropy loss and stochastic gradient descent (SGD) with momentum.

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
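One detail worth seeing in code: nn.CrossEntropyLoss expects raw, unnormalized scores and integer class indices, which is why the network above ends with a plain linear layer and no softmax. The sketch below, using made-up scores, compares it against log-softmax followed by negative log-likelihood.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up raw scores for 2 samples and 3 classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])   # class indices, not one-hot vectors

loss = nn.CrossEntropyLoss()(logits, targets)

# The same quantity computed in two explicit steps.
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(loss.item(), manual.item())
```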

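4. Train the network

This is when things start to get interesting. We simply loop over our data iterator, feed the inputs to the network, and optimize.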

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Output:

[1,  2000] loss: 2.182
[1,  4000] loss: 1.819
[1,  6000] loss: 1.648
[1,  8000] loss: 1.569
[1, 10000] loss: 1.511
[1, 12000] loss: 1.473
[2,  2000] loss: 1.414
[2,  4000] loss: 1.365
[2,  6000] loss: 1.358
[2,  8000] loss: 1.322
[2, 10000] loss: 1.298
[2, 12000] loss: 1.282
Finished Training
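The call to optimizer.zero_grad() at the start of each step matters because PyTorch accumulates gradients across backward() calls. The following sketch shows this on a single made-up parameter w; it is an illustration of the behavior, not part of the training loop itself.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

(3 * w).sum().backward()
first = w.grad.clone()          # d(3w)/dw = 3

(3 * w).sum().backward()        # no zero_grad() in between
accumulated = w.grad.clone()    # gradients add up: 3 + 3 = 6

optimizer.zero_grad()           # clear the stored gradient
(3 * w).sum().backward()
after_reset = w.grad.clone()    # back to 3

print(first.item(), accumulated.item(), after_reset.item())
```

Without the reset, each mini-batch would update the weights using the sum of all gradients computed so far.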

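5. Test the network on the test data

We have trained the network for 2 passes over the training dataset. But we need to check whether the network has learned anything at all.

We will check this by predicting the class label that the neural network outputs and comparing it against the ground truth. If the prediction is correct, we add the sample to the list of correct predictions.

OK, first step. Let us display some images from the test set to get familiar with them.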

dataiter = iter(testloader)
# Use the built-in next(); recent PyTorch releases no longer provide dataiter.next()
images, labels = next(dataiter)

# Show the images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

Output:

GroundTruth:    cat  ship  ship plane

OK, now let us see what the neural network thinks these examples are:

outputs = net(images)

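The outputs are scores for the 10 classes. The higher the score for a class, the more the network thinks that the image belongs to that class. So, let's get the index of the highest score: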

_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))

Output:

Predicted:    dog  ship  ship plane

The results look pretty good: three of the four predictions match the ground truth.
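For reference, torch.max with a dimension argument returns both the maximum values and their indices, which is why the code above discards the first result with an underscore. A small sketch with made-up scores for 2 samples and 3 classes:

```python
import torch

# Made-up scores: 2 samples, 3 classes.
outputs = torch.tensor([[0.1, 2.5, -0.3],
                        [1.7, 0.2, 0.9]])

# Reduce over dimension 1 (the class dimension).
values, predicted = torch.max(outputs, 1)

# torch.argmax returns only the indices.
same = torch.argmax(outputs, dim=1)

print(values, predicted, same)
```

Here the first sample is assigned class 1 and the second class 0.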

Let us look at how the network performs on the whole test set.

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

Output:

Accuracy of the network on the 10000 test images: 55 %

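That is much better than chance, which is 10% accuracy (randomly picking one class out of 10). It seems the network has learned something.

So which classes performed well, and which did not?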

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

Output:

Accuracy of plane : 70 %
Accuracy of   car : 70 %
Accuracy of  bird : 28 %
Accuracy of   cat : 25 %
Accuracy of  deer : 37 %
Accuracy of   dog : 60 %
Accuracy of  frog : 66 %
Accuracy of horse : 62 %
Accuracy of  ship : 69 %
Accuracy of truck : 61 %
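The per-class loop above assumes every batch holds exactly 4 samples. As an alternative sketch that does not depend on the batch size, the counts can be computed with torch.bincount. The labels and predicted tensors below are made-up values for a 3-class problem, not results from the network.

```python
import torch

num_classes = 3
# Made-up ground-truth labels and predictions for six samples.
labels = torch.tensor([0, 0, 1, 2, 2, 2])
predicted = torch.tensor([0, 1, 1, 2, 0, 2])

# Number of samples per class.
class_total = torch.bincount(labels, minlength=num_classes)

# Number of correctly classified samples per class.
class_correct = torch.bincount(labels[predicted == labels], minlength=num_classes)

accuracy = class_correct.float() / class_total.float()
print(accuracy)
```

In a real evaluation loop, class_total and class_correct would be summed over all batches before dividing. A class with no samples would produce a division by zero, so that case needs separate handling.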

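OK, so what next?

How do we run these neural networks on the GPU?

Training on GPU

Just like how you transfer a tensor onto the GPU, you transfer the neural network onto the GPU.

Let's first define our device as the first visible CUDA device, if CUDA is available: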

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assuming that we are on a CUDA machine, this should print a CUDA device:

print(device)

Output:

cuda:0

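The rest of this section assumes that device is a CUDA device.

Then these methods will recursively go over all modules and convert their parameters and buffers to CUDA tensors: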

net.to(device)

Remember that you will also have to send the inputs and targets to the GPU at every step:

inputs, labels = inputs.to(device), labels.to(device)
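Putting these pieces together, a single training step on the chosen device might look like the sketch below. It assumes a tiny stand-in model and one random batch (both hypothetical, introduced here so the snippet runs on its own); in the tutorial, the model is net and the batch comes from trainloader. It falls back to the CPU when CUDA is not available.

```python
import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Tiny hypothetical model standing in for Net.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# One fake batch; in the tutorial this comes from trainloader.
inputs = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 10, (4,))

# Move the batch to the same device as the model, then do one step.
inputs, labels = inputs.to(device), labels.to(device)
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()

print(outputs.device, loss.item())
```

If the model and the batch live on different devices, the forward pass raises an error, so both moves are needed.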

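Why don't we notice a massive speedup compared to the CPU? Because the network is really small.

Exercise: try increasing the width of your network (argument 2 of the first nn.Conv2d and argument 1 of the second nn.Conv2d need to be the same number) and see what kind of speedup you get.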
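One possible way to approach the exercise is sketched below. The class name WideNet and the channel counts c1=32 and c2=64 are arbitrary choices for illustration; note that c1 appears as the output of the first convolution and the input of the second, and that fc1 must match the new c2.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WideNet(nn.Module):
    """Same layout as Net, with configurable (hypothetical) channel widths."""

    def __init__(self, c1=32, c2=64):
        super().__init__()
        self.c2 = c2
        self.conv1 = nn.Conv2d(3, c1, 5)     # out_channels = c1
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(c1, c2, 5)    # in_channels must equal c1
        self.fc1 = nn.Linear(c2 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, self.c2 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)


wide_net = WideNet()
out = wide_net(torch.randn(2, 3, 32, 32))
print(out.shape)
```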

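Goals achieved:

Understanding PyTorch's Tensor library and neural networks at a high level
Training a small neural network to classify images

Training on multiple GPUs

If you want an even bigger speedup using all of your GPUs, please check out Optional: Data Parallelism.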
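As a brief preview of that topic, the sketch below wraps a model in nn.DataParallel when more than one GPU is visible. The small linear model is a hypothetical stand-in for Net, and the snippet simply runs on a single device when fewer than two GPUs are present.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(8, 2)   # hypothetical stand-in for Net

# Wrap the model only when several GPUs are available.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
model = model.to(device)

# DataParallel splits the batch along dimension 0 across the GPUs.
batch = torch.randn(16, 8).to(device)
out = model(batch)
print(out.shape)
```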

Where do I go next?

Train neural nets to play video games
Train a state-of-the-art ResNet network on imagenet
Train a face generator using Generative Adversarial Networks
Train a word-level language model using Recurrent LSTM networks
More examples
More tutorials
Discuss PyTorch on the Forums
Chat with other users on Slack