Deep Learning with PyTorch: Training a Classifier

西门骁

2023-12-01

Deep Learning with PyTorch: A 60 Minute Blitz

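Training a Classifier

We have seen how to define a neural network, compute the loss, and update the network's weights. Now you might be wondering:

What about the data?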

Generally, when you work with image, text, audio, or video data, you can use standard Python packages to load the data into a numpy array, and then convert that array into a torch.*Tensor.
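The conversion step described above can be sketched as follows. This is a minimal illustration, assuming the data has already been loaded into a NumPy array; the array here is random stand-in data rather than a real image, and the variable names are hypothetical.

```python
import numpy as np
import torch

# Stand-in for an image loaded by some standard Python package:
# height x width x channels, 8-bit values (random data, illustration only).
rng = np.random.default_rng(0)
image_hwc = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# torch.from_numpy shares memory with the array instead of copying it.
tensor_hwc = torch.from_numpy(image_hwc)

# Convolution layers expect channels first, so reorder to C x H x W
# and scale to floating point values in [0, 1].
tensor_chw = tensor_hwc.permute(2, 0, 1).float() / 255.0

print(tensor_chw.shape)  # torch.Size([3, 32, 32])
print(tensor_chw.dtype)  # torch.float32
```

For images, torchvision's ToTensor transform performs a comparable reordering and scaling, so this manual version is mainly useful for data that torchvision does not cover.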

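Specifically for vision, we have created a package called torchvision. It can directly load common datasets such as Imagenet, CIFAR10, and MNIST, and it includes a number of frequently used data transformers. This is a huge convenience and avoids writing boilerplate code.

In this tutorial we will use the CIFAR10 dataset. It has ten classes: 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'. The images in CIFAR10 have 3 color channels and are 32x32 pixels in size.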

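Training an image classifier

We will do the following steps in order:

1. Load and normalize the CIFAR10 training and test datasets using torchvision

2. Define a convolutional neural network (CNN)

3. Define a loss function

4. Train the network on the training data

5. Test the network on the test data

1. Load and normalize CIFAR10

With torchvision, loading CIFAR10 is so easy (Mom never has to worry about my studies again…)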

import torch
import torchvision
import torchvision.transforms as tfs


# The output of torchvision datasets are PILImage images in the range [0, 1].
# We transform them into Tensors normalized to the range [-1, 1].

transform = tfs.Compose([tfs.ToTensor(),
                         tfs.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)

testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat', 'deer',
          'dog', 'frog', 'horse', 'ship', 'truck')
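To see why a mean of 0.5 and a standard deviation of 0.5 map the range [0, 1] onto [-1, 1], the per-channel formula (x - mean) / std can be checked by hand. The sketch below uses plain Python floats instead of tensors, and the two helper names are hypothetical, introduced only for this illustration.

```python
def normalize(x, mean=0.5, std=0.5):
    """Per-channel formula applied by Normalize: (x - mean) / std."""
    return (x - mean) / std


def unnormalize(y, mean=0.5, std=0.5):
    """Inverse mapping: y * std + mean, which is y / 2 + 0.5 for these values."""
    return y * std + mean


# Walk a few pixel intensities through the mapping and back again.
for pixel in (0.0, 0.25, 0.5, 1.0):
    y = normalize(pixel)
    print(pixel, '->', y, '->', unnormalize(y))
```

The endpoints 0 and 1 land on -1 and 1, and the inverse mapping recovers the original intensity, which is what is needed before an image can be displayed.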

Let us look at some of the training images.

import matplotlib.pyplot as plt
import numpy as np

# functions to show an image


def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()


# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

2. Define a convolutional neural network

import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5) # 3 input image channels, 6 output channels, 5x5 square convolution kernel
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1   = nn.Linear(16*5*5, 120) # an affine operation: y = Wx + b
        self.fc2   = nn.Linear(120, 84)
        self.fc3   = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2)) # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv2(x)), 2) # If the window is square you can specify just a single number
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:] # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
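The input size of the first fully connected layer, 16*5*5, follows from how each layer shrinks a 32x32 image. The sketch below traces the spatial size through the network using the standard output-size formula for unpadded convolution and pooling; the helper names are hypothetical, and the layer parameters are read off the network definition above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output width of max pooling whose stride equals its kernel size."""
    return (size - kernel) // kernel + 1


size = 32                 # CIFAR10 images are 32x32
size = conv_out(size, 5)  # conv1, 5x5 kernel: 28
size = pool_out(size, 2)  # 2x2 max pooling:   14
size = conv_out(size, 5)  # conv2, 5x5 kernel: 10
size = pool_out(size, 2)  # 2x2 max pooling:   5

# conv2 has 16 output channels, so the flattened feature vector has 16*5*5 entries.
print(16 * size * size)  # 400
```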

3. Define a loss function and an optimizer

We use classification cross-entropy loss and SGD with momentum.

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
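For a single sample, the cross-entropy loss is the negative log of the softmax probability assigned to the correct class. The following sketch computes that quantity in plain Python to show the arithmetic; it is a hand-written illustration with made-up scores, not the library's implementation, and the function name is hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# A network that scores all ten classes equally is maximally unsure.
uniform = cross_entropy([0.0] * 10, target=3)
print(round(uniform, 3))  # 2.303, which is ln(10)

# A confident and correct prediction gives a loss close to zero.
confident = cross_entropy([0.0, 5.0, 0.0], target=1)
print(confident < 0.02)  # True
```

The value ln(10) is a useful reference point: a ten-class loss near 2.3 means the network is doing no better than guessing uniformly.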

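4. Train the network

This is where things start to get interesting. We simply loop over the data again and again, feeding the inputs to the network and optimizing.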

for epoch in range(2):  # Loop over the data set multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %d] loss: %.3f' % 
                 (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training.')
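What one optimizer step does to a parameter can be sketched on a toy problem. The code below applies the momentum update documented for SGD (buffer = momentum * buffer + gradient, then parameter -= lr * buffer, with no dampening and no Nesterov term) to a single scalar. The function name and the objective f(w) = (w - 3)^2 are assumptions made for this illustration, and the learning rate of 0.01 is chosen for quick convergence rather than taken from the loop above.

```python
def sgd_momentum_step(param, grad, buf, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    buf = momentum * buf + grad  # running, decaying sum of past gradients
    param = param - lr * buf     # move against the accumulated direction
    return param, buf


# Toy objective: f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, buf = 0.0, 0.0
for step in range(2000):
    grad = 2.0 * (w - 3.0)
    w, buf = sgd_momentum_step(w, grad, buf)

print(round(w, 4))  # 3.0
```

In the training loop, the gradient comes from loss.backward() rather than a closed-form expression, and the same kind of update is applied to every weight in the network.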

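5. Test the network on the test data

We have trained the network for two passes over the training data, but we need to check whether it has learned anything at all.

We check this by comparing the class label the network predicts against the ground truth. If the prediction is correct, we add the sample to the list of correct predictions.

First, let us display some images from the test set.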

dataiter = iter(testloader)
images, labels = next(dataiter)

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))


# GroundTruth:    cat  ship  ship plane

Now let us see what the neural network thinks the examples above are.

outputs = net(images)

# the outputs are energies for the 10 classes.
# The higher the energy for a class, the more the network
# thinks that the image is of the particular class

# So, let's get the index of the highest energy
_, predicted = torch.max(outputs.data, 1)
print('predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))

# Output of the training loop and the prediction step
[1, 2000] loss: 2.195
[1, 4000] loss: 1.789
[1, 6000] loss: 1.633
[1, 8000] loss: 1.534
[1, 10000] loss: 1.511
[1, 12000] loss: 1.433
[2, 2000] loss: 1.387
[2, 4000] loss: 1.368
[2, 6000] loss: 1.338
[2, 8000] loss: 1.307
[2, 10000] loss: 1.273
[2, 12000] loss: 1.281
Finished Training.
predicted:  horse  bird plane truck
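The "index of the highest energy" step can be looked at in isolation on a tiny hand-written tensor. The scores below are made up for illustration (2 images, 3 classes). When given a dimension, torch.max returns a pair of values and indices, and the indices are the predicted class labels.

```python
import torch

# Hypothetical scores for a batch of 2 images and 3 classes.
outputs = torch.tensor([[0.1, 2.5, -1.0],
                        [1.7, 0.3, 0.9]])

# Reduce over dimension 1 (the class dimension), one result per image.
values, indices = torch.max(outputs, 1)
print(indices.tolist())  # [1, 0]

# argmax returns only the indices and gives the same predictions.
same = outputs.argmax(dim=1)
print(torch.equal(indices, same))  # True
```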

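The loss falls steadily over both epochs, but none of the four predictions printed above (horse, bird, plane, truck) matches the ground truth (cat, ship, ship, plane), and four samples are too few to judge the network.
Let us look at how the network performs on the whole test set.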

correct = 0
total = 0
# gradients are not needed for evaluation
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%'
      % (100 * correct / total))

The result is better than chance: picking one of the ten classes at random would give an accuracy of only about 10%.
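The 10% chance level can be checked with a quick simulation. This is a sketch under the assumption that labels are spread uniformly over the ten classes; the labels and guesses are randomly generated and are not CIFAR10 data.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

num_classes = 10
num_samples = 10000

# Uniformly random "true" labels and uniformly random guesses.
labels = [random.randrange(num_classes) for _ in range(num_samples)]
guesses = [random.randrange(num_classes) for _ in range(num_samples)]

accuracy = sum(g == l for g, l in zip(guesses, labels)) / num_samples
print('Accuracy of random guessing: %.1f %%' % (100 * accuracy))
```

The printed value lands close to 10%, so any classifier that scores clearly above that has learned something from the data.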

So which classes did it perform well on, and which did it perform poorly on?

class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
# gradients are not needed for evaluation
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):  # assumes every batch holds exactly 4 samples
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' %
          (classes[i], 100 * class_correct[i] / class_total[i]))
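The per-class bookkeeping above hard-codes a batch size of 4. A variant that works for any batch size pairs each label with its prediction directly. The sketch below shows the counting logic on plain Python lists with made-up labels; the function name is hypothetical, and with tensors one would first convert them with .tolist().

```python
def per_class_accuracy(labels, predicted, num_classes):
    """Return the accuracy for each class, or None for classes never seen."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for label, pred in zip(labels, predicted):
        total[label] += 1
        if label == pred:
            correct[label] += 1
    # Avoid dividing by zero for classes that have no samples.
    return [c / t if t else None for c, t in zip(correct, total)]


# Made-up example: 6 samples, 4 possible classes, class 3 never appears.
labels = [0, 0, 1, 2, 2, 2]
predicted = [0, 1, 1, 2, 2, 0]
result = per_class_accuracy(labels, predicted, num_classes=4)
print(result)
```

Here class 0 is right in 1 of 2 cases, class 1 in 1 of 1, class 2 in 2 of 3, and class 3 has no samples at all.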