Applying LeNet to Handwritten Digit Recognition (Paddle)

A note before we begin: I am "虐貓人薛定諤i", a post-00s developer who is not content with the status quo and has dreams and ambitions.
This blog records and shares the knowledge I have accumulated; follow it to get updates as soon as they are published.
Never forget why you started, and you can accomplish your mission.



Introduction to LeNet

LeNet is one of the earliest convolutional neural networks. Its architecture is shown in the figure below.
[Figure: the LeNet network architecture]
As the figure shows, the network contains three convolutional layers, two pooling layers, and two fully connected layers. LeNet extracts image features by alternating convolutional and pooling layers, and then classifies the image with the fully connected layers.
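To see where the fully connected layer's input dimension of 120 comes from, we can trace the spatial size of a 28x28 MNIST image through the layers (a small sketch; stride 1 and no padding are assumed, matching the layer configuration used in the code in this post):

```python
# Trace the feature-map size of a 28x28 input through LeNet's layers.
def conv_out(size, kernel):      # valid convolution, stride 1, no padding
    return size - kernel + 1

def pool_out(size):              # 2x2 pooling with stride 2
    return size // 2

s = 28
s = conv_out(s, 5)   # conv1, 5x5 kernel -> 24x24
s = pool_out(s)      # pool1            -> 12x12
s = conv_out(s, 5)   # conv2, 5x5 kernel -> 8x8
s = pool_out(s)      # pool2            -> 4x4
s = conv_out(s, 4)   # conv3, 4x4 kernel -> 1x1
print(s)             # 1: conv3 outputs 120 feature maps of size 1x1,
                     # so flattening gives the 120-dim vector that fc1 expects
```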

Code

The implementation of the LeNet network is as follows:

class LeNet(fluid.dygraph.Layer):
    def __init__(self, name_scope, num_classes=1):
        super(LeNet, self).__init__(name_scope)

        # conv1: 1 input channel (grayscale), 6 filters of 5x5 -> 6 x 24 x 24
        self.conv1 = Conv2D(num_channels=1,
                            num_filters=6,
                            filter_size=5,
                            act='sigmoid')
        # 2x2 max pooling halves each spatial dimension -> 6 x 12 x 12
        self.pool1 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        # conv2: 16 filters of 5x5 -> 16 x 8 x 8
        self.conv2 = Conv2D(num_channels=6,
                            num_filters=16,
                            filter_size=5,
                            act='sigmoid')
        # pool2 -> 16 x 4 x 4
        self.pool2 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        # conv3: 120 filters of 4x4 collapse each map to 1x1 -> 120 x 1 x 1
        self.conv3 = Conv2D(num_channels=16,
                            num_filters=120,
                            filter_size=4,
                            act='sigmoid')
        self.fc1 = Linear(input_dim=120, output_dim=64, act='sigmoid')
        self.fc2 = Linear(input_dim=64, output_dim=num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)
        x = self.conv3(x)
        # flatten the 120 x 1 x 1 feature maps into a 120-dim vector per sample
        x = fluid.layers.reshape(x, [x.shape[0], -1])
        x = self.fc1(x)
        x = self.fc2(x)
        return x
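`Pool2D(pool_size=2, pool_stride=2, pool_type='max')` keeps the maximum of each non-overlapping 2x2 window, halving the height and width. A minimal NumPy sketch of this operation (`maxpool2x2` is a hypothetical helper for illustration, not part of Paddle):

```python
import numpy as np

def maxpool2x2(x):
    """2x2 max pooling with stride 2 on an array of shape (H, W)."""
    h, w = x.shape
    # group pixels into non-overlapping 2x2 blocks, then take the max of each
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 2, 3],
              [4, 5, 6, 7]], dtype=np.float32)
print(maxpool2x2(x))
# [[4. 8.]
#  [9. 7.]]
```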

The complete code is as follows:

import paddle
import paddle.fluid as fluid
import numpy as np
from paddle.fluid.dygraph.nn import Conv2D, Pool2D, Linear
"""
LeNet在手寫數字識別上的應用
"""


class LeNet(fluid.dygraph.Layer):
    def __init__(self, name_scope, num_classes=1):
        super(LeNet, self).__init__(name_scope)

        self.conv1 = Conv2D(num_channels=1,
                            num_filters=6,
                            filter_size=5,
                            act='sigmoid')
        self.pool1 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        self.conv2 = Conv2D(num_channels=6,
                            num_filters=16,
                            filter_size=5,
                            act='sigmoid')
        self.pool2 = Pool2D(pool_size=2, pool_stride=2, pool_type='max')
        self.conv3 = Conv2D(num_channels=16,
                            num_filters=120,
                            filter_size=4,
                            act='sigmoid')
        self.fc1 = Linear(input_dim=120, output_dim=64, act='sigmoid')
        self.fc2 = Linear(input_dim=64, output_dim=num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = self.pool2(x)
        x = self.conv3(x)
        x = fluid.layers.reshape(x, [x.shape[0], -1])
        x = self.fc1(x)
        x = self.fc2(x)
        return x


def train(model):
    print("start training...")
    model.train()
    epoch_num = 5
    opt = fluid.optimizer.Momentum(learning_rate=0.001,
                                   momentum=0.9,
                                   parameter_list=model.parameters())
    train_loader = paddle.batch(paddle.dataset.mnist.train(), batch_size=10)
    valid_loader = paddle.batch(paddle.dataset.mnist.test(), batch_size=10)
    for epoch in range(epoch_num):
        for batch_id, data in enumerate(train_loader()):
            x_data = np.array([item[0] for item in data],
                              dtype='float32').reshape(-1, 1, 28, 28)
            y_data = np.array([item[1] for item in data],
                              dtype='int64').reshape(-1, 1)
            img = fluid.dygraph.to_variable(x_data)
            label = fluid.dygraph.to_variable(y_data)

            logits = model(img)
            loss = fluid.layers.softmax_with_cross_entropy(logits, label)
            avg_loss = fluid.layers.mean(loss)
            if batch_id % 1000 == 0:
                print("epoch: {}, bath_id: {}, loss is: {}".format(
                    epoch, batch_id, avg_loss.numpy()))
            avg_loss.backward()
            opt.minimize(avg_loss)
            model.clear_gradients()
        model.eval()
        accuracies = []
        losses = []
        for batch_id, data in enumerate(valid_loader()):
            x_data = np.array([item[0] for item in data],
                              dtype='float32').reshape(-1, 1, 28, 28)
            y_data = np.array([item[1] for item in data],
                              dtype='int64').reshape(-1, 1)
            img = fluid.dygraph.to_variable(x_data)
            label = fluid.dygraph.to_variable(y_data)
            logits = model(img)
            pred = fluid.layers.softmax(logits)
            loss = fluid.layers.softmax_with_cross_entropy(logits, label)
            acc = fluid.layers.accuracy(pred, label)
            accuracies.append(acc.numpy())
            losses.append(loss.numpy())
        print("[validation accuracy/loss: {}/{}]".format(
            np.mean(accuracies), np.mean(losses)))
        model.train()
    fluid.save_dygraph(model.state_dict(), './result/hwdrByLeNet')


if __name__ == '__main__':
    with fluid.dygraph.guard():
        model = LeNet('LeNet', num_classes=10)
        train(model)
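`fluid.layers.softmax_with_cross_entropy` combines the softmax and the cross-entropy loss into one numerically stable operation. A NumPy sketch of what it computes per sample (an illustrative reimplementation, not Paddle's actual kernel):

```python
import numpy as np

def softmax_with_cross_entropy(logits, labels):
    """Per-sample cross-entropy of softmax(logits) against integer labels.

    logits: shape (N, C); labels: shape (N, 1) holding class indices.
    """
    # subtract the row max before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # the loss is the negative log-probability assigned to the true class
    return -log_softmax[np.arange(len(labels)), labels.ravel()].reshape(-1, 1)

logits = np.array([[2.0, 1.0, 0.1]])
labels = np.array([[0]])
print(softmax_with_cross_entropy(logits, labels))  # ~[[0.417]]
```

Averaging this per-sample loss with `fluid.layers.mean`, as the training loop above does, yields the scalar that `backward()` differentiates.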


Results

start training...
epoch: 0, batch_id: 0, loss is: [2.2495162]
epoch: 0, batch_id: 1000, loss is: [2.2928371]
epoch: 0, batch_id: 2000, loss is: [2.3267434]
epoch: 0, batch_id: 3000, loss is: [2.2698295]
epoch: 0, batch_id: 4000, loss is: [2.2489858]
epoch: 0, batch_id: 5000, loss is: [2.312758]
[validation accuracy/loss: 0.45590001344680786/2.215536117553711]
epoch: 1, batch_id: 0, loss is: [2.1956322]
epoch: 1, batch_id: 1000, loss is: [2.063491]
epoch: 1, batch_id: 2000, loss is: [1.9574039]
epoch: 1, batch_id: 3000, loss is: [1.420162]
epoch: 1, batch_id: 4000, loss is: [0.98229045]
epoch: 1, batch_id: 5000, loss is: [1.2404814]
[validation accuracy/loss: 0.776199996471405/0.8473402261734009]
epoch: 2, batch_id: 0, loss is: [0.62948656]
epoch: 2, batch_id: 1000, loss is: [0.49548474]
epoch: 2, batch_id: 2000, loss is: [0.5145985]
epoch: 2, batch_id: 3000, loss is: [0.2760195]
epoch: 2, batch_id: 4000, loss is: [0.36493483]
epoch: 2, batch_id: 5000, loss is: [0.5631878]
[validation accuracy/loss: 0.8793999552726746/0.4475659728050232]
epoch: 3, batch_id: 0, loss is: [0.30772734]
epoch: 3, batch_id: 1000, loss is: [0.2511763]
epoch: 3, batch_id: 2000, loss is: [0.32035473]
epoch: 3, batch_id: 3000, loss is: [0.12164386]
epoch: 3, batch_id: 4000, loss is: [0.20446599]
epoch: 3, batch_id: 5000, loss is: [0.27960077]
[validation accuracy/loss: 0.9111999869346619/0.3133259415626526]
epoch: 4, batch_id: 0, loss is: [0.16361086]
epoch: 4, batch_id: 1000, loss is: [0.15575354]
epoch: 4, batch_id: 2000, loss is: [0.24734934]
epoch: 4, batch_id: 3000, loss is: [0.07145926]
epoch: 4, batch_id: 4000, loss is: [0.14044744]
epoch: 4, batch_id: 5000, loss is: [0.16796467]
[validation accuracy/loss: 0.9281999468803406/0.24646225571632385]

Summary

From the final results, we can see that the handwritten digit recognition accuracy reached 92.8%.

Writing blog posts is not easy for a beginner like me, and given my limited ability and hurried writing, errors and omissions are inevitable; I sincerely welcome criticism and corrections from readers.
If you repost this article, please credit the author and include a link to the original; I would be very grateful.
Name: 虐貓人薛定諤i
Blog: https://blog.csdn.net/Deep___Learning
