MXNet Deep Learning Framework, Part 17: Implementing AlexNet with Gluon

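AlexNet is arguably the model that made deep learning take off. It was proposed in 2012 by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, and it won the ImageNet image recognition challenge that year with an 8-layer convolutional neural network. The model is somewhat similar to the classic LeNet-5.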

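This model has some notable characteristics:
1) The network is deeper than LeNet-5, with 5 convolutional layers and 3 fully connected layers.
2) The first convolutional layer uses 11×11 kernels, the second uses 5×5, and the remaining ones use 3×3. In addition, the first, second, and fifth convolutional layers are each followed by a pooling layer with a 3×3 window and a stride of 2.
For the theory, see the original paper: ImageNet Classification with Deep Convolutional Neural Networks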
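To make the kernel and stride choices above concrete, here is a minimal sketch in plain Python that traces the feature-map size through the convolution and pooling stack. It assumes the configuration used in this post's implementation (a 224×224 single-channel input, stride 4 and no padding on the first convolution, padding 2 and 1 on the later ones) and floor rounding, which is the default for Gluon's `Conv2D` and `MaxPool2D`. The names `out_size`, `LAYERS`, and `trace_shapes` are hypothetical helpers, not part of MXNet.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output width of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, out_channels); None means a pooling layer,
# which keeps the channel count unchanged.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, None),
]


def trace_shapes(input_size=224, in_channels=1):
    """Return (name, channels, height, width) after every layer."""
    size, channels = input_size, in_channels
    shapes = []
    for name, kernel, stride, padding, out_channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, channels, size, size))
    return shapes


for name, c, h, w in trace_shapes():
    print("%-6s -> %d x %d x %d" % (name, c, h, w))
```

Under these assumptions the last pooling layer produces a 256×5×5 feature map, so the first fully connected layer sees 6400 inputs after flattening. A different input size or padding would change these numbers.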

Below is the implementation code for AlexNet:

import mxnet.gluon as gn
import mxnet.image as im
import mxnet.autograd as ag
import mxnet.ndarray as nd
import mxnet.initializer as init

'''---Define the model---'''
# The LRN layers in AlexNet bring little benefit and add compute time, so they are omitted
net=gn.nn.Sequential()
with net.name_scope():
    # Stage 1: 11x11 convolution + max pooling
    net.add(gn.nn.Conv2D(channels=96,kernel_size=11,strides=(4,4),activation="relu"))
    net.add(gn.nn.MaxPool2D(pool_size=(3,3),strides=2))
    # Stage 2: 5x5 convolution + max pooling
    net.add(gn.nn.Conv2D(channels=256, kernel_size=5, strides=(1, 1),padding=2,activation="relu"))
    net.add(gn.nn.MaxPool2D(pool_size=(3, 3), strides=2))
    # Stage 3: three 3x3 convolutions + max pooling
    net.add(gn.nn.Conv2D(channels=384, kernel_size=3, strides=(1, 1), padding=1, activation="relu"))
    net.add(gn.nn.Conv2D(channels=384, kernel_size=3, strides=(1, 1), padding=1, activation="relu"))
    net.add(gn.nn.Conv2D(channels=256, kernel_size=3, strides=(1, 1), padding=1, activation="relu"))
    net.add(gn.nn.MaxPool2D(pool_size=(3, 3), strides=2))
    # Stage 4: first fully connected layer with dropout
    net.add(gn.nn.Flatten())
    net.add(gn.nn.Dense(4096,activation="relu"))
    net.add(gn.nn.Dropout(0.5))
    # Stage 5: second fully connected layer with dropout
    net.add(gn.nn.Dense(4096, activation="relu"))
    net.add(gn.nn.Dropout(0.5))
    # Stage 6: output layer
    net.add(gn.nn.Dense(10))  # the real AlexNet has 1000 outputs; Fashion-MNIST has 10 classes

'''---Load and preprocess the data---'''
def load_data_fashion_mnist(batch_size, resize=None):
    transformer = []
    if resize:
        transformer += [gn.data.vision.transforms.Resize(resize)]
    transformer += [gn.data.vision.transforms.ToTensor()]
    transformer = gn.data.vision.transforms.Compose(transformer)
    mnist_train = gn.data.vision.FashionMNIST(train=True)
    mnist_test = gn.data.vision.FashionMNIST(train=False)
    train_iter = gn.data.DataLoader(
        mnist_train.transform_first(transformer), batch_size, shuffle=True)
    test_iter = gn.data.DataLoader(
        mnist_test.transform_first(transformer), batch_size, shuffle=False)
    return train_iter, test_iter

batch_size=32
train_iter,test_iter=load_data_fashion_mnist(batch_size,resize=224)
net.initialize(init=init.Xavier()) # random (Xavier) initialization

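# Softmax and cross-entropy loss
# Computing them separately can be numerically unstable (compare the results in the
# previous two posts), so use the combined API that Gluon provides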
cross_loss=gn.loss.SoftmaxCrossEntropyLoss()

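# Accuracy of one batch: fraction of samples whose argmax matches the label
def accuracy(output,label):
    return nd.mean(output.argmax(axis=1)==label).asscalar()

def evaluate_accuracy(data_iter,net):# accuracy on a dataset, averaged over batches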
    acc=0
    for data,label in data_iter:
        label = label.astype('float32')
        output=net(data)
        acc+=accuracy(output,label)
    return acc/len(data_iter)

# Optimizer: plain SGD with a learning rate of 0.01
train_step=gn.Trainer(net.collect_params(),'sgd',{"learning_rate":0.01})

# Training (the learning rate is the one passed to the Trainer)
epochs=20
for epoch in range(epochs):
    train_loss=0
    train_acc=0
    for image,y in train_iter:
        y = y.astype('float32')
        with ag.record():
            output = net(image)
            loss = cross_loss(output, y)
        loss.backward()
        train_step.step(batch_size)
        train_loss += nd.mean(loss).asscalar()
        train_acc += accuracy(output, y)
    test_acc = evaluate_accuracy(test_iter, net)
    print("Epoch %d, Loss:%f, Train acc:%f, Test acc:%f"
          %(epoch,train_loss/len(train_iter),train_acc/len(train_iter),test_acc))

Because of the limits of my training hardware, this post does not include the training output (training takes too long...).
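The comment in the code about keeping softmax and cross-entropy together can be illustrated with a small standalone sketch. It compares a naive two-step computation with a fused one based on the log-sum-exp trick. This is only an illustration of the general idea in plain Python, not a description of how Gluon's `SoftmaxCrossEntropyLoss` is implemented internally; both function names are hypothetical.

```python
import math


def naive_softmax_cross_entropy(logits, label):
    """Softmax first, then -log(p[label]); breaks for extreme logits."""
    exps = [math.exp(z) for z in logits]   # overflows for large positive z
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[label])         # fails if probs[label] underflows to 0


def fused_softmax_cross_entropy(logits, label):
    """Compute log(sum(exp(z))) - z[label] with the maximum subtracted first."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Moderate logits: both versions agree.
print(naive_softmax_cross_entropy([2.0, 1.0, 0.1], 0))
print(fused_softmax_cross_entropy([2.0, 1.0, 0.1], 0))

# Extreme logits: only the fused version returns a value.
print(fused_softmax_cross_entropy([0.0, -1000.0], 1))
try:
    naive_softmax_cross_entropy([0.0, -1000.0], 1)
except ValueError as err:
    print("naive version failed:", err)
```

Subtracting the largest logit before exponentiating keeps every exponent at or below zero, so nothing overflows, and the loss is read off in log space without ever taking the logarithm of a probability that has underflowed to zero.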
