[Code Study] Classifying CIFAR10 with VGG13

Original

2020-07-03 07:01

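'''1. Load the dataset'''
import tensorflow as tf
from tensorflow.keras import layers, optimizers, datasets, Sequential

# NOTE: preprocess() is the normalization function shown later in this section;
# it must be defined before this block is run.
(x, y), (x_test, y_test) = datasets.cifar10.load_data()  # downloads CIFAR10 automatically on first use; returns NumPy ndarrays
y = tf.squeeze(y, axis=1)                                # drop the size-1 axis: [50000, 1] => [50000]
y_test = tf.squeeze(y_test, axis=1)
print(x.shape, y.shape, x_test.shape, y_test.shape)
# x is (50000, 32, 32, 3), i.e. 50,000 images of 32x32x3; y is (50000,); x_test is (10000, 32, 32, 3); y_test is (10000,)
'''Training set'''
train_db = tf.data.Dataset.from_tensor_slices((x, y))
train_db = train_db.shuffle(1000).map(preprocess).batch(128)  # shuffle, preprocess, batch
'''Test set'''
test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
test_db = test_db.map(preprocess).batch(64)

sample = next(iter(train_db))  # take one batch from the training set and inspect it
print('sample:', sample[0].shape, sample[1].shape,
      tf.reduce_min(sample[0]), tf.reduce_max(sample[0]))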

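In this code, from_tensor_slices slices the given tuples, lists, or tensors along their first dimension, turning the images x and the labels y into a dataset object whose elements are (image, label) pairs. The shape and type of train_db are shown in the figure below. A dataset object created this way usually goes through several processing steps, such as preprocessing, random shuffling, and batching; the functions for these are described next.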
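To make the "slicing" concrete, here is a minimal pure-Python sketch of the idea: the first axis of the features and of the labels is cut into matching pairs. The helper name slice_pairs and the toy data are hypothetical and only for illustration; the real tf.data.Dataset yields tensors lazily instead of building a list.

```python
def slice_pairs(features, labels):
    """Pair features[i] with labels[i] along the first axis (hypothetical helper)."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same first dimension")
    return list(zip(features, labels))


# Toy stand-in for (x, y): three 2x2 single-channel "images" with integer labels.
toy_x = [[[0, 1], [2, 3]], [[4, 5], [6, 7]], [[8, 9], [10, 11]]]
toy_y = [6, 9, 4]

pairs = slice_pairs(toy_x, toy_y)
for image, label in pairs:
    print(label, image)
```

On the real dataset, print(train_db.element_spec) reports the shape and dtype of each component of an element.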

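train_db.shuffle(1000) shuffles the samples randomly so that they are not always presented in the same order, which keeps the model from memorizing that order.
map(preprocess) is a mapping function: it applies preprocess to every element produced by the previous step (the shuffled dataset). The preprocessing logic is up to you; here it is the following code, which normalizes the data.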

def preprocess(x, y):
    # scale pixel values from [0, 255] to [-1, 1]
    x = 2*tf.cast(x, dtype=tf.float32) / 255.-1
    y = tf.cast(y, dtype=tf.int32)
    return x,y
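A quick way to see what this normalization does is to apply the same arithmetic to a few pixel values by hand. The sketch below is plain Python and does not use TensorFlow; normalize_pixel is a hypothetical name that mirrors the expression 2*x/255 - 1 from preprocess.

```python
def normalize_pixel(value):
    """Same arithmetic as preprocess: 2 * v / 255 - 1 (hypothetical helper)."""
    return 2 * float(value) / 255.0 - 1


# The darkest pixel maps to -1, the midpoint to 0, the brightest to 1.
for v in (0, 127.5, 255):
    print(v, '->', normalize_pixel(v))
```

This matches what the sample printout in the loading code should show: a minimum of -1 and a maximum of 1 whenever the batch contains both a 0 and a 255 pixel.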

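batch(128) puts the dataset into batch mode, so that 128 samples are fed to the network at each training step. The right size depends on your hardware. The batch size has to be chosen with some care: if it is too large, each step takes longer, a batch may not fit in memory, and the gradient direction changes very little from step to step; if it is too small, the gradient estimates are noisy and training may fail to converge or end at a lower accuracy. A moderate value is therefore preferable.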
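The shuffle and batch steps described above can be sketched in plain Python. This is only an illustration of the idea, not TensorFlow's actual implementation: shuffle_buffer and batch are hypothetical helpers, and integer indices stand in for the 50,000 training images.

```python
import random


def shuffle_buffer(items, buffer_size, seed=0):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) > buffer_size:
            # Pick a random element from the buffer and emit it.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    current = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:  # the final, possibly smaller, batch
        yield current


indices = range(50000)  # stand-in for the 50,000 training images
batches = list(batch(shuffle_buffer(indices, 1000), 128))
print(len(batches), len(batches[-1]))  # 391 80
```

With a buffer of 1000, each emitted sample can only come from the first thousand or so samples not yet emitted, so the shuffle is local rather than global. A buffer as large as the dataset would give a full shuffle at the cost of more memory.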

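2. Building the network model

VGG13 is used as the network model here, as shown in the figure below.

First comes the convolutional part, which consists of 5 units, each made of 2 convolutional layers followed by 1 pooling layer: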

conv_layers = [ # 5 units of conv + max pooling
    # unit 1
    layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(64, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),

    # unit 2
    layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(128, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),

    # unit 3
    layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(256, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),

    # unit 4
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same'),

    # unit 5
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.Conv2D(512, kernel_size=[3, 3], padding="same", activation=tf.nn.relu),
    layers.MaxPool2D(pool_size=[2, 2], strides=2, padding='same')
]
conv_net = Sequential(conv_layers)
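To see why the output of this convolutional stack can later be reshaped to [b, 512], the spatial size can be traced unit by unit. The following is a small pure-Python sketch under the assumption that a 'same'-padded layer with stride s maps a size n to ceil(n / s); the function names are hypothetical.

```python
import math


def same_pool_out(size, stride=2):
    """Spatial size after a 'same'-padded pooling layer: ceil(size / stride)."""
    return math.ceil(size / stride)


def trace_shapes(size=32, unit_channels=(64, 128, 256, 512, 512)):
    """Return (height, width, channels) after each conv-conv-pool unit."""
    shapes = []
    for channels in unit_channels:
        # The two 3x3 'same' convolutions (stride 1) keep the spatial size;
        # only the 2x2 max pool with stride 2 halves it.
        size = same_pool_out(size)
        shapes.append((size, size, channels))
    return shapes


for unit, shape in enumerate(trace_shapes(), start=1):
    print('unit', unit, '->', shape)
```

A 32x32 input is halved five times (32, 16, 8, 4, 2, 1), so the last unit produces a 1x1 map with 512 channels per image.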

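Next comes the fully connected part: 3 fully connected layers, with ReLU as the activation function of the first two (the last layer outputs raw logits):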

fc_net = Sequential([
        layers.Dense(256, activation=tf.nn.relu),
        layers.Dense(128, activation=tf.nn.relu),
        layers.Dense(10, activation=None),
    ])

Then build both networks; the summary function prints each network's layers and parameter counts.

conv_net.build(input_shape=[None, 32, 32, 3])
fc_net.build(input_shape=[None, 512])
conv_net.summary()
fc_net.summary()
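The totals printed by summary() can be cross-checked by hand. The sketch below counts weights plus biases for each layer from the layer sizes listed above; it is plain Python arithmetic, the helper names are hypothetical, and the figures should agree with summary() only if the layers are built exactly as listed.

```python
def conv_params(in_ch, out_ch, k=3):
    """k x k kernel per input/output channel pair, plus one bias per output channel."""
    return (k * k * in_ch + 1) * out_ch


def dense_params(in_units, out_units):
    """One weight per input/output pair, plus one bias per output unit."""
    return (in_units + 1) * out_units


# Channel counts seen by the 10 convolutions, starting from the 3 RGB channels.
conv_channels = [3, 64, 64, 128, 128, 256, 256, 512, 512, 512, 512]
conv_total = sum(conv_params(i, o) for i, o in zip(conv_channels, conv_channels[1:]))

# Unit counts of the fully connected part, starting from the 512 flattened features.
dense_units = [512, 256, 128, 10]
fc_total = sum(dense_params(i, o) for i, o in zip(dense_units, dense_units[1:]))

print('conv_net parameters:', conv_total)  # 9404992
print('fc_net parameters:', fc_total)      # 165514
```

The pooling layers contribute no parameters, which is why only the convolutions and dense layers appear in the count.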

3. Training the network

optimizer = optimizers.Adam(learning_rate=1e-4)

# [1, 2] + [3, 4] => [1, 2, 3, 4]
variables = conv_net.trainable_variables + fc_net.trainable_variables

for epoch in range(50):

    for step, (x,y) in enumerate(train_db):

        with tf.GradientTape() as tape:
            # [b, 32, 32, 3] => [b, 1, 1, 512]
            out = conv_net(x)
            # flatten, => [b, 512]
            out = tf.reshape(out, [-1, 512])
            # [b, 512] => [b, 10]
            logits = fc_net(out)
            # [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)
            # compute loss
            # loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True)#tf2.0
            loss = tf.keras.backend.categorical_crossentropy(y_onehot, logits, from_logits=True)
            loss = tf.reduce_mean(loss)

        grads = tape.gradient(loss, variables)
        optimizer.apply_gradients(zip(grads, variables))

        if step %100 == 0:
            # print(epoch,step,'loss',float(loss))      # tf2.0
            print(epoch, step, 'loss:', loss)

    total_num = 0
    total_correct = 0
    for x,y in test_db:

        out = conv_net(x)
        out = tf.reshape(out, [-1, 512])
        logits = fc_net(out)
        prob = tf.nn.softmax(logits, axis=1)
        pred = tf.argmax(prob, axis=1)  # index of the class with the highest score
        pred = tf.cast(pred, dtype=tf.int32)

        correct = tf.cast(tf.equal(pred, y), dtype=tf.int32)
        correct = tf.reduce_sum(correct)

        total_num += x.shape[0]
        total_correct += int(correct)

    acc = total_correct / total_num
    print(epoch, 'acc:', acc)
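The loss and accuracy computed in the loop above can be checked by hand on a tiny example. The following pure-Python sketch mirrors the same steps (one-hot encoding, softmax cross-entropy from logits, argmax accuracy) on made-up 3-class logits; all function names and numbers are hypothetical and TensorFlow is not used.

```python
import math


def one_hot(label, depth):
    """Vector of zeros with a single 1.0 at index `label`."""
    return [1.0 if i == label else 0.0 for i in range(depth)]


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(target, logits):
    """-sum(target * log(softmax(logits))), skipping zero targets."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def accuracy(batch_logits, labels):
    """Fraction of samples whose argmax over the logits equals the label."""
    correct = 0
    for logits, label in zip(batch_logits, labels):
        pred = max(range(len(logits)), key=lambda i: logits[i])
        correct += int(pred == label)
    return correct / len(labels)


# Made-up batch of three samples with three classes.
logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0], [1.0, 1.5, 0.3]]
labels = [0, 2, 0]

losses = [cross_entropy_from_logits(one_hot(y, 3), z) for y, z in zip(labels, logits)]
mean_loss = sum(losses) / len(losses)
print(one_hot(4, 10))
print('mean loss:', mean_loss, 'accuracy:', accuracy(logits, labels))
```

Softmax preserves the ordering of the logits, so taking the argmax of the logits gives the same prediction as taking the argmax of the probabilities.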

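The optimizer is Adam with a learning rate of 1e-4; the optimizer decides how each gradient is turned into a parameter update.
variables = conv_net.trainable_variables + fc_net.trainable_variables concatenates the trainable variables of the two sub-networks into one list, so that together they are trained as a single VGG13.
In grads = tape.gradient(loss, variables); optimizer.apply_gradients(zip(grads, variables)), the loss is the cross-entropy loss; tape.gradient computes its gradient with respect to every variable, and apply_gradients uses those gradients to perform the update step.
tf.one_hot() one-hot encodes the labels. For example, y takes the values 0~9 (10 classes), so class 4 becomes 0000100000. This makes the loss computation convenient.
The accuracy acc is easy to understand: every sample in the test set is fed through the network, the predicted class is compared with the true label, each match adds 1 to the count, and the final accuracy is the number of matches divided by the total number of test samples.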
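As a rough illustration of what the optimizer does with each gradient, here is a textbook-style Adam update for a single scalar parameter. This is a sketch under stated assumptions, not Keras's implementation (which differs in small details such as how epsilon is applied): adam_step is a hypothetical helper, the defaults beta1=0.9, beta2=0.999 and eps=1e-7 follow the Keras defaults, and the quadratic objective is made up for the demonstration.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a scalar; returns the new (param, m, v)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for step t (1-based)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Minimize the made-up objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 4):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t)
    print(t, w)
```

In the first few steps each update has a magnitude close to the learning rate regardless of how large the gradient is, which is one reason a small value such as 1e-4 gives steady progress.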

Reference code

https://github.com/dragen1860/Deep-Learning-with-TensorFlow-book/tree/master/ch10
