Computer Vision Learning — Fei-Fei Li's Stanford cs231n Open Course: Implementing Image Classification

After working through the series of computer vision lecture videos, we come back to the course's original application: classifying images into ten categories. Earlier assignments already covered the baseline analysis with kNN and SVM; here we implement the classifier with a convolutional neural network. First, download the training and test sets used by cs231n.

Download: http://www.cs.toronto.edu/~kriz/cifar.html

The dataset covers ten classes of 32*32 color images: the training set has 5,000 images per class, and the test set has 1,000 images per class at the same size. So that classification accuracy can be checked at the end, every filename is prefixed with its class id (0-9).
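Note that the Python version of the archive at that link ships as pickled batches (data_batch_1 ... data_batch_5 and test_batch inside cifar-10-batches-py), not as individual image files. A minimal sketch for unpacking one batch into the folder layout used below, with the class id as the filename prefix, might look like this; the helper name unpack_batch and the output paths under ./data are illustrative choices made to match the project layout:

import os
import pickle

from PIL import Image


def unpack_batch(batch_file, out_dir, start_index=0):
    # Unpack one CIFAR-10 python batch into PNGs named "<label>_<index>.png"
    os.makedirs(out_dir, exist_ok=True)
    with open(batch_file, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    # Each row holds 3072 bytes; reshape to (N, 32, 32, 3) uint8 images
    images = batch[b"data"].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    labels = batch[b"labels"]
    for i, (img, label) in enumerate(zip(images, labels), start=start_index):
        Image.fromarray(img).save(os.path.join(out_dir, "{}_{}.png".format(label, i)))
    return start_index + len(labels)


# Example: unpack the held-out batch into ./data/test
unpack_batch("./cifar-10-batches-py/test_batch", "./data/test")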

The project layout is as follows (entries without an extension are folders):

>assignment

       >>data (test and training set images)

                >>>test

                >>>train

       >>c.py (training script)

       >>model (holds the trained model)


train-test.py

import os
from PIL import Image
import numpy as np
import tensorflow as tf

test = True

test_dir = "./data/test"
model_path = "./model"



# Read images and labels from a folder into numpy arrays
def read_data(test_dir):
    datas = []
    labels = []
    fpaths = []
    for fname in os.listdir(test_dir):
        fpath = os.path.join(test_dir, fname)
        fpaths.append(fpath)
        image = Image.open(fpath)
        data = np.array(image) / 255.0
        label = int(fname.split("_")[0])
        datas.append(data)
        labels.append(label)

    datas = np.array(datas)
    labels = np.array(labels)

    print("shape of datas: {}\tshape of labels: {}".format(datas.shape, labels.shape))
    return fpaths, datas, labels


fpaths, datas, labels = read_data(test_dir)
num_classes = len(set(labels))

# Placeholders for the input images and their labels
datas_placeholder = tf.placeholder(tf.float32, [None, 32, 32, 3])
labels_placeholder = tf.placeholder(tf.int32, [None])

# Placeholder for the dropout rate
dropout_placeholdr = tf.placeholder(tf.float32)

# Convolution layer: 20 filters of size 5x5, ReLU activation
conv0 = tf.layers.conv2d(datas_placeholder, 20, 5, activation=tf.nn.relu)

# Max pooling: 2x2 window, stride 2
pool0 = tf.layers.max_pooling2d(conv0, [2, 2], [2, 2])

# Convolution layer: 40 filters of size 4x4, ReLU activation
conv1 = tf.layers.conv2d(pool0, 40, 4, activation=tf.nn.relu)

# Max pooling: 2x2 window, stride 2
pool1 = tf.layers.max_pooling2d(conv1, [2, 2], [2, 2])

# Flatten the 3-D feature maps into a 1-D vector
flatten = tf.layers.flatten(pool1)

# Fully connected layer with 400 units
fc = tf.layers.dense(flatten, 400, activation=tf.nn.relu)

# Dropout layer (note: tf.layers.dropout only drops units when its
# training argument is True, so as written it passes inputs through)
dropout_fc = tf.layers.dropout(fc, dropout_placeholdr)

# Output layer: one logit per class
logits = tf.layers.dense(dropout_fc, num_classes)

predicted_labels = tf.argmax(logits, 1)


# Cross-entropy loss for each example
losses = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.one_hot(labels_placeholder, num_classes),
    logits=logits
)
# Mean loss over the batch
mean_loss = tf.reduce_mean(losses)

# Define the optimizer and the loss to minimize
optimizer = tf.train.AdamOptimizer(learning_rate=1e-2).minimize(mean_loss)

# Saver used to save and restore the model
saver = tf.train.Saver()





with tf.Session() as sess:
    if test:
        print("Testing")

        # Restore the trained weights from model_path
        saver.restore(sess, model_path)
        print("Loaded model from {}".format(model_path))
        # Mapping from label id to class name
        label_name_dict = {
            0: "airplane",
            1: "automobile",
            2: "bird",
            3: "cat",
            4: "deer",
            5: "dog",
            6: "frog",
            7: "horse",
            8: "ship",
            9: "truck"
        }

        # Define the inputs and labels (dropout rate 0: nothing is dropped at test time)
        test_feed_dict = {
            datas_placeholder: datas,
            labels_placeholder: labels,
            dropout_placeholdr: 0
        }
        predicted_labels_val = sess.run(predicted_labels, feed_dict=test_feed_dict)

        # Ground-truth label versus model prediction
        for fpath, real_label, predicted_label in zip(fpaths, labels, predicted_labels_val):
            # Convert the label ids to label names
            real_label_name = label_name_dict[real_label]
            predicted_label_name = label_name_dict[predicted_label]
            print("{}\t{} => {}".format(fpath, real_label_name, predicted_label_name))

