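After working through a series of computer vision lecture videos, we return to the original application of this course: classifying images into ten categories. Earlier assignments already covered baseline analyses with kNN and SVM. Here we solve the same task with a convolutional neural network. First, download the training and test sets provided by cs231n.
Download link: http://www.cs.toronto.edu/~kriz/cifar.html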
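The training set contains ten classes with 5,000 images per class, each image 32x32 pixels. The test set contains the same ten classes with 1,000 images per class at the same size. So that classification accuracy can be checked at the end, each image's file name carries its class label, a digit from 0 to 9.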
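The file-name convention can be checked before training. The sketch below assumes the label is the digit before the first underscore, which is how the script further down parses it with `int(fname.split("_")[0])`. The helper names `label_from_filename` and `count_per_class`, and the example file names, are hypothetical and only serve as illustration.

```python
from collections import Counter


def label_from_filename(fname):
    """Return the integer class label encoded before the first underscore."""
    return int(fname.split("_")[0])


def count_per_class(fnames):
    """Count how many file names belong to each class label."""
    return Counter(label_from_filename(f) for f in fnames)


# Hypothetical file names, for illustration only.
example_names = ["0_12.jpg", "0_57.jpg", "3_101.jpg", "9_7.jpg"]
counts = count_per_class(example_names)
print(sorted(counts.items()))
```

Running `count_per_class(os.listdir("./data/train"))` on the real folder would be one way to confirm that every class has the expected number of images.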
Project structure (entries without a file extension are folders):
>assignment
>>data (test and training images)
>>>test
>>>train
>>c.py (training script)
>>model (stores the trained model)
train-test.py
import os
from PIL import Image
import numpy as np
import tensorflow as tf
test = True
test_dir = "./data/test"
model_path = "./model"
# Read images and labels from a folder into numpy arrays
def read_data(test_dir):
    datas = []
    labels = []
    fpaths = []
    for fname in os.listdir(test_dir):
        fpath = os.path.join(test_dir, fname)
        fpaths.append(fpath)
        image = Image.open(fpath)
        # Scale pixel values from [0, 255] to [0, 1]
        data = np.array(image) / 255.0
        # The class label is the number before the first underscore in the file name
        label = int(fname.split("_")[0])
        datas.append(data)
        labels.append(label)
    datas = np.array(datas)
    labels = np.array(labels)
    print("shape of datas: {}\tshape of labels: {}".format(datas.shape, labels.shape))
    return fpaths, datas, labels
fpaths, datas, labels = read_data(test_dir)
num_classes = len(set(labels))
# Placeholders for the input images and their labels
datas_placeholder = tf.placeholder(tf.float32, [None, 32, 32, 3])
labels_placeholder = tf.placeholder(tf.int32, [None])
# Placeholder for the dropout rate
dropout_placeholdr = tf.placeholder(tf.float32)
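# Convolution layer: 20 filters, 5x5 kernel, default "valid" padding (32x32x3 -> 28x28x20)
conv0 = tf.layers.conv2d(datas_placeholder, 20, 5, activation=tf.nn.relu)
# Max pooling: 2x2 window, stride 2 (28x28x20 -> 14x14x20)
pool0 = tf.layers.max_pooling2d(conv0, [2, 2], [2, 2])
# Convolution layer: 40 filters, 4x4 kernel (14x14x20 -> 11x11x40)
conv1 = tf.layers.conv2d(pool0, 40, 4, activation=tf.nn.relu)
# Max pooling: 2x2 window, stride 2 (11x11x40 -> 5x5x40)
pool1 = tf.layers.max_pooling2d(conv1, [2, 2], [2, 2])
# Flatten the 3-D feature map into a 1-D vector (5 * 5 * 40 = 1000 values)
flatten = tf.layers.flatten(pool1)
# Fully connected layer
fc = tf.layers.dense(flatten, 400, activation=tf.nn.relu)
# Dropout layer. Note: tf.layers.dropout only drops units when training=True;
# with the default training=False, as called here, it passes fc through unchanged.
dropout_fc = tf.layers.dropout(fc, dropout_placeholdr)
# Output layer
logits = tf.layers.dense(dropout_fc, num_classes)
predicted_labels = tf.argmax(logits, 1)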
# Define the loss as softmax cross-entropy
losses = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.one_hot(labels_placeholder, num_classes),
    logits=logits
)
# Mean loss over the batch
mean_loss = tf.reduce_mean(losses)
# Define the optimizer and the loss it should minimize
optimizer = tf.train.AdamOptimizer(learning_rate=1e-2).minimize(mean_loss)
# Used to save and restore the model
saver = tf.train.Saver()
with tf.Session() as sess:
    if test:
        print("Testing")
        saver.restore(sess, model_path)
        print("Loaded model from {}".format(model_path))
        # Mapping from label id to class name
        label_name_dict = {
            0: "airplane",
            1: "automobile",
            2: "bird",
            3: "cat",
            4: "deer",
            5: "dog",
            6: "frog",
            7: "horse",
            8: "ship",
            9: "truck"
        }
        # Define the inputs and labels; the dropout rate is 0 at test time
        test_feed_dict = {
            datas_placeholder: datas,
            labels_placeholder: labels,
            dropout_placeholdr: 0
        }
        predicted_labels_val = sess.run(predicted_labels, feed_dict=test_feed_dict)
        # Compare the true label with the label predicted by the model
        for fpath, real_label, predicted_label in zip(fpaths, labels, predicted_labels_val):
            # Convert label ids to label names
            real_label_name = label_name_dict[real_label]
            predicted_label_name = label_name_dict[predicted_label]
            print("{}\t{} => {}".format(fpath, real_label_name, predicted_label_name))
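The loop above prints one line per image, which is hard to read for a whole test set. A single accuracy figure can be computed from the same arrays. This is a minimal sketch with NumPy, assuming `labels` and `predicted_labels_val` are integer arrays of equal length as in the script. The function names `accuracy` and `per_class_accuracy` and the sample arrays are hypothetical and not part of the original code.

```python
import numpy as np


def accuracy(true_labels, predicted):
    """Fraction of predictions that match the true labels."""
    true_labels = np.asarray(true_labels)
    predicted = np.asarray(predicted)
    return float(np.mean(true_labels == predicted))


def per_class_accuracy(true_labels, predicted, num_classes):
    """Accuracy for each class that appears in true_labels."""
    true_labels = np.asarray(true_labels)
    predicted = np.asarray(predicted)
    result = {}
    for c in range(num_classes):
        mask = true_labels == c
        if mask.any():  # skip classes with no samples
            result[c] = float(np.mean(predicted[mask] == c))
    return result


# Made-up labels, for illustration only.
sample_true = np.array([0, 0, 1, 1, 2, 2])
sample_pred = np.array([0, 1, 1, 1, 2, 0])
print(accuracy(sample_true, sample_pred))
print(per_class_accuracy(sample_true, sample_pred, 3))
```

In the script, calling `accuracy(labels, predicted_labels_val)` after `sess.run` would give the overall test accuracy, and the per-class variant shows which categories the model confuses most.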