CIFAR-10 classification with SVM

Note: these are notes taken after reading other people's blog posts; the code comes from those posts.


1. CIFAR-10 image classification with a linear SVM

Blog post: "Implementing image classification with SVM (Python)". Code repository for the post: https://github.com/452896915/cs231n_course_homework

1.1 Structure of the CIFAR-10 dataset: http://www.cs.toronto.edu/~kriz/cifar.html

The training set consists of 5 batches of 10,000 images each, and the test set is a single batch of 10,000 images. All images are labelled and have size 32×32×3.
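As a small sketch of how one row of a batch maps to an image: in the Python version of the dataset each row of `data` holds 3072 values, the 1024 red values first, then green, then blue, each plane in row-major order. The snippet below uses a synthetic array in place of a real batch so that it runs without the dataset, and `rows_to_images` is a hypothetical helper name, not something from the blog's code.

```python
import numpy as np

def rows_to_images(rows):
    """Convert (N, 3072) CIFAR-10 style rows to (N, 32, 32, 3) RGB images."""
    rows = np.asarray(rows)
    # Split each row into 3 planes of 32x32, then move the channel axis last.
    return rows.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

# Synthetic stand-in for a batch: 4 "images" with one constant per channel.
fake_rows = np.zeros((4, 3072), dtype=np.uint8)
fake_rows[:, :1024] = 10       # red plane
fake_rows[:, 1024:2048] = 20   # green plane
fake_rows[:, 2048:] = 30       # blue plane

images = rows_to_images(fake_rows)
print(images.shape)       # (4, 32, 32, 3)
print(images[0, 0, 0])    # [10 20 30]
```

Flattening each image back to 3072 values gives the raw-pixel feature vector used in section 1.3.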

1.2 Current results

What test accuracy has been reached on CIFAR-10 so far?

See the leaderboard: http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#43494641522d3130

1.3 Image classification with a linear SVM

A better blog post on linear SVMs: https://blog.csdn.net/red_stone1/article/details/80661133


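  • Principle: each 32×32×3 image is fed directly to the SVM, so the input has 3072 features, one per pixel value. No hand-crafted feature extraction is performed.
  • The linear SVM classifier was written by the blogger. The key points are the hinge loss and the derivation of its gradient, which help in understanding how a multiclass SVM works.
  • Training uses stochastic gradient descent with mini-batches of 200.
  • Before being fed to the SVM, all data have the mean of the 50k training samples subtracted.
  • For this classification problem the metric reported is accuracy rather than training error; here accuracy = 0.35.
  • With 3072 features and 50k training samples there is essentially no overfitting. Training error and test error are about the same, and both are well above human error, so the model is underfitting.
  • A learning curve can be plotted to show how accuracy changes with the number of training samples.
  • Because the model is underfitting, adding more samples will hardly change the result. Possible remedies:
    • more features
    • a more complex model
    • weaker regularisation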
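The hinge loss, its gradient, the mean subtraction and the mini-batch SGD step mentioned above can be sketched as follows. This is a minimal illustration, not the blogger's code: it assumes a margin of 1 and L2 regularisation, and the synthetic data, learning rate and regularisation strength are placeholders chosen only so the snippet runs on its own.

```python
import numpy as np

def svm_loss_and_grad(W, X, y, reg):
    """Multiclass hinge loss (margin 1) with L2 regularisation.

    W: (D, C) weights, X: (N, D) inputs, y: (N,) integer labels.
    """
    n = X.shape[0]
    rows = np.arange(n)
    scores = X @ W                                   # (N, C) class scores
    correct = scores[rows, y][:, None]               # score of the true class
    margins = np.maximum(0.0, scores - correct + 1.0)
    margins[rows, y] = 0.0                           # true class adds no loss
    loss = margins.sum() / n + reg * np.sum(W * W)

    # Each violated margin adds +x to the wrong class column
    # and -x to the true class column.
    mask = (margins > 0).astype(X.dtype)
    mask[rows, y] = -mask.sum(axis=1)
    grad = X.T @ mask / n + 2.0 * reg * W
    return loss, grad

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))            # synthetic stand-in for pixel features
X -= X.mean(axis=0)                        # subtract the training-set mean
true_W = rng.normal(size=(20, 10))         # assumed generator of learnable labels
y = np.argmax(X @ true_W, axis=1)
W = 0.001 * rng.normal(size=(20, 10))

first_loss, _ = svm_loss_and_grad(W, X, y, reg=1e-3)
for step in range(200):
    idx = rng.choice(len(X), size=200, replace=False)   # mini-batch of 200
    _, grad = svm_loss_and_grad(W, X[idx], y[idx], reg=1e-3)
    W -= 1e-2 * grad                                     # plain SGD update
final_loss, _ = svm_loss_and_grad(W, X, y, reg=1e-3)
print(first_loss, final_loss)
```

With weights near zero every wrong class violates the margin by about 1, so the initial loss is close to the number of classes minus one; this is a handy sanity check before training.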

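1.4 CIFAR-10 image classification with HOG + SVM

Blog post: "Classifying the CIFAR-10 dataset with HOG + SVM in Python"

Unlike 1.3, which feeds the raw 32×32×3 image straight into the SVM, this approach first extracts HOG features and then passes them to the SVM.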

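  • The 3-channel colour image is converted to a single-channel grayscale image, and HOG is computed on the grayscale image.
  • The HOG feature has 288 dimensions (3×3 blocks × 2×2 cells × 8 orientation bins). The class label is appended as the last element of each saved row.
  • With 9 orientation bins, HOG on a 32×32 image would instead give 9 blocks × 36 = 324 dimensions.
  • Training accuracy is only 0.50 and test accuracy is 0.49.
    • So there is almost no generalisation gap and the model is still underfitting. It needs richer features, a more complex model such as a neural network, and so on.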
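The two feature lengths above follow from simple arithmetic, sketched here under the assumption of square cells and blocks and a block stride of one cell; `hog_feature_length` is a hypothetical helper used only for illustration.

```python
def hog_feature_length(image_size=32, cell_size=8, cells_per_block=2, orientations=8):
    """Length of a HOG descriptor for a square image with a block stride of one cell."""
    cells_per_side = image_size // cell_size                  # 32 // 8 = 4
    blocks_per_side = cells_per_side - cells_per_block + 1    # 4 - 2 + 1 = 3
    return blocks_per_side ** 2 * cells_per_block ** 2 * orientations

print(hog_feature_length(orientations=8))   # 288
print(hog_feature_length(orientations=9))   # 324
```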

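(1) The earlier features had 3072 dimensions, but they were raw pixels and the model still underfit, so the features were not good enough.

(2) The features now have only 288 dimensions, yet training accuracy rises from 0.36 to 0.50: fewer dimensions give higher accuracy.

(3) For both models the test error is roughly equal to the training error, so the problem is still underfitting.

  • Underfitting calls for a more complex model, a better feature representation, and regularisation that is not too strong.

(4) The 288-dimensional HOG features improve accuracy by 14 percentage points over the 3072-dimensional pixel features, so features matter a great deal. Even so, this accuracy is still far from human-level performance, which is why neural networks are needed.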

 

The data-loading part of the code was modified slightly:

import os
import cv2
import math
import pickle
import time
import numpy as np
import tqdm
from skimage.feature import hog
from sklearn.svm import LinearSVC


class Classifier(object):
    def __init__(self, filePath):
        self.filePath = filePath

    def unpickle(self, file):
        # encoding='bytes' keeps the dictionary keys as bytes (b'data', b'labels', ...).
        with open(file, 'rb') as fo:
            batch = pickle.load(fo, encoding='bytes')
        return batch

    def get_data(self):
        TrainData = []
        TestData = []
        for b in range(1,6):
            f = os.path.join(self.filePath, 'data_batch_%d' % (b, ))
            data = self.unpickle(f)
            train = np.reshape(data[b'data'], (10000, 3, 32 * 32))
            labels = np.reshape(data[b'labels'], (10000, 1))
            fileNames = np.reshape(data[b'filenames'], (10000, 1))
            datalabels = zip(train, labels, fileNames)
            TrainData.extend(datalabels)
        f = os.path.join(self.filePath,'test_batch')
        data = self.unpickle(f)
        test = np.reshape(data[b'data'], (10000, 3, 32 * 32))
        labels = np.reshape(data[b'labels'], (10000, 1))
        fileNames = np.reshape(data[b'filenames'], (10000, 1))
        TestData.extend(zip(test, labels, fileNames))

        '''
        for childDir in os.listdir(self.filePath):
            if 'data_batch' in childDir:
                f = os.path.join(self.filePath, childDir)
                data = self.unpickle(f)
                # train = np.reshape(data[str.encode('data')], (10000, 3, 32 * 32))
                # If your Python version cannot convert str to bytes this way,
                # use a bytes literal (b'data') as below.
                train = np.reshape(data[b'data'], (10000, 3, 32 * 32))
                labels = np.reshape(data[b'labels'], (10000, 1))
                fileNames = np.reshape(data[b'filenames'], (10000, 1))
                datalebels = zip(train, labels, fileNames)
                TrainData.extend(datalebels)
            if childDir == "test_batch":
                f = os.path.join(self.filePath, childDir)
                data = self.unpickle(f)
                test = np.reshape(data[b'data'], (10000, 3, 32 * 32))
                labels = np.reshape(data[b'labels'], (10000, 1))
                fileNames = np.reshape(data[b'filenames'], (10000, 1))
                TestData.extend(zip(test, labels, fileNames))
        '''
        print("data read finished!")
        return TrainData, TestData

    def get_hog_feat(self, image, stride=8, orientations=8, pixels_per_cell=(8, 8), cells_per_block=(2, 2)):
        cx, cy = pixels_per_cell
        bx, by = cells_per_block
        sx, sy = image.shape
        n_cellsx = sx // cx  # number of cells in x
        n_cellsy = sy // cy  # number of cells in y
        n_blocksx = (n_cellsx - bx) + 1
        n_blocksy = (n_cellsy - by) + 1
        gx = np.zeros((sx, sy), dtype=np.float32)
        gy = np.zeros((sx, sy), dtype=np.float32)
        eps = 1e-5
        grad = np.zeros((sx, sy, 2), dtype=np.float32)
        for i in range(1, sx-1):
            for j in range(1, sy-1):
                gx[i, j] = image[i, j-1] - image[i, j+1]
                gy[i, j] = image[i+1, j] - image[i-1, j]
                grad[i, j, 0] = np.arctan(gy[i, j] / (gx[i, j] + eps)) * 180 / math.pi
                if gx[i, j] < 0:
                    grad[i, j, 0] += 180
                grad[i, j, 0] = (grad[i, j, 0] + 360) % 360
                grad[i, j, 1] = np.sqrt(gy[i, j] ** 2 + gx[i, j] ** 2)
        normalised_blocks = np.zeros((n_blocksy, n_blocksx, by * bx * orientations))
        for y in range(n_blocksy):
            for x in range(n_blocksx):
                block = grad[y*stride:y*stride+16, x*stride:x*stride+16]
                hist_block = np.zeros(32, dtype=np.float32)
                eps = 1e-5
                for k in range(by):
                    for m in range(bx):
                        cell = block[k*8:(k+1)*8, m*8:(m+1)*8]
                        hist_cell = np.zeros(8, dtype=np.float32)
                        for i in range(cy):
                            for j in range(cx):
                                n = int(cell[i, j, 0] / 45)
                                hist_cell[n] += cell[i, j, 1]
                        hist_block[(k * bx + m) * orientations:(k * bx + m + 1) * orientations] = hist_cell[:]
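                # Note: the denominator is sqrt((sum of bins)^2 + eps), which is roughly the
                # L1 norm of the block rather than the L2 norm used in standard HOG.
                normalised_blocks[y, x, :] = hist_block / np.sqrt(hist_block.sum() ** 2 + eps)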
        return normalised_blocks.ravel()

    def get_feat(self, TrainData, TestData):
        train_feat = []
        test_feat = []
        for data in tqdm.tqdm(TestData):
            image = np.reshape(data[0].T, (32, 32, 3))
            # CIFAR-10 stores channels in RGB order, so convert with COLOR_RGB2GRAY.
            gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) / 255.
            # Either the hand-written HOG function or the skimage one below can be used;
            # the hand-written one is slower.
            fd = self.get_hog_feat(gray)
            # fd = hog(gray, 9, [8, 8], [2, 2])
            fd = np.concatenate((fd, data[1]))  # append the label as the last element
            test_feat.append(fd)
        test_feat = np.array(test_feat)
        np.save("test_feat.npy", test_feat)
        print("Test features are extracted and saved.")
        for data in tqdm.tqdm(TrainData):
            image = np.reshape(data[0].T, (32, 32, 3))
            gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY) / 255.
            fd = self.get_hog_feat(gray)
            # fd = hog(gray, 9, [8, 8], [2, 2])
            fd = np.concatenate((fd, data[1]))
            train_feat.append(fd)
        train_feat = np.array(train_feat)
        np.save("train_feat.npy", train_feat)
        print("Train features are extracted and saved.")
        return train_feat, test_feat

    def classification(self, train_feat, test_feat):
        t0 = time.time()
        clf = LinearSVC()
        print("Training a Linear SVM Classifier.")
        # The last column of each feature row holds the label.
        clf.fit(train_feat[:, :-1], train_feat[:, -1])
        predict_result = clf.predict(test_feat[:, :-1])
        rate = np.mean(predict_result.astype(int) == test_feat[:, -1].astype(int))
        t1 = time.time()
        print('The testing classification accuracy is %f' % rate)
        print('The time cost of training and testing is: %f' % (t1 - t0))

        predict_result2 = clf.predict(train_feat[:, :-1])
        rate2 = np.mean(predict_result2.astype(int) == train_feat[:, -1].astype(int))
        print('The training classification accuracy is %f' % rate2)

    def run(self):
        if os.path.exists("train_feat.npy") and os.path.exists("test_feat.npy"):
            train_feat = np.load("train_feat.npy")
            test_feat = np.load("test_feat.npy")
        else:
            TrainData, TestData = self.get_data()
            train_feat, test_feat = self.get_feat(TrainData, TestData)
        self.classification(train_feat, test_feat)


if __name__ == '__main__':
    #filePath = r'F:\DataSets\cifar-10-batches-py'
    filePath = r'.\datasets'
    cf = Classifier(filePath)
    cf.run()
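The per-pixel double loop in get_hog_feat is one reason the hand-written extractor is slow. As a rough sketch, the same gradient angle and magnitude can be computed with NumPy slicing instead. `gradient_angle_magnitude` is a hypothetical name and not part of the original code, and the random image is only a stand-in for a real grayscale CIFAR-10 image.

```python
import numpy as np

def gradient_angle_magnitude(image, eps=1e-5):
    """Vectorised per-pixel gradient angle (degrees in [0, 360)) and magnitude.

    Border pixels keep zero gradient, as in the loop version.
    """
    image = np.asarray(image, dtype=np.float32)
    gx = np.zeros_like(image)
    gy = np.zeros_like(image)
    gx[1:-1, 1:-1] = image[1:-1, :-2] - image[1:-1, 2:]   # left minus right neighbour
    gy[1:-1, 1:-1] = image[2:, 1:-1] - image[:-2, 1:-1]   # lower minus upper neighbour
    angle = np.degrees(np.arctan(gy / (gx + eps)))
    angle[gx < 0] += 180                                  # move into the correct half-plane
    angle = np.mod(angle + 360, 360)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    return angle, magnitude

rng = np.random.default_rng(0)
gray = rng.random((32, 32))            # synthetic grayscale image in [0, 1)
angle, magnitude = gradient_angle_magnitude(gray)
print(angle.shape, magnitude.shape)    # (32, 32) (32, 32)
```

The per-cell histograms could then be built from `angle` and `magnitude` without recomputing gradients pixel by pixel.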

 
