Hands-On Autonomous Driving (1): Traffic Sign Recognition

This post looks at how deep learning is applied in autonomous driving.

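This post is adapted from an article by the blog expert AdamShan, to whom I am grateful. Plenty of courses already cover image classification with deep learning (for example, Andrew Ng's Deep Learning Specialization), so this post skips the theory and focuses on practice. The deep learning framework used is keras. For learning keras, I previously reposted an article by Dr. Huang Haiguang, which is a very good learning resource.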

First published: A Deep Learning Primer, "Deep Learning with Python": original code with Chinese annotations, plus the e-book

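Contents

1. Import libraries

2. Data processing (key part)

2.1 Load the dataset

2.2 Display images

2.3 Resize images

2.4 Convert the images to NumPy arrays

2.5 Convert the images to grayscale

2.6 Data augmentation

3. Train the neural network

4. Testing and validation

5. Using the model

6. Summary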


1. Import libraries

First, load the libraries this post needs:

import os
import random

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import scipy
import skimage
import skimage.io         # provides imread
import skimage.transform  # provides resize
from scipy import ndimage
from skimage import color

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.optimizers import RMSprop

# Allow image embedding in the notebook
%matplotlib inline

Note: when the code above is run in Jupyter Notebook, the scipy and skimage libraries conflict. The fix is to install scipy through pip, as follows:

sudo pip3 install scipy

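2. Data processing (key part)

The traffic sign dataset used in this post is BelgiumTS. Unpack it into the same directory as the notebook, so that the following folders exist:

  • data/Training/
  • data/Testing/

Each folder contains 62 classes, numbered 00000 to 00061.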
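
Because each class sits in its own numbered subdirectory, the label is just the directory name parsed as an integer. The sketch below is a small standard-library check of that layout. The helper name `count_images_per_class` is hypothetical (it does not appear in the original article), and it assumes the `.ppm` files sit directly inside each class folder.

```python
import os
from collections import Counter


def count_images_per_class(data_dir, extension=".ppm"):
    """Return a Counter mapping integer class label -> number of image files.

    Hypothetical helper for illustration: assumes every subdirectory of
    data_dir is named with a zero-padded class id such as "00000".
    """
    counts = Counter()
    for name in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files such as a readme
        label = int(name)  # "00061" -> 61
        counts[label] = sum(
            1 for f in os.listdir(class_dir) if f.endswith(extension)
        )
    return counts
```

If the dataset was unpacked as described, `count_images_per_class(os.path.join("data", "Training"))` should contain 62 keys; a different number suggests a missing or misplaced folder.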

2.1 Load the dataset

import skimage.io  # skimage.data.imread is gone from recent scikit-image releases

def load_data(data_dir):
    """Load a dataset and return two lists:

    images: a list of NumPy arrays, each representing an image.
    labels: a list of numbers that represent the image labels.
    """
    # Get all subdirectories of data_dir. Each one represents a label.
    # os.listdir(path) returns the names of the files and directories in path.
    # os.path.isdir() checks whether a path is a directory.
    # os.path.join() combines several path components into one path.
    directories = [d for d in os.listdir(data_dir)
                   if os.path.isdir(os.path.join(data_dir, d))]

    # Loop through the label directories and collect the data in
    # two lists, labels and images.
    labels = []
    images = []
    for d in directories:
        label_dir = os.path.join(data_dir, d)
        file_names = [os.path.join(label_dir, f)
                      for f in os.listdir(label_dir) if f.endswith(".ppm")]

        # For each label, load its images and add them to the images list,
        # and add the label number (i.e. the directory name) to the labels list.
        for f in file_names:
            images.append(skimage.io.imread(f))
            labels.append(int(d))
    return images, labels
   


# Load training and testing datasets.
ROOT_PATH = "data"
train_data_dir = os.path.join(ROOT_PATH, "Training")
test_data_dir = os.path.join(ROOT_PATH, "Testing")

images, labels = load_data(train_data_dir)

2.2 Display images

After loading the dataset, display the first image of each traffic sign class.

def display_images_and_labels(images, labels):
    """Display the first image of each label."""
    unique_labels = set(labels)
    plt.figure(figsize=(15, 15))
    i = 1
    for label in unique_labels:
        # Pick the first image for each label.
        image = images[labels.index(label)]
        plt.subplot(8, 8, i)  # A grid of 8 rows x 8 columns
        plt.axis('off')
        plt.title("Label {0} ({1})".format(label, labels.count(label)))
        i += 1
        _ = plt.imshow(image)
    plt.show()

display_images_and_labels(images, labels)

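The result is a grid with one image per class; each title shows the label and, in parentheses, the number of images in that class.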


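2.3 Resize images

The images in the dataset vary in size. To train the neural network more effectively, all images need to be brought to the same size; here they are resized to (32, 32).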

# Resize images to (32, 32)
images32 = [skimage.transform.resize(image, (32, 32))
                for image in images]
display_images_and_labels(images32, labels)

The resized images now all share the same size.

Print some information about the resized images:

for image in images32[:5]:
    print("shape: {0}, min: {1}, max: {2}".format(image.shape, image.min(), image.max()))
shape: (32, 32, 3), min: 0.03529411764705882, max: 0.996078431372549
shape: (32, 32, 3), min: 0.03395373774509821, max: 0.996078431372549
shape: (32, 32, 3), min: 0.03694182751225482, max: 0.996078431372549
shape: (32, 32, 3), min: 0.06460056678921586, max: 0.9191425398284314
shape: (32, 32, 3), min: 0.060355392156862725, max: 0.9028492647058823
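
The minimum and maximum values printed above lie between 0 and 1 because `skimage.transform.resize` converts 8-bit input to floating point by default, so no separate normalization step is needed before training. The following sketch illustrates this on a synthetic image; the random array is only a stand-in for a real `.ppm` file and is not data from the article.

```python
import numpy as np
from skimage.transform import resize

# Synthetic 8-bit RGB image standing in for a .ppm file (values 0..255).
rng = np.random.default_rng(0)
image_u8 = rng.integers(0, 256, size=(60, 45, 3), dtype=np.uint8)

# With default arguments the output is a float image rescaled to [0, 1].
resized = resize(image_u8, (32, 32))

print(resized.shape)  # spatial size changes, the channel axis is kept
print(resized.dtype)
print(resized.min() >= 0.0, resized.max() <= 1.0)
```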

2.4 Convert the images to NumPy arrays

labels_a = np.array(labels)
images_a = np.array(images32)
print("labels: ", labels_a.shape, "\nimages: ", images_a.shape)

2.5 Convert the images to grayscale

As a preprocessing step, the original three-channel RGB images are converted to grayscale.

images_a = color.rgb2gray(images_a)
display_images_and_labels(images_a, labels)
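
For intuition, the grayscale conversion is a weighted sum over the channel axis, which is why the array shape drops from (N, 32, 32, 3) to (N, 32, 32). The NumPy sketch below reproduces that idea with the luminance weights documented for `skimage.color.rgb2gray`; the function name `to_gray` is hypothetical and the code is an illustration, not the library's implementation.

```python
import numpy as np

# Luminance weights documented for skimage.color.rgb2gray (R, G, B).
RGB_WEIGHTS = np.array([0.2125, 0.7154, 0.0721])


def to_gray(rgb_batch):
    """Collapse the last (channel) axis of an (..., 3) float array."""
    return rgb_batch @ RGB_WEIGHTS


batch = np.zeros((2, 32, 32, 3))
batch[0] = 1.0            # pure white image
batch[1, ..., 1] = 1.0    # pure green image

gray = to_gray(batch)
print(gray.shape)  # (2, 32, 32)
```

The weights sum to 1, so a white pixel stays at 1.0, while a pure green pixel maps to 0.7154.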


2.6 Data augmentation

Here the dataset is expanded to 5 times its original size, and a few of the classes are displayed.

def expend_training_data(train_x, train_y):
    """
    Augment training data
    Expand the dataset to 5x its original size.
    """
    expanded_images = np.zeros([train_x.shape[0] * 5, train_x.shape[1], train_x.shape[2]])
    expanded_labels = np.zeros([train_x.shape[0] * 5])
    counter = 0
    
    for x, y in zip(train_x, train_y):
        # register original data
        expanded_images[counter, :, :] = x
        expanded_labels[counter] = y
        counter = counter + 1

        # get a value for the background
        # zero is the expected value, but median() is used to estimate background's value
        bg_value = np.median(x)  # this is regarded as background's value

        for i in range(4):
            # pick a random rotation angle (unused: the rotation below is commented out)
            angle = np.random.randint(-15, 15, 1)
            #new_img = ndimage.rotate(x, angle, reshape=False, cval=bg_value)

            # shift the image with random distance
            shift = np.random.randint(-2, 2, 2)
            new_img_ = ndimage.shift(x, shift, cval=bg_value)

            # register new training data
            expanded_images[counter, :, :] = new_img_
            expanded_labels[counter] = y
            counter = counter + 1

    return expanded_images, expanded_labels

images_a, labels_a = expend_training_data(images_a, labels_a)
print(images_a.shape, labels_a.shape)
labels = labels_a.tolist()
print(len(labels))

def plot_agument(images, labels):
    plt.figure(figsize=(16, 9))
    unique_labels = set(labels)
    i = 1
    for label in unique_labels:
        # Show only the first three labels.
        if i > 3:
            break
        # Index of the first image with this label; the next four entries
        # are its augmented copies.
        img_index = labels.index(label)
        for j in range(5):
            image = images[img_index+j]  # use the argument, not the global images_a
            plt.subplot(3, 5, (i-1)*5 + j+1)  # A grid of 3 rows x 5 columns
            plt.axis('off')
            plt.title("Label {0} ({1})".format(label, labels.count(label)))
            _ = plt.imshow(image, cmap='gray')
        i += 1

plot_agument(images_a, labels)
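
Since the rotation is commented out, each augmented copy is just the original image translated by a small offset, with the exposed border filled by the median pixel value. Note that `np.random.randint(-2, 2, 2)` draws offsets from -2 to 1, because the upper bound is exclusive. The sketch below shows the same idea for whole-pixel offsets in plain NumPy; `shift_with_fill` is a hypothetical stand-in for illustration, not SciPy's `ndimage.shift`, which also handles fractional shifts through interpolation.

```python
import numpy as np


def shift_with_fill(image, dy, dx, fill):
    """Shift a 2-D image by whole pixels, filling exposed borders with `fill`.

    Positive dy moves the content down, positive dx moves it right.
    """
    height, width = image.shape
    out = np.full_like(image, fill)
    # Source and destination windows that still overlap after the shift.
    src_rows = slice(max(0, -dy), min(height, height - dy))
    dst_rows = slice(max(0, dy), min(height, height + dy))
    src_cols = slice(max(0, -dx), min(width, width - dx))
    dst_cols = slice(max(0, dx), min(width, width + dx))
    out[dst_rows, dst_cols] = image[src_rows, src_cols]
    return out


image = np.arange(16, dtype=float).reshape(4, 4)
fill = np.median(image)  # same background estimate as in the article
shifted = shift_with_fill(image, dy=1, dx=-1, fill=fill)
print(shifted)
```

In the printed result the top row and the rightmost column hold the fill value, and the rest is the original content moved one pixel down and one pixel left.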


3. Train the neural network

The network is built with keras, using the Sequential model.

from sklearn.utils import shuffle

indx = np.arange(0, len(labels_a))
indx = shuffle(indx)
images_a = images_a[indx]
labels_a = labels_a[indx]

# The total number of samples is 22875
train_x, val_x = images_a[:20000], images_a[20000:]
train_y, val_y = labels_a[:20000], labels_a[20000:]

train_y = keras.utils.to_categorical(train_y, 62)
val_y = keras.utils.to_categorical(val_y, 62)

# Build the neural network
model = Sequential()
model.add(Flatten(input_shape=(32, 32)))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(62, activation='softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
                  optimizer=RMSprop(),
                  metrics=['accuracy'])

history = model.fit(train_x, train_y,
                    batch_size=128,
                    epochs=20,
                    verbose=1,
                    validation_data=(val_x, val_y))

### print the keys contained in the history object
#print(history.history.keys())
model.save('model.h5')  # model.save stores the whole model in HDF5 format, not JSON
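
As a sanity check on the output of `model.summary()`, the number of trainable parameters in this fully connected network can be worked out by hand: each Dense layer has one weight per input-output pair plus one bias per unit, while Flatten and Dropout add none. The short calculation below is only arithmetic on the layer sizes given above; `dense_params` is a hypothetical helper name.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units


# Flatten output (32 * 32 pixels), two hidden layers, softmax output layer.
layer_sizes = [32 * 32, 512, 512, 62]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [524800, 262656, 31806]
print(sum(per_layer))  # 819262
```

Almost two thirds of the parameters sit in the first layer, which connects every pixel to every hidden unit.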

The training loss and validation loss of the network are plotted as follows:

def plot_training(history):
    ### plot the training and validation loss for each epoch
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model categorical cross-entropy loss')
    plt.ylabel('categorical cross-entropy loss')
    plt.xlabel('epoch')
    plt.legend(['training set', 'validation set'], loc='upper right')
    plt.show()

plot_training(history=history)

4. Testing and validation

Load the test set:

# Load the test dataset.
test_images, test_labels = load_data(test_data_dir)
# Transform the images, just like we did with the training set.
test_images32 = [skimage.transform.resize(image, (32, 32))
                 for image in test_images]

test_images_a = np.array(test_images32)
test_labels_a = np.array(test_labels)

test_images_a = color.rgb2gray(test_images_a)

display_images_and_labels(test_images_a, test_labels)

test_x = test_images_a
test_y = keras.utils.to_categorical(test_labels_a, 62)
score = model.evaluate(test_x, test_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

The test accuracy is 88%.

Test loss: 0.49647544848156117
Test accuracy: 0.8789682388305664
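
The accuracy in `score[1]` is the fraction of test images whose highest-probability class matches the true label. The sketch below spells out that computation in NumPy on a few made-up softmax rows; the numbers and the helper name `accuracy_from_probabilities` are illustrative assumptions, not values from the trained model.

```python
import numpy as np


def accuracy_from_probabilities(probabilities, true_labels):
    """Fraction of rows whose highest-probability class equals the label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == np.asarray(true_labels)))


# Toy softmax outputs for 4 samples and 3 classes (made-up numbers).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = [0, 1, 2, 1]
print(accuracy_from_probabilities(probs, labels))  # 0.75
```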

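5. Using the model

Here a few sample images are selected for testing. The results are shown in the figure; for 2 of the images the prediction does not match the label.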

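# `predicted` was not defined in the original listing; it is assumed here to be
# the class index with the highest softmax probability for each test image.
predicted = np.argmax(model.predict(test_x), axis=1)

# Display the predictions and the ground truth visually.
fig = plt.figure(figsize=(10, 10))
j = 1
for i in range(0, 1000, 100):
    truth = test_labels_a[i]
    prediction = predicted[i]
    plt.subplot(5, 2, j)
    j = j+1
    plt.axis('off')
    # Named text_color so it does not shadow the imported skimage `color` module.
    text_color = 'green' if truth == prediction else 'red'
    plt.text(40, 10, "Truth:        {0}\nPrediction: {1}".format(truth, prediction),
             fontsize=12, color=text_color)
    plt.imshow(test_x[i], cmap='gray')
plt.show()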


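6. Summary

This post used deep learning for traffic sign recognition: the data was processed first, then a neural network was trained, and finally the model was tested, reaching a test accuracy of 88%. Most of the work lies in processing the image data. To improve accuracy further, a convolutional neural network could be considered.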
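
As a starting point for that next step, here is a minimal sketch of what a small convolutional network for the same 32x32 grayscale input might look like in keras. The architecture and layer sizes are assumptions chosen for illustration; they are untuned, were not part of the original article, and no accuracy is claimed for them.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D

NUM_CLASSES = 62

# Hypothetical architecture for illustration only; layer sizes are untuned.
cnn = Sequential()
cnn.add(Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 1)))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Conv2D(64, (3, 3), activation="relu"))
cnn.add(MaxPooling2D((2, 2)))
cnn.add(Flatten())
cnn.add(Dense(128, activation="relu"))
cnn.add(Dropout(0.5))
cnn.add(Dense(NUM_CLASSES, activation="softmax"))

cnn.compile(loss="categorical_crossentropy",
            optimizer="rmsprop",
            metrics=["accuracy"])

# Conv2D expects a trailing channel axis, so (N, 32, 32) becomes (N, 32, 32, 1).
dummy_batch = np.zeros((4, 32, 32), dtype="float32")[..., np.newaxis]
print(cnn.predict(dummy_batch, verbose=0).shape)
```

To train it on the data prepared above, the grayscale arrays would need the same extra channel axis, for example `train_x[..., np.newaxis]`.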

 
