Hands-On Autonomous Driving (1): Traffic Sign Recognition

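This post looks at how deep learning is applied in autonomous driving.

It is adapted from an article by the expert blogger AdamShan, to whom I am grateful. Many courses already cover deep-learning image classification (for example, Andrew Ng's Deep Learning Specialization), so this post skips the theory and focuses on practice. The deep learning framework used is Keras. For learning Keras, I previously reposted an article by Dr. Huang Haiguang, which is a very good study resource:

First published: A deep learning primer: "Deep Learning with Python" (《python深度学习》), original code with Chinese annotations, plus the e-book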

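Contents

1. Importing libraries

2. Data processing (key part)

2.1 Reading the dataset

2.2 Displaying images

2.3 Resizing images

2.4 Converting images to NumPy arrays

2.5 Converting images to grayscale

2.6 Dataset augmentation

3. Training the neural network

4. Testing and validation

5. Using the model

6. Summary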


1. Importing libraries

Start by loading the libraries used in this post:

import os
import random

import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import scipy
import skimage
import skimage.transform  # makes skimage.transform.resize available
from scipy import ndimage
from skimage import color

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.optimizers import SGD, RMSprop

# Allow image embedding in the notebook
%matplotlib inline

Note: when running the code above in Jupyter Notebook, the scipy and skimage libraries may conflict. The fix is to install scipy with pip:

sudo pip3 install scipy

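2. Data processing (key part)

This post uses the BelgiumTS traffic sign dataset. Extract it into the same directory as the notebook, so that the layout is:

  • data/Training/
  • data/Testing/

Each of the two folders contains 62 classes, in subdirectories numbered from 00000 to 00061.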
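Before loading any images, it can help to confirm that the extracted folders have the expected layout. The following is a minimal sketch using only the standard library. The helper name `count_images_per_class` is hypothetical (it is not part of the original article), and it assumes that the images are `.ppm` files stored in one subdirectory per class.

```python
import os

def count_images_per_class(data_dir, extension=".ppm"):
    """Return {class_dir_name: number_of_image_files} for each subdirectory of data_dir."""
    counts = {}
    for name in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        counts[name] = sum(1 for f in os.listdir(class_dir) if f.endswith(extension))
    return counts
```

For the layout described above, `len(count_images_per_class("data/Training"))` should be 62, and a class with a count of 0 would point to a problem with the extraction.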

2.1 Reading the dataset

import skimage.io  # provides imread; skimage.data.imread is gone from newer scikit-image releases

def load_data(data_dir):
    """Load a dataset and return two lists:
    images: a list of NumPy arrays, each representing an image.
    labels: a list of numbers that represent the images' labels.
    """
    # Get all subdirectories of data_dir. Each one represents a label.
    # os.listdir(path) returns the names of the files and directories in path.
    # os.path.isdir() checks whether a path is a directory.
    # os.path.join() joins several path components into one path.
    directories = [d for d in os.listdir(data_dir)
                   if os.path.isdir(os.path.join(data_dir, d))]

    # Loop through the label directories and collect the data in
    # two lists, labels and images.
    labels = []
    images = []
    for d in directories:
        label_dir = os.path.join(data_dir, d)
        file_names = [os.path.join(label_dir, f)
                      for f in os.listdir(label_dir) if f.endswith(".ppm")]

        # For each label, load its images and add them to the images list,
        # and add the label number (i.e. the directory name) to the labels list.
        for f in file_names:
            images.append(skimage.io.imread(f))
            labels.append(int(d))
    return images, labels
   


# Load training and testing datasets.
ROOT_PATH = "data"
train_data_dir = os.path.join(ROOT_PATH, "Training")
test_data_dir = os.path.join(ROOT_PATH, "Testing")

images, labels = load_data(train_data_dir)

2.2 Displaying images

After loading the dataset, display the first image of each traffic sign class.

def display_images_and_labels(images, labels):
    """Display the first image of each label."""
    unique_labels = set(labels)
    plt.figure(figsize=(15, 15))
    i = 1
    for label in unique_labels:
        # Pick the first image for each label.
        image = images[labels.index(label)]
        plt.subplot(8, 8, i)  # A grid of 8 rows x 8 columns
        plt.axis('off')
        plt.title("Label {0} ({1})".format(label, labels.count(label)))
        i += 1
        _ = plt.imshow(image)
    plt.show()

display_images_and_labels(images, labels)

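The result is shown below:


2.3 Resizing images

The images in the dataset vary in size. To train the neural network more effectively, all images need to be resized to the same dimensions; here they are resized to (32, 32).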

# Resize images to (32, 32)
images32 = [skimage.transform.resize(image, (32, 32))
                for image in images]
display_images_and_labels(images32, labels)

After resizing, all images have the same dimensions, as shown in the figure.

Print information about the resized images:

for image in images32[:5]:
    print("shape: {0}, min: {1}, max: {2}".format(image.shape, image.min(), image.max()))
shape: (32, 32, 3), min: 0.03529411764705882, max: 0.996078431372549
shape: (32, 32, 3), min: 0.03395373774509821, max: 0.996078431372549
shape: (32, 32, 3), min: 0.03694182751225482, max: 0.996078431372549
shape: (32, 32, 3), min: 0.06460056678921586, max: 0.9191425398284314
shape: (32, 32, 3), min: 0.060355392156862725, max: 0.9028492647058823
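
A note on the value range: `skimage.transform.resize` returns floating-point images, and with its default settings 8-bit input is rescaled from 0..255 to the range [0, 1]. That is why the minimum and maximum values printed above are fractions. The sketch below illustrates that scaling. The helper name `uint8_to_unit_float` is hypothetical and only shows the arithmetic; it is not scikit-image's actual implementation.

```python
def uint8_to_unit_float(value):
    """Map an 8-bit pixel value (0..255) to a float in [0, 1]."""
    return value / 255.0

# 254 maps to roughly 0.99608, the maximum seen in the first printed lines.
print(uint8_to_unit_float(254))
```

Because the inputs are already in [0, 1], no further normalization is applied before training in this post.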

2.4 Converting images to NumPy arrays

labels_a = np.array(labels)
images_a = np.array(images32)
print("labels: ", labels_a.shape, "\nimages: ", images_a.shape)

2.5 Converting images to grayscale

As a preprocessing step, convert the original three-channel RGB images to grayscale.

images_a = color.rgb2gray(images_a)
display_images_and_labels(images_a, labels)
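
To make the conversion concrete: scikit-image documents `rgb2gray` as a weighted sum of the three channels, Y = 0.2125 R + 0.7154 G + 0.0721 B. The NumPy sketch below reproduces that formula for a batch of images. The helper name `rgb_to_gray` is hypothetical, and the sketch assumes the channel axis is the last one, as in the arrays used here.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Weighted sum over the last (channel) axis; rgb has shape (..., 3)."""
    weights = np.array([0.2125, 0.7154, 0.0721])
    return rgb @ weights

# A batch of shape (N, 32, 32, 3) becomes a batch of shape (N, 32, 32).
batch = np.random.rand(4, 32, 32, 3)
print(rgb_to_gray(batch).shape)
```

The weights sum to 1, so a pure white pixel (1, 1, 1) stays at 1 and the output remains in [0, 1].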


2.6 Dataset augmentation

Here the dataset is expanded to 5 times its original size, and a few classes are displayed.

def expend_training_data(train_x, train_y):
    """
    Augment the training data: each image is kept and four randomly
    shifted copies are added, giving 5 times the original size.
    """
    expanded_images = np.zeros([train_x.shape[0] * 5, train_x.shape[1], train_x.shape[2]])
    expanded_labels = np.zeros([train_x.shape[0] * 5])
    counter = 0

    for x, y in zip(train_x, train_y):
        # register the original data
        expanded_images[counter, :, :] = x
        expanded_labels[counter] = y
        counter = counter + 1

        # use the median pixel value as an estimate of the background value
        bg_value = np.median(x)

        for i in range(4):
            # random rotation (disabled in this version)
            #angle = np.random.randint(-15, 15)
            #new_img = ndimage.rotate(x, angle, reshape=False, cval=bg_value)

            # shift the image by a random distance along each axis;
            # randint excludes the upper bound, so each shift is in [-2, 1]
            shift = np.random.randint(-2, 2, 2)
            new_img_ = ndimage.shift(x, shift, cval=bg_value)

            # register the new training data
            expanded_images[counter, :, :] = new_img_
            expanded_labels[counter] = y
            counter = counter + 1

    return expanded_images, expanded_labels

images_a, labels_a = expend_training_data(images_a, labels_a)
print(images_a.shape, labels_a.shape)
labels = labels_a.tolist()
print(len(labels))
def plot_agument(images, labels):
    plt.figure(figsize=(16, 9))
    unique_labels = set(labels)
    i = 1
    for label in unique_labels:
        # Show only the first three labels.
        if i > 3:
            break
        # Index of the first (original) image for this label; its four
        # augmented copies follow it directly.
        img_index = labels.index(label)
        for j in range(5):
            image = images[img_index+j]  # use the argument, not the global images_a
            plt.subplot(3, 5, (i-1)*5 + j+1)  # A grid of 3 rows x 5 columns
            plt.axis('off')
            plt.title("Label {0} ({1})".format(label, labels.count(label)))
            _ = plt.imshow(image, cmap='gray')
        i += 1

plot_agument(images_a, labels)
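
To show what a shift with a background fill does to an image, here is a simplified NumPy-only sketch. It is a stand-in for illustration, not the `ndimage.shift` call used in this post: it only handles whole-pixel shifts, and the helper name `shift_with_fill` is hypothetical. As with `ndimage.shift`, a positive shift moves the content toward higher indices, and the uncovered pixels take the fill value.

```python
import numpy as np

def shift_with_fill(image, dy, dx, fill):
    """Shift a 2-D image by whole pixels, filling uncovered pixels with `fill`."""
    h, w = image.shape
    out = np.full_like(image, fill)
    # Matching source and destination slices for rows and columns.
    src_rows = slice(max(0, -dy), min(h, h - dy))
    dst_rows = slice(max(0, dy), min(h, h + dy))
    src_cols = slice(max(0, -dx), min(w, w - dx))
    dst_cols = slice(max(0, dx), min(w, w + dx))
    out[dst_rows, dst_cols] = image[src_rows, src_cols]
    return out

image = np.arange(9.0).reshape(3, 3)
# Move the content one row down; the new top row is filled with the median.
print(shift_with_fill(image, 1, 0, fill=np.median(image)))
```

Filling with the median rather than zero keeps the border close to the image's typical brightness, which is the same idea as the `bg_value` estimate in the augmentation function.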


3. Training the neural network

The network is built with Keras, using the Sequential model.

from sklearn.utils import shuffle

indx = np.arange(0, len(labels_a))
indx = shuffle(indx)
images_a = images_a[indx]
labels_a = labels_a[indx]

# The total number of samples is 22875
train_x, val_x = images_a[:20000], images_a[20000:]
train_y, val_y = labels_a[:20000], labels_a[20000:]

train_y = keras.utils.to_categorical(train_y, 62)
val_y = keras.utils.to_categorical(val_y, 62)
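
`keras.utils.to_categorical` turns each class index into a one-hot row of length 62, which is the target format that the categorical cross-entropy loss expects. The NumPy sketch below shows the same idea. The helper name `one_hot` is hypothetical, and the cast to `int` is included because the augmented labels in this post are stored as floats.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an array of shape (len(labels), num_classes) with a single 1 per row."""
    labels = np.asarray(labels).astype(int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Three labels stored as floats, encoded into 62 classes.
print(one_hot([0.0, 61.0, 3.0], 62).shape)
```
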

# Build the neural network
model = Sequential()
model.add(Flatten(input_shape=(32, 32)))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(62, activation='softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
                  optimizer=RMSprop(),
                  metrics=['accuracy'])

history = model.fit(train_x, train_y,
                    batch_size=128,
                    epochs=20,
                    verbose=1,
                    validation_data=(val_x, val_y))

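### print the keys contained in the history object
#print(history.history.keys())
# model.save() writes the full model (architecture and weights), not JSON,
# so use an HDF5 file name
model.save('model.h5')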
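
As a cross-check on the output of `model.summary()`, the number of trainable parameters can be worked out by hand. A fully connected layer has one weight per input-unit pair plus one bias per unit, while the Flatten and Dropout layers have no parameters. The sketch below does this arithmetic for the layer sizes used above; the helper name `dense_params` is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

# Flattened 32x32 input, two hidden layers of 512 units, 62 output classes.
layer_sizes = [32 * 32, 512, 512, 62]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [524800, 262656, 31806] 819262
```

Most of the roughly 0.8 million parameters sit in the first layer, because every one of the 1024 input pixels is connected to every hidden unit.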

The training and validation losses of the network are shown in the figure:

def plot_training(history):
    ### plot the training and validation loss for each epoch
    # the model is compiled with categorical cross-entropy, so label it as such
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model categorical cross-entropy loss')
    plt.ylabel('categorical cross-entropy loss')
    plt.xlabel('epoch')
    plt.legend(['training set', 'validation set'], loc='upper right')
    plt.show()

plot_training(history=history)
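
Besides reading the curves by eye, the epoch with the lowest validation loss can be picked out programmatically, which is a simple way to see where overfitting starts. The sketch below is a minimal example; the helper name `best_epoch` is hypothetical, and the loss values in the example are made up for illustration, not results from this post.

```python
def best_epoch(val_losses):
    """Return (1-based epoch number, loss) of the lowest validation loss."""
    index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return index + 1, val_losses[index]

# Made-up validation losses for five epochs.
print(best_epoch([1.20, 0.80, 0.55, 0.60, 0.70]))
```

With a trained model, the same call would be `best_epoch(history.history['val_loss'])`.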

4. Testing and validation

Load the test set:

# Load the test dataset.
test_images, test_labels = load_data(test_data_dir)
# Transform the images, just like we did with the training set.
test_images32 = [skimage.transform.resize(image, (32, 32))
                 for image in test_images]

test_images_a = np.array(test_images32)
test_labels_a = np.array(test_labels)

test_images_a = color.rgb2gray(test_images_a)

display_images_and_labels(test_images_a, test_labels)

test_x = test_images_a
test_y = keras.utils.to_categorical(test_labels_a, 62)
score = model.evaluate(test_x, test_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

The test accuracy is about 88%:

Test loss: 0.49647544848156117
Test accuracy: 0.8789682388305664
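
The accuracy reported by `model.evaluate` is the fraction of test images for which the class with the highest softmax output equals the true label. The NumPy sketch below computes the same quantity from a matrix of predicted probabilities. The helper name `accuracy_from_probabilities` is hypothetical, and the small example uses made-up numbers with two classes.

```python
import numpy as np

def accuracy_from_probabilities(probabilities, true_labels):
    """Fraction of rows whose most probable class equals the true label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == np.asarray(true_labels)))

# Made-up softmax outputs for three samples and two classes.
probabilities = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
print(accuracy_from_probabilities(probabilities, [1, 0, 0]))
```

For the model in this post, the probabilities would come from `model.predict(test_x)` and the labels from `test_labels_a`.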

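5. Using the model

Here a few sample images are selected for testing. The results are shown in the figure; for 2 of the images, the prediction does not match the label.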

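# `predicted` is not defined in the original; it is assumed here to be the
# index of the largest softmax output for each test image.
predicted = np.argmax(model.predict(test_x), axis=1)

# Display the predictions and the ground truth visually.
fig = plt.figure(figsize=(10, 10))
j = 1
for i in range(0, 1000, 100):
    truth = test_labels_a[i]
    prediction = predicted[i]
    plt.subplot(5, 2, j)
    j = j+1
    plt.axis('off')
    # named text_color so it does not shadow the imported skimage `color` module
    text_color = 'green' if truth == prediction else 'red'
    plt.text(40, 10, "Truth:        {0}\nPrediction: {1}".format(truth, prediction),
             fontsize=12, color=text_color)
    plt.imshow(test_x[i], cmap='gray')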


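6. Summary

This post used deep learning for traffic sign recognition: the data was processed first, then a neural network was trained, and finally the model was tested, reaching a test accuracy of about 88%. Most of the work was in processing the image data. To improve accuracy further, consider using a convolutional neural network.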

 
