Design of a Fatigue (Dangerous) Driving Detection System Based on MTCNN + CNN

1. Design Principle

MTCNN (short for Multi-task Cascaded Convolutional Networks) is one of the most widely used algorithms in face detection. It is a deep-learning-based method that performs face detection and face alignment jointly, and compared with traditional algorithms it delivers better accuracy and faster detection.
The goal of this article is not to go over training the MTCNN model, but to show how MTCNN is used to extract the face region and facial landmarks. Following the classical "three courts and five eyes" facial-proportion rule, the face is split into eye, mouth, and ear regions, and each region is passed to a trained image classification model that distinguishes closed eyes, open eyes, open mouth, closed mouth, smoking, and phone use. Fatigue is then judged with the PERCLOS criterion, and the result is reported back.
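
The overall per-frame flow can be sketched roughly as follows (an illustrative outline only, assuming the open-source mtcnn package for detection; crop_regions, classify_region, and update_perclos are hypothetical helpers standing in for the components described in the sections below):

from mtcnn.mtcnn import MTCNN   # assumed: the open-source "mtcnn" package; its keypoint keys match the code in section 2

detector = MTCNN()

def process_frame(frame, classify_region, crop_regions, update_perclos):
    # MTCNN returns a list of dicts with 'box', 'confidence' and 'keypoints'
    results = detector.detect_faces(frame)
    if not results:
        return None                                         # no face in this frame
    keypoints = results[0]['keypoints']                      # left_eye, right_eye, nose, mouth_left, mouth_right
    eye_img, mouth_img = crop_regions(frame, keypoints)      # landmark-based cropping (section 2)
    eye_label = classify_region(eye_img)                     # e.g. open_eye / closed_eye (section 3)
    mouth_label = classify_region(mouth_img)                 # e.g. open_mouth / closed_mouth / smoke / call
    return update_perclos(eye_label, mouth_label)            # PERCLOS-based fatigue decision (section 4)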

2. MTCNN Face Detection

MTCNN consists of three cascaded multi-task convolutional neural networks: the Proposal Network (P-Net), the Refine Network (R-Net), and the Output Network (O-Net). Each of the three networks learns three tasks: face classification, bounding-box regression, and landmark localization. For the MTCNN module I use an open-source GitHub project. From the five detected landmarks, the angle between the line joining the left and right eye centers and the horizontal is defined as θ; the eye region has width W and height H = W/2. A perpendicular is dropped from the nose tip C to the line joining the two mouth corners, and its length is denoted D; the upper and lower edges of the mouth region are taken at D/2 and 3D/2 along this perpendicular and its extension. The details are in the code below:

from math import atan, cos

# the five landmarks returned by MTCNN
left_eye = keypoints['left_eye']
right_eye = keypoints['right_eye']
nose = keypoints['nose']
mouth_left = keypoints['mouth_left']
mouth_right = keypoints['mouth_right']

# angle between the line joining the two eye centres and the horizontal
arc = atan(abs(right_eye[1] - left_eye[1]) / abs(right_eye[0] - left_eye[0]))
# eye region width and height (H = W / 2)
W = abs(right_eye[0] - left_eye[0]) / (2 * cos(arc))
H = W / 2
# perpendicular distance from the nose tip to the mouth-corner line
D = (mouth_left[1] - nose[1]) / cos(arc) - (mouth_left[1] - mouth_right[1]) / (2 * cos(arc))

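With W, H, and D computed, axis-aligned eye and mouth patches can be cropped roughly as in the sketch below (illustrative only: the patch widths and the decision to ignore the rotation angle θ are simplifying assumptions, and frame is the image the landmarks were detected on):

# eye patch: a W x H rectangle centred on the left eye (the right eye is handled the same way)
ex, ey = left_eye
eye_img = frame[int(ey - H / 2):int(ey + H / 2),
                int(ex - W / 2):int(ex + W / 2)]

# mouth patch: between D/2 and 3D/2 below the nose tip, horizontally centred
# between the two mouth corners (width taken as 2W for illustration)
mx = (mouth_left[0] + mouth_right[0]) // 2
mouth_img = frame[int(nose[1] + D / 2):int(nose[1] + 3 * D / 2),
                  int(mx - W):int(mx + W)]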

3. CNN Image Classification

The data are organized into the classes open_eye, closed_eye, open_mouth, closed_mouth, smoke, and call. Training an image classification model on these classes and combining it with the target regions cropped by MTCNN yields the recognition result for each region.
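
The training script shown later derives each image's label from its parent folder name, so the dataset directory is assumed to be laid out with one sub-folder per class, for example:

dataset/
├── open_eye/
├── closed_eye/
├── open_mouth/
├── closed_mouth/
├── smoke/
└── call/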

The network contains 3 convolutional layers, 3 pooling layers, and 2 fully connected layers. The three convolutional layers use 32, 32, and 64 kernels respectively; the first two layers use 5×5 kernels and the third uses 3×3 kernels.
The model code is as follows:

from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.convolutional import AveragePooling2D
from keras.initializers import TruncatedNormal
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dropout
from keras.layers.core import Dense
from keras import backend as K

class EAMNET:
    @staticmethod
    def build(width, height, depth, classes):
        model = Sequential()
        inputShape = (height, width, depth)
        chanDim = -1

        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
            chanDim = 1

        # CONV => RELU => POOL
        model.add(Conv2D(32, (5, 5), padding="same",
            input_shape=inputShape,kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01)))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # CONV => RELU => POOL
        model.add(Conv2D(32, (5, 5), padding="same",kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01)))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # CONV => RELU => POOL
        model.add(Conv2D(64, (3, 3), padding="same",kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01)))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # FC layer
        model.add(Flatten())
        model.add(Dense(64,kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01)))
        model.add(Activation("relu"))
        model.add(BatchNormalization())
        model.add(Dropout(0.6))

        # softmax classifier
        model.add(Dense(classes,kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01)))
        model.add(Activation("softmax"))

        return model
model = EAMNET.build(32, 32, 1, 6)  # example: 32x32 grayscale input, 6 classes
model.summary()

The training code is as follows:

# set the matplotlib backend so figures can be saved in the background
import keras
import matplotlib
from keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint
from keras.models import load_model
from keras.utils import to_categorical
from sklearn.metrics import classification_report

from SimpleVGGNet import SimpleVGG
from EAMNet import EAMNET

matplotlib.use("Agg")

# import the necessary packages
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from keras.preprocessing.image import img_to_array
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from imutils import paths
import numpy as np
import argparse
import random
import pickle
import cv2
import os

dataset = "./dataset/"    # root directory, one sub-folder per class
EPOCHS = 100
INIT_LR = 0.01
BS = 64
IMAGE_DIMS = (64, 64, 1)  # target input size: 64x64 grayscale
classnum = 2              # number of classes being trained (must match the class folders used)
# initialize the data and labels
data = []
labels = []

# grab the image paths and randomly shuffle them
print("[INFO] loading images...")
imagePaths = sorted(list(paths.list_images(dataset)))
# print(imagePaths)
random.seed(10010)
random.shuffle(imagePaths)

# loop over the input images
for imagePath in imagePaths:
    # load the image, pre-process it, and store it in the data list
    image = cv2.imread(imagePath,cv2.IMREAD_GRAYSCALE)
    print(imagePath)
    image = cv2.resize(image, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
    image = img_to_array(image)
    data.append(image)

    # extract the class label from the image path and update the
    # labels list
    label = imagePath.split(os.path.sep)[-2]
    labels.append(label)
# scale the raw pixel intensities to the range [0, 1]
print(labels)
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
print("[INFO] data matrix: {:.2f}MB".format(
    data.nbytes / (1024 * 1000.0)))

# split the dataset into training and test sets
(trainX, testX, trainY, testY) = train_test_split(data,labels, test_size=0.25, random_state=42)

# convert labels to one-hot encoding
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)  # reuse the mapping fitted on the training labels

trainY = to_categorical(trainY)
testY = to_categorical(testY)
# construct the image generator for data augmentation
aug = ImageDataGenerator(rotation_range=25, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
                         horizontal_flip=True, fill_mode="nearest")
# initialize the model (SimpleVGG here; the EAMNET model defined above can be swapped in the same way)
print("[INFO] compiling model...")
model = SimpleVGG.build(width=IMAGE_DIMS[1], height=IMAGE_DIMS[0],
                        depth=IMAGE_DIMS[2], classes=classnum)
# model=load_model('./model/best0428ep150.h5')
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.summary()
model.compile(loss="categorical_crossentropy", optimizer=opt,
              metrics=["accuracy"])
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5, verbose=1)
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=10, verbose=1)
# train the network
filepath="./model/best0428ep150.h5"
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True,mode='max')

H = model.fit_generator(
    aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY),
    steps_per_epoch=len(trainX) // BS,
    callbacks=[reduce_lr, checkpoint],
    epochs=EPOCHS, verbose=1)
# save the model to disk
model.save('./model/best0428ep150.h5')
# evaluate the network on the test set
print("------ evaluating network ------")
predictions = model.predict(testX, batch_size=32)
print(classification_report(testY.argmax(axis=1),
    predictions.argmax(axis=1), target_names=lb.classes_))

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
N = EPOCHS
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, N), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, N), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="upper left")
plt.savefig("./model/best0428ep150.png")
f = open("./model/best0428ep150.pickle", "wb")
f.write(pickle.dumps(lb))
f.close()
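
After training, the saved model and label binarizer can be used to classify a cropped eye or mouth patch roughly as follows (a usage sketch: the file names match those saved above, and the grayscale conversion, 64x64 resize, and [0, 1] scaling are assumed to mirror the training preprocessing):

import pickle
import cv2
import numpy as np
from keras.models import load_model

model = load_model('./model/best0428ep150.h5')
lb = pickle.loads(open('./model/best0428ep150.pickle', 'rb').read())

def classify_region(region_bgr):
    # preprocess the patch the same way as the training data: grayscale, 64x64, scale to [0, 1]
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (64, 64)).astype("float") / 255.0
    x = gray.reshape(1, 64, 64, 1)           # batch of one, channels_last
    probs = model.predict(x)[0]
    return lb.classes_[np.argmax(probs)]     # predicted class name, e.g. open_eye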

4. Fatigue Judgment

The PERCLOS metric is the percentage of time within a unit time window during which the eyes are closed beyond a certain threshold (70% or 80%). The commonly used PERCLOS criteria are:
P70: the eye is counted as closed once the eyelid covers more than 70% of the pupil area; the proportion of time the eyes are closed within the window is then computed.
P80: the eye is counted as closed once the eyelid covers more than 80% of the pupil area; the proportion of time the eyes are closed within the window is then computed.
Fatigue can therefore be judged by dividing the number of frames in which the eyes are judged closed by the total number of frames counted over the window and checking the ratio against a threshold; similarly, combining the mouth state with the normal duration of a yawn, exceeding the corresponding threshold is judged as a yawn.
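
As a minimal sketch, the per-frame eye states can be accumulated over a sliding window and the closed-frame ratio compared against a threshold (the 150-frame window and the 0.4 threshold below are illustrative values, not the project's tuned parameters):

from collections import deque

class PerclosMonitor:
    # rolling PERCLOS estimate over the most recent `window` frames (illustrative sketch)
    def __init__(self, window=150, threshold=0.4):
        self.states = deque(maxlen=window)   # 1 = eye judged closed in this frame, 0 = open
        self.threshold = threshold

    def update(self, eye_closed):
        self.states.append(1 if eye_closed else 0)
        perclos = sum(self.states) / len(self.states)    # closed frames / frames counted
        return perclos > self.threshold                  # True -> raise a fatigue warning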

5. Demo

(Screenshots demonstrating the detection results.)

6. Project Repository

The project is on GitHub: MTCNN_CNN_DangerDrivingDetection.
If you find it helpful, please give it a star, Thanks♪(・ω・)ノ
There is also a fatigue-driving detection system built with the SSD object detection algorithm that I have not written up yet; if you need it, you can look at that project directly.
The MTCNN module uses an open-source project, but you can also train your own.
