Hands-On Autonomous Driving (2): Vehicle Detection

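The previous post in this series covered applying deep learning to traffic sign recognition in autonomous driving (Hands-On Autonomous Driving (1): Traffic Sign Recognition).

This post shows how to detect vehicles with deep learning, using the YOLO model. For the details of how YOLO detection works, see the videos in Andrew Ng's deep learning course: https://www.deeplearning.ai/deep-learning-specialization/.

An earlier post also explains how YOLO works in detail: 13. Deep Learning Exercise: Autonomous driving - Car detection (YOLO in practice)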

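Contents

1. Importing libraries and data

2. Class score filtering

3. Non-max suppression

4. Evaluating the model

5. Testing

1) Converting the model output into usable bounding-box tensors

2) Selecting the best boxes

3) Vehicle detection

6. References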


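1. Importing libraries and data

This post uses a pretrained model to detect vehicles in the dataset.

The model predicts 80 classes and uses 5 anchor boxes. The files "coco_classes.txt" and "yolo_anchors.txt" hold the information about the 80 classes and the 5 anchor boxes.

We first load this information. The images are 720x1280 in size, and each one is preprocessed to make it easier to handle.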

import argparse
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
import scipy.io
import scipy.misc
import numpy as np
import pandas as pd
import PIL
import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Lambda, Conv2D
from keras.models import load_model, Model
from yolo_utils import read_classes, read_anchors, generate_colors, preprocess_image, draw_boxes, scale_boxes
from yad2k.models.keras_yolo import yolo_head, yolo_boxes_to_corners, preprocess_true_boxes, yolo_loss, yolo_body
 
%matplotlib inline

class_names = read_classes("model_data/coco_classes.txt")
anchors = read_anchors("model_data/yolo_anchors.txt")
image_shape = (720., 1280.)    
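
As a rough illustration of what loading these two files involves, the sketch below parses both kinds of content from in-memory strings. It assumes one class name per line and a single line of comma-separated width,height values for the anchors. The helper names parse_classes and parse_anchors and the sample values are hypothetical, and the real read_classes and read_anchors in yolo_utils may work differently.

```python
def parse_classes(text):
    """Return the non-empty, stripped lines of text as a list of class names."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_anchors(text):
    """Parse 'w1,h1, w2,h2, ...' into a list of (width, height) tuples."""
    values = [float(v) for v in text.split(",") if v.strip()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of anchor values")
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


# Made-up sample content, not the actual model_data files
sample_classes = "person\nbicycle\ncar\n"
sample_anchors = "0.5,1.0, 1.5,2.0, 3.0,3.5"

print(parse_classes(sample_classes))
print(parse_anchors(sample_anchors))
```

Each anchor ends up as one (width, height) pair, so a file describing 5 anchor boxes would yield 5 tuples.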

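2. Class score filtering

The model outputs predictions over 80 classes for every box, so the boxes need to be filtered: each box keeps only its highest-scoring class, and boxes whose score does not exceed a threshold are discarded.

The yolo_filter_boxes function takes the following arguments, along with a threshold that is set to 0.6 here:

  • box_confidence: tensor of shape (19x19, 5, 1) containing pc, the confidence that each of the 5 boxes predicted per cell contains an object;
  • boxes: tensor of shape (19x19, 5, 4) containing (bx, by, bh, bw) for each box;
  • box_class_probs: tensor of shape (19x19, 5, 80) containing (c1, c2, ..., c80), the predicted probability of each class.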
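
def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold = .6):
    """
    Filter YOLO boxes by thresholding on the class score.

    Returns:
    scores -- score of the best class for each kept box
    boxes -- coordinates of the kept boxes
    classes -- index of the predicted class for each kept box
    """

    # Step 1: compute the per-class score of every box
    box_scores = box_confidence * box_class_probs
    
    # Step 2: pick the best class of each box and its score
    box_classes = K.argmax(box_scores, axis = -1)
    box_class_scores = K.max(box_scores, axis = -1)
    
    # Step 3: build a mask from the threshold
    filtering_mask = (box_class_scores > threshold )
    
    # Step 4: apply the mask to get the final predictions
    scores = tf.boolean_mask(box_class_scores, filtering_mask)
    boxes = tf.boolean_mask(boxes, filtering_mask)
    classes = tf.boolean_mask(box_classes, filtering_mask)
    
    return scores, boxes, classes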
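
To make the four steps concrete, here is a minimal pure-Python sketch of the same filtering logic on plain lists. It is only an illustration of the idea, not the tensor code above; the function name filter_boxes and the toy values are made up for this example.

```python
def filter_boxes(box_confidence, boxes, box_class_probs, threshold=0.6):
    """Keep the boxes whose best class score exceeds threshold.

    box_confidence  -- list of floats, one objectness value per box
    boxes           -- list of 4-tuples, one per box
    box_class_probs -- list of lists, the class probabilities of each box
    """
    scores, kept_boxes, classes = [], [], []
    for pc, box, probs in zip(box_confidence, boxes, box_class_probs):
        # Step 1: per-class score = objectness * class probability
        box_scores = [pc * p for p in probs]
        # Step 2: best class and its score
        best_class = max(range(len(box_scores)), key=box_scores.__getitem__)
        best_score = box_scores[best_class]
        # Steps 3 and 4: keep the box only if its best score passes the threshold
        if best_score > threshold:
            scores.append(best_score)
            kept_boxes.append(box)
            classes.append(best_class)
    return scores, kept_boxes, classes


# Made-up toy values: 3 boxes, 3 classes
confidence = [0.9, 0.5, 0.8]
toy_boxes = [(0, 0, 1, 1), (1, 1, 2, 2), (2, 2, 3, 3)]
class_probs = [[0.1, 0.8, 0.1], [0.9, 0.05, 0.05], [0.3, 0.3, 0.4]]

scores, kept_boxes, classes = filter_boxes(confidence, toy_boxes, class_probs)
print(scores, kept_boxes, classes)
```

Only the first toy box survives: its best score is about 0.72, while the other two reach roughly 0.45 and 0.32 and fall below the 0.6 threshold.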

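3. Non-max suppression

Even after the threshold filtering of the previous section, many overlapping boxes can remain. To pick the most accurate box for each object, we need a second filter: non-max suppression (NMS).

Non-max suppression relies on a function called IoU (intersection over union), the area where two boxes overlap divided by the area of their union.

The code uses the following convention: (0,0) is the top-left corner of the image, (1,0) is the top-right corner, and (1,1) is the bottom-right corner.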

def iou(box1, box2):
    """
    Compute the intersection over union (IoU) of box1 and box2.

    Arguments:
    box1 -- first box, list object with coordinates (x1, y1, x2, y2)
    box2 -- second box, list object with coordinates (x1, y1, x2, y2)
    """

    # Intersection area; width and height are clamped at 0 so that
    # boxes that do not overlap get an intersection of 0
    xi1 = np.maximum(box1[0], box2[0])
    yi1 = np.maximum(box1[1], box2[1])
    xi2 = np.minimum(box1[2], box2[2])
    yi2 = np.minimum(box1[3], box2[3])
    inter_area = np.maximum(xi2 - xi1, 0) * np.maximum(yi2 - yi1, 0)

    # Union area
    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box2_area + box1_area - inter_area

    # IoU value
    iou_value = inter_area / union_area

    return iou_value
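
A small worked example may help here. The sketch below is a pure-Python version of the IoU calculation with made-up boxes; the name box_iou is chosen only for this illustration. The intersection width and height are clamped at zero, because for disjoint boxes both extents are negative and their product would otherwise look like a positive overlap.

```python
def box_iou(box1, box2):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # Corners of the intersection rectangle
    xi1 = max(box1[0], box2[0])
    yi1 = max(box1[1], box2[1])
    xi2 = min(box1[2], box2[2])
    yi2 = min(box1[3], box2[3])
    # Clamp at 0 so that disjoint boxes have an empty intersection
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)

    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area
    return inter_area / union_area


# Overlapping boxes: intersection 1, union 4 + 4 - 1 = 7
overlapping = box_iou((2, 1, 4, 3), (1, 2, 3, 4))
# Disjoint boxes: intersection 0
disjoint = box_iou((1, 2, 3, 4), (5, 6, 7, 8))
print(overlapping, disjoint)
```

For the first pair the two 2x2 boxes share a 1x1 square, so the IoU is 1/7. For the second pair the unclamped product would be (-2) * (-2) = 4, which is why the clamp matters.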

Now apply non-max suppression to the boxes produced by the filtering step of the previous section:

def yolo_non_max_suppression(scores, boxes, classes, max_boxes = 10, iou_threshold = 0.5):
    """
    Apply non-max suppression (NMS) to a set of boxes.

    Arguments:
    scores -- tensor of shape (None,), output of yolo_filter_boxes()
    boxes -- tensor of shape (None, 4), output of yolo_filter_boxes() that have been scaled to the image size (see later)
    classes -- tensor of shape (None,), output of yolo_filter_boxes()
    max_boxes -- integer, maximum number of predicted boxes you'd like
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering
    
    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """
    
    max_boxes_tensor = K.variable(max_boxes, dtype='int32')     # tensor to be used in tf.image.non_max_suppression()
    K.get_session().run(tf.variables_initializer([max_boxes_tensor])) # initialize variable max_boxes_tensor
    
    # Use tf.image.non_max_suppression() to get the list of indices corresponding to boxes you keep
    nms_indices = tf.image.non_max_suppression(boxes, scores, max_boxes_tensor, iou_threshold)
    
    # Use K.gather() to select only nms_indices from scores, boxes and classes
    scores = K.gather(scores, nms_indices)
    boxes = K.gather(boxes, nms_indices)
    classes = K.gather(classes, nms_indices)
    
    return scores, boxes, classes
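
The selection that non-max suppression performs can be pictured as a greedy loop: take the highest-scoring box, drop every remaining box that overlaps it too much, and repeat. The pure-Python sketch below illustrates that idea with made-up boxes. It is not TensorFlow's implementation, and the function names are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    inter_h = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def non_max_suppression(boxes, scores, max_boxes=10, iou_threshold=0.5):
    """Return the indices of the boxes kept by greedy NMS, best score first."""
    # Visit the boxes from the highest score to the lowest
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_boxes:
            break
        # Keep box i only if it does not overlap any kept box too much
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Made-up toy boxes: the first two overlap heavily, the third is far away
toy_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
toy_scores = [0.9, 0.8, 0.7]

kept = non_max_suppression(toy_boxes, toy_scores)
print(kept)
```

The first two toy boxes have an IoU of 81/119, about 0.68, so the lower-scoring one is suppressed and the indices 0 and 2 remain.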

4. Evaluating the model

The functions written so far are now combined to turn the model output into final predictions.

def yolo_eval(yolo_outputs, image_shape = (720., 1280.), max_boxes=10, score_threshold=.6, iou_threshold=.5):
    """
    Convert the output of the YOLO encoding to predicted boxes with their scores and classes.

    Arguments:
    yolo_outputs -- output of the encoding model (for image_shape of (608, 608, 3)), contains 4 tensors:
                    box_confidence: tensor of shape (None, 19, 19, 5, 1)
                    box_xy: tensor of shape (None, 19, 19, 5, 2)
                    box_wh: tensor of shape (None, 19, 19, 5, 2)
                    box_class_probs: tensor of shape (None, 19, 19, 5, 80)
    image_shape -- tensor of shape (2,) containing the shape of the original image, in this post (720., 1280.) (has to be float32 dtype)
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering
    
    Returns:
    scores -- tensor of shape (None, ), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """
      
    # Unpack the outputs of the YOLO model
    box_confidence, box_xy, box_wh, box_class_probs = yolo_outputs
 
    # Convert box centers and sizes to corner coordinates
    boxes = yolo_boxes_to_corners(box_xy, box_wh)
 
    # Score threshold filtering
    scores, boxes, classes = yolo_filter_boxes(box_confidence, boxes, box_class_probs, score_threshold)
    
    # Scale boxes back to original image shape.
    boxes = scale_boxes(boxes, image_shape)
 
    # Non-max suppression
    scores, boxes, classes = yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold)
    
    return scores, boxes, classes
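
Two of the steps in yolo_eval, the conversion to corners and the scaling back to the original image, come from helper modules and are not shown in this post. The sketch below illustrates what such steps could look like for a single box. It assumes coordinates normalized to [0, 1], a (y_min, x_min, y_max, x_max) corner order, and scaling by (height, width, height, width). These are assumptions made for illustration, and the actual yolo_boxes_to_corners and scale_boxes may differ.

```python
def boxes_to_corners(box_xy, box_wh):
    """Turn a box center (x, y) and size (w, h) into corner coordinates.

    Returns (y_min, x_min, y_max, x_max); this ordering is an assumption.
    """
    x, y = box_xy
    w, h = box_wh
    return (y - h / 2, x - w / 2, y + h / 2, x + w / 2)


def scale_box(box, image_shape):
    """Scale a normalized (y_min, x_min, y_max, x_max) box to pixel units."""
    height, width = image_shape
    y_min, x_min, y_max, x_max = box
    return (y_min * height, x_min * width, y_max * height, x_max * width)


# Made-up box centered in the image, a quarter as wide and half as tall
corners = boxes_to_corners((0.5, 0.5), (0.25, 0.5))
pixels = scale_box(corners, (720.0, 1280.0))
print(corners)
print(pixels)
```

With a 720x1280 image, the toy box spans rows 180 to 540 and columns 480 to 800 under these assumptions.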

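5. Testing

Training a YOLO model takes a long time and requires a dataset of bounding boxes labeled with the target classes.

Instead, we load an existing pretrained Keras YOLO model stored in "yolov2.h5". (These weights come from the official YOLO website and were converted using a function written by Allan Zelener.)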

yolo_model = load_model("model_data/yolov2.h5")
 
yolo_model.summary()

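1) Converting the model output into usable tensors

The output of yolo_model is a tensor of shape (m, 19, 19, 5, 85). The yolo_head function converts it into the box_confidence, box_xy, box_wh and box_class_probs tensors that yolo_eval expects.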

yolo_outputs = yolo_head(yolo_model.output, anchors, len(class_names))
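
The numbers in the (m, 19, 19, 5, 85) shape follow from the settings used in this post: a 19x19 grid, 5 anchor boxes per cell, and 4 box coordinates plus 1 confidence value plus 80 class values per box. The sketch below checks that arithmetic and splits one 85-value prediction into its parts. The ordering (x, y, w, h, confidence, classes) is an assumption made for illustration, split_prediction is a hypothetical helper, and the real yolo_head does further processing that this sketch does not attempt.

```python
GRID_H, GRID_W = 19, 19
NUM_ANCHORS = 5
NUM_CLASSES = 80

# 4 box coordinates + 1 confidence value + 80 class values
values_per_box = 4 + 1 + NUM_CLASSES
# One box per anchor in every grid cell
total_boxes = GRID_H * GRID_W * NUM_ANCHORS


def split_prediction(vector, num_classes=NUM_CLASSES):
    """Split one box prediction into (box_xy, box_wh, confidence, class_values).

    The (x, y, w, h, confidence, classes) ordering is assumed, not confirmed.
    """
    if len(vector) != 5 + num_classes:
        raise ValueError("unexpected prediction length")
    box_xy = vector[0:2]
    box_wh = vector[2:4]
    confidence = vector[4]
    class_values = vector[5:]
    return box_xy, box_wh, confidence, class_values


# Placeholder vector 0..84, used only to show where each part comes from
box_xy, box_wh, confidence, class_values = split_prediction(list(range(85)))
print(values_per_box, total_boxes, len(class_values))
```

So each image yields 19 x 19 x 5 = 1805 candidate boxes, which is why the filtering steps of the earlier sections are needed.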

2) Selecting the best boxes

yolo_outputs holds all of yolo_model's predicted boxes in the right format. We can now filter them and keep only the best boxes.

scores, boxes, classes = yolo_eval(yolo_outputs, image_shape)

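3) Vehicle detection

The whole pipeline works as follows:

  •      The input image is fed to yolo_model through yolo_model.input, and the model computes yolo_model.output
  •      yolo_model.output is processed by yolo_head, which gives yolo_outputs
  •      yolo_outputs goes through the filtering function yolo_eval, which outputs the predictions: scores, boxes, classes

The prediction code is given below.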

def predict(sess, image_file):
    """
    Run the graph stored in "sess" to predict boxes for "image_file", then print and plot the predictions.

    Arguments:
    sess -- your tensorflow/Keras session containing the YOLO graph
    image_file -- name of an image stored in the "images" folder.
    
    Returns:
    out_scores -- tensor of shape (None, ), scores of the predicted boxes
    out_boxes -- tensor of shape (None, 4), coordinates of the predicted boxes
    out_classes -- tensor of shape (None, ), class index of the predicted boxes
    """
 
    # Preprocess the image (resized to the 608x608 input the model expects)
    image, image_data = preprocess_image("images/" + image_file, model_image_size = (608, 608))
 
    # Run the session; K.learning_phase(): 0 selects inference mode
    out_scores, out_boxes, out_classes = sess.run([scores, boxes, classes], feed_dict = {yolo_model.input:image_data, K.learning_phase(): 0})
 
    # Print prediction info
    print('Found {} boxes for {}'.format(len(out_boxes), image_file))
    # Generate colors for drawing bounding boxes.
    colors = generate_colors(class_names)
    # Draw bounding boxes on the image file
    draw_boxes(image, out_scores, out_boxes, out_classes, class_names, colors)
    # Save the predicted bounding box on the image
    image.save(os.path.join("out", image_file), quality=90)
    # Display the results in the notebook
    # (plt.imread is used because scipy.misc.imread was removed in SciPy 1.2)
    output_image = plt.imread(os.path.join("out", image_file))
    imshow(output_image)
    
    return out_scores, out_boxes, out_classes
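
Once predict returns, the three arrays can be turned into a readable report by looking up each class index in class_names. The sketch below shows one way to do that on plain Python values. The helper format_detections and all the numbers are made up for illustration and are not actual model output.

```python
def format_detections(out_scores, out_boxes, out_classes, class_names):
    """Return one text line per detection: class name, score and box."""
    lines = []
    for score, box, class_index in zip(out_scores, out_boxes, out_classes):
        name = class_names[int(class_index)]
        coords = ", ".join("{:.0f}".format(c) for c in box)
        lines.append("{} {:.2f} ({})".format(name, score, coords))
    return lines


# Made-up values standing in for the arrays returned by predict
names = ["person", "bicycle", "car"]
toy_scores = [0.89, 0.71]
toy_boxes = [(300.0, 366.0, 648.0, 745.0), (282.0, 761.0, 412.0, 942.0)]
toy_classes = [2, 2]

report = format_detections(toy_scores, toy_boxes, toy_classes, names)
for line in report:
    print(line)
```

The same loop works on NumPy arrays, since it only iterates over the rows and converts each class index with int().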


6. References
