Faster RCNN Object Detection with OpenCV DNN

Original post: Faster RCNN Object Detection with OpenCV DNN - AIUAI

Earlier posts tested object detection with YOLOV3 and SSD through OpenCV DNN. This post does the same, briefly, for Faster RCNN.

YOLOV3 Object Detection with OpenCV DNN - AIUAI

Converting TensorFlow Object Detection Models to an OpenCV DNN-Compatible Format - AIUAI

1. Downloading the Faster RCNN Model

The model weights and config files can be downloaded directly from the links provided by OpenCV DNN:

Model	Version	Weights	Config
Faster-RCNN Inception v2	2018_01_28	weights	config
Faster-RCNN ResNet-50	2018_01_28	weights	config

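Alternatively, convert the model yourself by following the instructions in Converting TensorFlow Object Detection Models to an OpenCV DNN-Compatible Format - AIUAI. Use this approach for models trained with TensorFlow on a custom dataset.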

This post uses the faster_rcnn_resnet50_coco_2018_01_28 model as the example and tests it with a manually generated graph.pbtxt file.
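As a rough sketch of that manual step: OpenCV's samples/dnn directory ships a script named tf_text_graph_faster_rcnn.py that writes a graph.pbtxt from a frozen graph and its pipeline.config. The helper below only assembles that command line. The function name build_pbtxt_command and every path are placeholders made up for illustration, and the script location depends on where your OpenCV source checkout lives.

```python
import os
import subprocess
import sys


def build_pbtxt_command(script_path, model_dir, python_exe=sys.executable):
    """Assemble the command line for OpenCV's tf_text_graph_faster_rcnn.py.

    Assumes model_dir holds frozen_inference_graph.pb and pipeline.config,
    as in the archives from the TensorFlow detection model zoo.
    """
    return [
        python_exe,
        script_path,
        "--input", os.path.join(model_dir, "frozen_inference_graph.pb"),
        "--config", os.path.join(model_dir, "pipeline.config"),
        "--output", os.path.join(model_dir, "graph.pbtxt"),
    ]


if __name__ == "__main__":
    # Placeholder paths: adjust to your OpenCV checkout and model directory.
    cmd = build_pbtxt_command(
        "/path/to/opencv/samples/dnn/tf_text_graph_faster_rcnn.py",
        "/path/to/faster_rcnn_resnet50_coco_2018_01_28")
    print(" ".join(cmd))
    # Only run the conversion when the script is actually present.
    if os.path.isfile(cmd[1]):
        subprocess.run(cmd, check=True)
```

With the placeholder paths left as they are, the script only prints the command and converts nothing.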

2. Faster RCNN DNN Implementation, Part 1

#!/usr/bin/python
# -*- coding: utf-8 -*-
import cv2
import matplotlib.pyplot as plt


pb_file = '/path/to/faster_rcnn_resnet50_coco_2018_01_28/frozen_inference_graph.pb'
pbtxt_file = '/path/to/faster_rcnn_resnet50_coco_2018_01_28/graph.pbtxt'
net = cv2.dnn.readNetFromTensorflow(pb_file, pbtxt_file)

score_threshold = 0.3

img_file = "test.jpg"

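img_cv2 = cv2.imread(img_file)
if img_cv2 is None:
    raise IOError("Cannot read image: " + img_file)
height, width, _ = img_cv2.shape
# Resize to the network input size and swap BGR -> RGB.
net.setInput(cv2.dnn.blobFromImage(img_cv2,
                                   size=(300, 300),
                                   swapRB=True,
                                   crop=False))

# out has shape (1, 1, N, 7); each row is
# [batch_id, class_id, score, left, top, right, bottom],
# with coordinates normalized to [0, 1].
out = net.forward()
print(out)

for detection in out[0, 0, :, :]: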
    score = float(detection[2])
    if score > score_threshold:
        left = detection[3] * width
        top = detection[4] * height
        right = detection[5] * width
        bottom = detection[6] * height
        cv2.rectangle(img_cv2,
                      (int(left), int(top)),
                      (int(right), int(bottom)),
                      (23, 230, 210),
                      thickness=2)

# getPerfProfile() returns the overall inference time in ticks.
t, _ = net.getPerfProfile()
label = 'Inference time: %.2f ms' % \
            (t * 1000.0 / cv2.getTickFrequency())
cv2.putText(img_cv2, label, (0, 15),
            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))


plt.figure(figsize=(10, 8))
plt.imshow(img_cv2[:, :, ::-1])
plt.title("OpenCV DNN Faster RCNN-ResNet50")
plt.axis("off")
plt.show()
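The loop above relies on the layout of the output blob: shape (1, 1, N, 7), where each row is [batch_id, class_id, score, left, top, right, bottom] with coordinates normalized to [0, 1]. Here is a minimal sketch of that decoding as a standalone function, run on a synthetic array in place of a real net.forward() result. The name decode_detections and the clipping to the image bounds are additions for illustration, not part of the original script.

```python
import numpy as np


def decode_detections(out, width, height, score_threshold=0.3):
    """Turn a (1, 1, N, 7) detection blob into pixel-space boxes."""
    results = []
    for det in out[0, 0]:
        score = float(det[2])
        if score <= score_threshold:
            continue
        # Scale normalized corners to pixels and clip to the image.
        left = int(np.clip(det[3] * width, 0, width - 1))
        top = int(np.clip(det[4] * height, 0, height - 1))
        right = int(np.clip(det[5] * width, 0, width - 1))
        bottom = int(np.clip(det[6] * height, 0, height - 1))
        results.append((int(det[1]), score, (left, top, right, bottom)))
    return results


# Synthetic blob: one confident detection and one below the threshold.
fake_out = np.array([[[
    [0, 17, 0.9, 0.10, 0.20, 0.50, 0.60],
    [0, 2, 0.1, 0.00, 0.00, 1.00, 1.00],
]]], dtype=np.float32)

for class_id, score, box in decode_detections(fake_out, width=200, height=100):
    print(class_id, round(score, 2), box)
```

Only the first synthetic row survives the score threshold. Returning plain tuples keeps drawing separate from decoding, so the same function could feed cv2.rectangle or a later NMS step.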

3. Faster RCNN DNN Implementation, Part 2

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import cv2
import os
import matplotlib.pyplot as plt
import time


class general_faster_rcnn(object):
    def __init__(self, modelpath):
        self.conf_threshold = 0.3   # Confidence threshold
        self.nms_threshold  = 0.4   # Non-maximum suppression threshold
        self.net_width  = 416 # 300 # Width of network's input image
        self.net_height = 416 # 300 # Height of network's input image

        self.classes = self.get_coco_names()
        self.faster_rcnn_model = self.get_faster_rcnn_model(modelpath)
        self.outputs_names = self.get_outputs_names()


    def get_coco_names(self):
        classes = ["person", "bicycle", "car", "motorcycle", "airplane", 
                   "bus", "train", "truck", "boat", "traffic light", 
                   "fire hydrant", "background", "stop sign", "parking meter", 
                   "bench", "bird", "cat", "dog", "horse", "sheep", "cow", 
                   "elephant", "bear", "zebra", "giraffe", "background", 
                   "backpack", "umbrella", "background", "background", 
                   "handbag", "tie", "suitcase", "frisbee", "skis", 
                   "snowboard", "sports ball", "kite", "baseball bat", 
                   "baseball glove", "skateboard", "surfboard", "tennis racket",
                   "bottle", "background", "wine glass", "cup", "fork", "knife", 
                   "spoon", "bowl", "banana", "apple", "sandwich", "orange", 
                   "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", 
                   "chair", "couch", "potted plant", "bed", "background", 
                   "dining table", "background", "background", "toilet",
                   "background", "tv", "laptop", "mouse", "remote", "keyboard",
                   "cell phone", "microwave", "oven", "toaster", "sink", 
                   "refrigerator", "background", "book", "clock", "vase", 
                   "scissors", "teddy bear", "hair drier", "toothbrush",
                   "background" ]

        return classes


    def get_faster_rcnn_model(self, modelpath):
        pb_file = os.path.join(modelpath, "frozen_inference_graph.pb")
        pbtxt_file = os.path.join(modelpath, "graph.pbtxt")

        net = cv2.dnn.readNetFromTensorflow(pb_file, pbtxt_file)
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

        return net


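    def get_outputs_names(self):
        # Names of the network's output layers, i.e. the layers whose
        # outputs are not connected to any other layer.
        # getUnconnectedOutLayersNames() avoids indexing into
        # getUnconnectedOutLayers(), whose return shape differs across
        # OpenCV versions.
        return self.faster_rcnn_model.getUnconnectedOutLayersNames()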


    # Drop low-confidence boxes, then apply NMS to the remainder.
    def postprocess(self, img_cv2, outputs):
        img_height, img_width, _ = img_cv2.shape

        class_ids = []
        confidences = []
        boxes = []
        for output in outputs:
            for detection in output[0, 0]:
                # [batch_id, class_id, confidence, left, top, right, bottom]
                confidence = detection[2]
                if confidence > self.conf_threshold:
                    left = int(detection[3]*img_width)
                    top = int(detection[4]*img_height)
                    right = int(detection[5]*img_width)
                    bottom = int(detection[6]*img_height)
                    width = right - left + 1
                    height = bottom - top + 1

                    class_ids.append(int(detection[1]))
                    confidences.append(float(confidence))
                    boxes.append([left, top, width, height])


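        # Apply NMS
        indices = cv2.dnn.NMSBoxes(boxes,
                                   confidences,
                                   self.conf_threshold,
                                   self.nms_threshold)

        results = []
        for ind in indices:
            # Older OpenCV returns indices as [[i], ...]; newer versions
            # return a flat array, so handle both.
            idx = int(ind[0]) if getattr(ind, "ndim", 0) > 0 else int(ind)

            res_box = {}
            res_box["class_id"] = class_ids[idx]
            res_box["score"]    = confidences[idx]

            # width/height were computed inclusively (+1), so subtract 1
            # to recover the original right/bottom coordinates.
            box = boxes[idx]
            res_box["box"] = (box[0], box[1],
                              box[0] + box[2] - 1, box[1] + box[3] - 1)

            results.append(res_box)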

        return results


    def predict(self, img_file):
        img_cv2 = cv2.imread(img_file)
        if img_cv2 is None:
            raise IOError("Cannot read image: " + img_file)

        # Create a 4D blob from the image.
        blob = cv2.dnn.blobFromImage(
            img_cv2,
            size=(self.net_width, self.net_height),
            swapRB=True, crop=False)

        # Set the blob as the network input.
        self.faster_rcnn_model.setInput(blob)

        # Print the names of the network's output layers.
        print("[INFO]Net output layers: {}".format(self.outputs_names))

        # Run the forward pass.
        outputs = self.faster_rcnn_model.forward(self.outputs_names)

        # Confidence filtering and NMS.
        results = self.postprocess(img_cv2, outputs)

        return results


    def vis_res(self, img_file, results):
        img_cv2 = cv2.imread(img_file)

        for result in results:
            left, top, right, bottom = result["box"]
            cv2.rectangle(img_cv2, 
                          (left, top), 
                          (right, bottom), 
                          (255, 178, 50), 3)

            # Get the label for the class name and its confidence
            label = '%.2f' % result["score"]
            if self.classes:
                assert (result["class_id"] < len(self.classes))
                label = '%s:%s' % (self.classes[result["class_id"]], label)

            label_size, baseline = cv2.getTextSize(
                label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
            top = max(top, label_size[1])
            cv2.rectangle(
                img_cv2, 
                (left, top - round(1.5 * label_size[1])),
                (left + round(1.5 * label_size[0]), top + baseline), 
                (255, 0, 0),
                cv2.FILLED)
            cv2.putText(img_cv2, 
                        label, 
                        (left, top), 
                        cv2.FONT_HERSHEY_SIMPLEX, 
                        0.75, (0, 0, 0), 1)

        t, _ = self.faster_rcnn_model.getPerfProfile()
        label = 'Inference time: %.2f ms' % \
            (t * 1000.0 / cv2.getTickFrequency())
        cv2.putText(img_cv2, label, (0, 15), 
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))


        plt.figure(figsize=(10, 8))
        plt.imshow(img_cv2[:,:,::-1])
        plt.title("OpenCV DNN Faster RCNN-ResNet50")
        plt.axis("off")
        plt.show()


if __name__ == '__main__':
    print("[INFO]Faster RCNN object detection in OpenCV.")

    img_file = "test.jpg"

    start = time.time()
    modelpath = "/path/to/faster_rcnn_resnet50_coco_2018_01_28/"
    faster_rcnn_model = general_faster_rcnn(modelpath)
    print("[INFO]Model load time: ", time.time() - start)

    start = time.time()
    results = faster_rcnn_model.predict(img_file)
    print("[INFO]Model prediction time: ", time.time() - start)
    faster_rcnn_model.vis_res(img_file, results)

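With a network input of (300, 300), the detection results are the same as in the first implementation.

With a network input of (416, 416), the detections improve: a higher input resolution helps detection quality.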
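The postprocess step hands box filtering to cv2.dnn.NMSBoxes. To show what that call does conceptually, here is a plain NumPy sketch of greedy non-maximum suppression over [left, top, width, height] boxes. It illustrates the general technique and is not OpenCV's internal implementation. The function names iou_xywh and greedy_nms are hypothetical, and results may differ from NMSBoxes in edge cases.

```python
import numpy as np


def iou_xywh(a, b):
    """Intersection over union of two [left, top, width, height] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold=0.3, nms_threshold=0.4):
    """Return indices of kept boxes, highest score first."""
    # Visit candidates in descending score order, skipping weak ones.
    order = [i for i in np.argsort(scores)[::-1] if scores[i] > score_threshold]
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou_xywh(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(int(i))
    return keep


boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [200, 200, 50, 50]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

In this toy input the second box overlaps the first almost entirely and is suppressed, while the third box, which overlaps nothing, is kept.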

4. Faster RCNN TensorFlow Implementation

The same model is tested here with the TensorFlow Object Detection API:

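#!/usr/bin/python3
# -*- coding: utf-8 -*-
import os
import numpy as np
import cv2
import matplotlib.pyplot as plt
# Written against the TensorFlow 1.x API (tf.gfile, tf.GraphDef, tf.Session).
# On TensorFlow 2.x these names live under tf.compat.v1.
import tensorflow as tf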


model_path = "/path/to/faster_rcnn_resnet50_coco_2018_01_28"
frozen_pb_file = os.path.join(model_path, 'frozen_inference_graph.pb')

score_threshold = 0.3

img_file = "test.jpg"

# Read the graph.
with tf.gfile.FastGFile(frozen_pb_file, 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())


with tf.Session() as sess:
    # Restore session
    sess.graph.as_default()
    tf.import_graph_def(graph_def, name='')

    # Read and preprocess an image.
    img_cv2 = cv2.imread(img_file)
    img_height, img_width, _ = img_cv2.shape

    img_in = cv2.resize(img_cv2, (416, 416))
    img_in = img_in[:, :, [2, 1, 0]]  # BGR2RGB

    # Run the model
    outputs = sess.run(
        [sess.graph.get_tensor_by_name('num_detections:0'),
         sess.graph.get_tensor_by_name('detection_scores:0'),
         sess.graph.get_tensor_by_name('detection_boxes:0'),
         sess.graph.get_tensor_by_name('detection_classes:0')],
        feed_dict={'image_tensor:0': img_in.reshape(
            1, img_in.shape[0], img_in.shape[1], 3)})

    # Visualize detected bounding boxes.
    num_detections = int(outputs[0][0])
    for i in range(num_detections):
        classId = int(outputs[3][0][i])
        score = float(outputs[1][0][i])
        bbox = [float(v) for v in outputs[2][0][i]]
        if score > score_threshold:
            x = bbox[1] * img_width
            y = bbox[0] * img_height
            right = bbox[3] * img_width
            bottom = bbox[2] * img_height
            cv2.rectangle(img_cv2, 
                          (int(x), int(y)), 
                          (int(right), int(bottom)), 
                          (125, 255, 51), 
                          thickness=2)

plt.figure(figsize=(10, 8))
plt.imshow(img_cv2[:, :, ::-1])
plt.title("TensorFlow Faster RCNN-ResNet50")
plt.axis("off")
plt.show()

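With the TensorFlow Object Detection API, the detections appear somewhat better than those from OpenCV DNN at the same (300, 300) network input; the cause is not yet known. Note that the listing above resizes the image to (416, 416), so change that size to (300, 300) to reproduce the comparison.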
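One way to put a number on that impression is to match boxes from the two pipelines by IoU. The layouts differ: the TensorFlow graph returns detection_boxes as normalized [ymin, xmin, ymax, xmax], while the DNN code works with (left, top, right, bottom) in pixels. The sketch below converts the former and greedily pairs boxes. The helper names and the 0.5 matching threshold are assumptions chosen for illustration, and the sample boxes are made up.

```python
def tf_box_to_pixels(bbox, img_width, img_height):
    """Convert normalized [ymin, xmin, ymax, xmax] to pixel (l, t, r, b)."""
    ymin, xmin, ymax, xmax = bbox
    return (int(xmin * img_width), int(ymin * img_height),
            int(xmax * img_width), int(ymax * img_height))


def iou(a, b):
    """IoU of two (left, top, right, bottom) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def match_boxes(boxes_a, boxes_b, iou_threshold=0.5):
    """Greedily pair each box in boxes_a with its best unused box in boxes_b."""
    used, pairs = set(), []
    for i, a in enumerate(boxes_a):
        best_j, best_iou = None, iou_threshold
        for j, b in enumerate(boxes_b):
            if j not in used and iou(a, b) >= best_iou:
                best_j, best_iou = j, iou(a, b)
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j, best_iou))
    return pairs


# Made-up boxes standing in for real detections from each pipeline.
tf_boxes = [tf_box_to_pixels([0.1, 0.1, 0.5, 0.5], 200, 100)]
dnn_boxes = [(22, 11, 98, 49), (150, 60, 190, 90)]
print(tf_boxes, match_boxes(tf_boxes, dnn_boxes))
```

Boxes left unmatched on either side are detections that only one pipeline produced, which is where any quality gap would show up.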
