Implementing 3D Object Detection (OpenCV + TensorFlow)

Unlike traditional 2D object detection, this project targets 3D data: I recently got access to 3D printing equipment and wanted to run real-time object detection on the printed products. Below I share some of what I learned along the way.
Hardware: IP camera
OS: Windows
Software: OpenCV

Data Acquisition Module

Capturing frames from the camera requires installing opencv_python.
Download link: https://www.lfd.uci.edu/~gohlke/pythonlibs/#opencv
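Alternatively, a recent build can usually be installed directly from PyPI:

pip install opencv-python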

First, decide which camera you will use.
Configure the camera's IP address so it is on the same network segment as your PC.
URL (Uniform Resource Locator): the standard address of a resource on the Internet, used to locate and access it.
Access the camera through a browser: enter the camera's IP address in the address bar and log in to its web interface.
Then capture and save each frame from the camera. If you are reading the video stream, you need the camera's RTSP (Real Time Streaming Protocol) address; after changing the RTSP settings, it is best to restart the camera.

# Then run the following code
import cv2

url = 'rtsp://admin:[email protected]:554/11'
cap = cv2.VideoCapture(url)
while cap.isOpened():
    # Grab one frame from the stream
    ret, frame = cap.read()
    if not ret:  # stream dropped or read failed
        break
    # Display the current frame
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# When done, release the capture
cap.release()
cv2.destroyAllWindows()
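
The loop above only displays the stream; since the goal is to capture and save each frame, a minimal sketch might look like the following (the output directory and file naming are my own choices):

import os
import cv2

url = 'rtsp://admin:[email protected]:554/11'
out_dir = 'frames'  # assumed output directory
os.makedirs(out_dir, exist_ok=True)

cap = cv2.VideoCapture(url)
frame_id = 0
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # stream dropped or read failed
        break
    # Save every frame as a numbered JPEG
    cv2.imwrite(os.path.join(out_dir, 'frame_%06d.jpg' % frame_id), frame)
    frame_id += 1
cap.release()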

Dataset Preparation

Object detection works with three data types: 2D RGB images, 2.5D RGB-D images, and 3D point clouds.
RGB images are high-resolution and capture fine detail, but lack 3D information; they can be handled with mature CNN-based detectors.
RGB-D images carry 3D information and are relatively dense, but are strongly affected by the sensor. Using the camera intrinsics, they can be back-projected into a 3D point cloud, so they can be processed either with CNNs or with point-cloud DNNs; a back-projection sketch is shown below.
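
As a concrete illustration of that conversion, here is a minimal back-projection sketch under the pinhole camera model; the intrinsics fx, fy, cx, cy must come from your own camera calibration:

import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) into an (N, 3) point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # per-pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no depth reading
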
Point clouds carry precise 3D information but are very sparse. The main point-cloud representations are:
- Voxelization (voxelize): used to train 3D CNNs (see the sketch after this list)
- Raw point cloud (raw): fed to point-cloud DNNs such as PointNet or PointCNN
- Front View: obtained by slicing the vertical space into multiple layers
- Bird's Eye View (BEV): processed with a conventional 2D CNN
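
As a rough illustration, here is a minimal voxelization sketch; the grid resolution and spatial bounds below are arbitrary choices of mine:

import numpy as np

def voxelize(points, voxel_size=0.1, bounds=((-5, 5), (-5, 5), (0, 4))):
    """Convert an (N, 3) point cloud into a binary occupancy grid."""
    mins = np.array([b[0] for b in bounds], dtype=np.float32)
    maxs = np.array([b[1] for b in bounds], dtype=np.float32)
    dims = np.ceil((maxs - mins) / voxel_size).astype(int)
    # Keep only points inside the bounding volume
    mask = np.all((points >= mins) & (points < maxs), axis=1)
    idx = ((points[mask] - mins) / voxel_size).astype(int)
    grid = np.zeros(dims, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1  # mark occupied voxels
    return grid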

Model Training

Detection demo

from object_detection.utils import visualization_utils as vis_util
from object_detection.utils import label_map_util
from distutils.version import StrictVersion
import tensorflow as tf
import numpy as np
import cv2

if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
    raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')    
# Open the camera (device 0 is the default webcam)
cap = cv2.VideoCapture(0)
# Paths to the exported frozen graph and the label map file (fill in your own)
PATH_TO_FROZEN_GRAPH = ''
PATH_TO_LABELS = ''

# Load the frozen detection graph
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            ret, image_np = cap.read()
            if not ret:  # camera read failed, stop the loop
                break
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the confidence of the detection.
            # Scores are shown on the result image together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np, np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores), category_index,
                use_normalized_coordinates=True,
                line_thickness=8)

            cv2.imshow('object detection', image_np)
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break
cap.release()
cv2.destroyAllWindows()
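
The raw sess.run outputs are normalized coordinates in fixed-size arrays; a small helper like the one below (my own sketch, not part of the original demo) filters them by a confidence threshold and converts the boxes to pixel coordinates:

import numpy as np

def extract_detections(boxes, scores, classes, image_shape, min_score=0.5):
    """Filter raw detections and convert normalized boxes to pixel coordinates."""
    h, w = image_shape[:2]
    boxes = np.squeeze(boxes)        # (N, 4): [ymin, xmin, ymax, xmax] in [0, 1]
    scores = np.squeeze(scores)      # (N,)
    classes = np.squeeze(classes).astype(np.int32)
    detections = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < min_score:        # keep confident detections only
            continue
        ymin, xmin, ymax, xmax = box
        detections.append({
            'class_id': int(cls),
            'score': float(score),
            'box_px': (int(xmin * w), int(ymin * h), int(xmax * w), int(ymax * h)),
        })
    return detections

# Usage inside the loop, right after sess.run:
# dets = extract_detections(boxes, scores, classes, image_np.shape)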