Coming from traditional 2D object detection, I recently had the chance to work with 3D printing equipment and set out to run real-time object detection on 3D-printed parts. Below I share some of what I learned.
Hardware: IP camera
OS: Windows
Software: OpenCV
Data acquisition module
Frame capture requires the opencv-python package.
Download link: https://www.lfd.uci.edu/~gohlke/pythonlibs/#opencv
First, decide which camera you will use.
Set the camera's IP address so that it is on the same network segment as your PC.
URL (Uniform Resource Locator): the standard address of a resource on the Internet, used to locate and access it.
Access the camera through a browser: enter the camera's IP address in the address bar and log in to its web interface.
Then capture and save each frame from the camera. For a video stream, configure RTSP (Real Time Streaming Protocol); after changing this setting it is best to reboot the camera.
# Then run the following code
import cv2

url = 'rtsp://admin:[email protected]:554/11'
cap = cv2.VideoCapture(url)
while cap.isOpened():
    # Grab one frame
    ret, frame = cap.read()
    if not ret:
        break
    # Show the frame
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# When done, release the capture
cap.release()
cv2.destroyAllWindows()
Dataset creation
Input data for object detection falls into three types: 2D RGB images, 2.5D RGB-D images, and 3D point clouds.
RGB images have high pixel resolution and capture fine detail, but lack 3D information; mature CNN algorithms handle them well.
RGB-D images carry 3D information and are relatively dense, but are strongly affected by the sensor. Combined with the camera intrinsics they can be converted into a 3D point cloud, so both CNNs and point-cloud DNNs can be applied to them.
Point clouds carry precise 3D information but are very sparse. Common point-cloud representations include voxelization (voxelize, used to train 3D CNNs), raw point clouds (raw, fed to point-cloud DNNs such as PointNet and PointCNN), the front view (Front View, which divides the vertical space into layers), and the bird's-eye view (Bird Eye View, BEV, processed with a conventional CNN).
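The RGB-D-to-point-cloud conversion mentioned above can be sketched with the standard pinhole back-projection X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy. The function name is illustrative, and the intrinsics fx, fy, cx, cy are assumed to be known for your camera:

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (in meters) to an N x 3 point cloud
    using the pinhole camera model and the intrinsics fx, fy, cx, cy."""
    h, w = depth.shape
    # Pixel coordinate grids: u runs over columns, v over rows
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Drop pixels with no depth reading
    return pts[pts[:, 2] > 0]
```

The resulting N x 3 array can then be voxelized or fed directly to a point-cloud network such as PointNet.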
Model training
Detection demo
from distutils.version import StrictVersion

import cv2
import numpy as np
import tensorflow as tf

from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

if StrictVersion(tf.__version__) < StrictVersion('1.9.0'):
    raise ImportError('Please upgrade your TensorFlow installation to v1.9.* or later!')

# Open the camera
cap = cv2.VideoCapture(0)

# Paths to the frozen model and the label map file
PATH_TO_FROZEN_GRAPH = ''
PATH_TO_LABELS = ''

# Load the model
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            ret, image_np = cap.read()
            if not ret:
                break
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for one detected object.
            # Scores are shown on the result image together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            cv2.imshow('object detection', image_np)
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break
cap.release()
cv2.destroyAllWindows()
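The demo only draws the detections. If you want to act on them (for example, to flag a defective print), you can threshold the raw tensors by score and convert the normalized [ymin, xmin, ymax, xmax] boxes that the model returns back to pixel coordinates. The helper name and the 0.5 threshold below are my own choices, not part of the original demo:

```python
import numpy as np


def filter_boxes(boxes, scores, classes, img_h, img_w, min_score=0.5):
    """Keep detections whose score is at least `min_score` and convert
    their normalized [ymin, xmin, ymax, xmax] boxes to pixel coordinates.

    `boxes` is an N x 4 array, `scores` and `classes` are length-N arrays,
    as produced by np.squeeze() on the session outputs above.
    """
    keep = scores >= min_score
    # Scale [ymin, xmin, ymax, xmax] by [height, width, height, width]
    px = boxes[keep] * np.array([img_h, img_w, img_h, img_w])
    return px.astype(np.int32), classes[keep], scores[keep]
```

Inside the loop you would call it as `filter_boxes(np.squeeze(boxes), np.squeeze(scores), np.squeeze(classes), *image_np.shape[:2])`.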