[ONNX] An Experiment in Object Detection with the yolov3.onnx Model

YOLOv3 Principle Analysis

There are already many blog posts analyzing how the model works, so I will not repeat them here. Below are two that I think are particularly well written:
YOLO series: YOLO v3 (In-Depth Analysis)
YOLOv3 Experiment Summary

Source and Overview of the yolov3.onnx Model

Source

darknet -> caffe -> onnx
1. For the darknet-to-caffe conversion, see this reference.
2. The caffe-to-onnx conversion uses the caffe2onnx tool I wrote earlier.

Overview

Model Input

The model's input is a 416x416 image, and the input tensor is named input.

The model input is:
name: "input"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 1
      }
      dim {
        dim_value: 3
      }
      dim {
        dim_value: 416
      }
      dim {
        dim_value: 416
      }
    }
  }
}
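
This signature (and the output signatures below) can be printed with the onnx package; a minimal sketch, assuming yolov3.onnx sits in the working directory:

import onnx

model = onnx.load("yolov3.onnx")
# print the input signature shown above; model.graph.output lists the outputs
print(model.graph.input[0])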

Model Outputs

The model produces three output feature maps, with dimensions 255x13x13, 255x26x26, and 255x52x52, where $255 = 3 \times (80 + 5)$: for each of the 3 anchors, 80 class probabilities plus $t_x, t_y, t_w, t_h$ and the objectness confidence $t_o$.
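
Concretely, each anchor occupies 85 consecutive channels per grid cell, which is why getBBox below indexes the feature map with offsets of 85 * i:

$$255 = 3 \times 85, \qquad 85 = \underbrace{4}_{t_x,\,t_y,\,t_w,\,t_h} + \underbrace{1}_{t_o} + \underbrace{80}_{\text{class probabilities}}$$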

The model outputs are:
name: "layer82-conv_Y"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 1
      }
      dim {
        dim_value: 255
      }
      dim {
        dim_value: 13
      }
      dim {
        dim_value: 13
      }
    }
  }
}

name: "layer94-conv_Y"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 1
      }
      dim {
        dim_value: 255
      }
      dim {
        dim_value: 26
      }
      dim {
        dim_value: 26
      }
    }
  }
}

name: "layer106-conv_Y"
type {
  tensor_type {
    elem_type: 1
    shape {
      dim {
        dim_value: 1
      }
      dim {
        dim_value: 255
      }
      dim {
        dim_value: 52
      }
      dim {
        dim_value: 52
      }
    }
  }
}

There are three outputs in total.

Node Types

The number of nodes of each type is:
LeakyRelu: 72
BatchNormalization: 72
Conv: 75
Upsample: 2
Concat: 4
Add: 23
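
These counts can be reproduced by tallying op_type over the graph nodes; a minimal sketch, again assuming yolov3.onnx is local:

from collections import Counter

import onnx

model = onnx.load("yolov3.onnx")
# count how many nodes of each operator type the graph contains
print(Counter(node.op_type for node in model.graph.node))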

Dependencies

  • onnxruntime
  • numpy
  • cv2
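
All three are installable from pip; note that cv2 is provided by the opencv-python package:

pip install onnxruntime numpy opencv-python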

Approach

The main workflow is shown in the figure below:
[Figure: overall detection pipeline]

The process of obtaining the bounding boxes is as follows:
[Figure: bounding-box extraction flow]

Code

Preparation

Import the libraries and set up the labels and anchors. Since only numpy is used, we implement the sigmoid function ourselves.

import onnxruntime
import numpy as np
import cv2
label = ["background", "person",
        "bicycle", "car", "motorbike", "aeroplane",
        "bus", "train", "truck", "boat", "traffic light",
        "fire hydrant", "stop sign", "parking meter", "bench",
        "bird", "cat", "dog", "horse", "sheep", "cow", "elephant",
        "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag",
        "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball",
        "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
        "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon",
        "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog",
        "pizza", "donut", "cake", "chair", "sofa", "potted plant", "bed", "dining table",
        "toilet", "TV monitor", "laptop", "mouse", "remote", "keyboard", "cell phone",
        "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase",
        "scissors", "teddy bear", "hair drier", "toothbrush"]
anchors = [[(116,90),(156,198),(373,326)],[(30,61),(62,45),(59,119)],[(10,13),(16,30),(33,23)]]

def sigmoid(x):
    # logistic function, applied element-wise to numpy arrays
    return 1 / (1 + np.exp(-x))
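
Note that the ordering of anchors matches the ordering of the model outputs: anchors[0] holds the largest anchors and is paired with the coarsest 13x13 feature map, which detects the largest objects; getBoxes below relies on this pairing.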

Image Preprocessing

def process_image(img_path):
    img = cv2.imread(img_path)
    img = cv2.resize(img, (416, 416))
    # BGR -> RGB, HWC -> CHW
    image = img[:, :, ::-1].transpose((2, 0, 1))
    # add the batch dimension and scale pixel values to [0, 1]
    image = image[np.newaxis, :, :, :] / 255
    image = np.array(image, dtype=np.float32)
    # return the resized original image and the preprocessed array
    return img, image
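
A quick sanity check of the preprocessing shapes (using the same dog416.jpg test image as main below):

img, data = process_image("dog416.jpg")
print(img.shape, data.shape)  # expect (416, 416, 3) and (1, 3, 416, 416)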

Get the Highest Class Score and Its Index

def getMaxClassScore(class_scores):
    class_score = 0
    class_index = 0
    for i in range(len(class_scores)):
        if class_scores[i] > class_score:
            # offset by 1 because label[0] is "background"
            class_index = i + 1
            class_score = class_scores[i]
    return class_score, class_index

Getting BBoxes + First Filtering (Confidence Threshold)

For every grid cell of each feature map, compute the bbox for each of its three anchors, $(b_x, b_y, b_w, b_h, b_{class\_score}, b_{class\_index})$, and filter them by the confidence threshold.
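
The decoding follows the standard YOLOv3 box equations, here normalized to [0, 1] by the grid size and the 416x416 input resolution, exactly as in the code below:

$$b_x = \frac{\sigma(t_x) + c_x}{W}, \qquad b_y = \frac{\sigma(t_y) + c_y}{H}, \qquad b_w = \frac{p_w e^{t_w}}{416}, \qquad b_h = \frac{p_h e^{t_h}}{416}$$

where $(c_x, c_y)$ is the grid cell index, $W \times H$ is the grid size, and $(p_w, p_h)$ are the anchor dimensions.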

def getBBox(feat, anchors, image_shape, confidence_threshold):
    box = []
    for i in range(len(anchors)):
        for cx in range(feat.shape[0]):
            for cy in range(feat.shape[1]):
                # each anchor occupies 85 consecutive channels
                tx = feat[cx][cy][0 + 85 * i]
                ty = feat[cx][cy][1 + 85 * i]
                tw = feat[cx][cy][2 + 85 * i]
                th = feat[cx][cy][3 + 85 * i]
                cf = feat[cx][cy][4 + 85 * i]
                cp = feat[cx][cy][5 + 85 * i:85 + 85 * i]

                # decode to normalized center coordinates and sizes
                bx = (sigmoid(tx) + cx) / feat.shape[0]
                by = (sigmoid(ty) + cy) / feat.shape[1]
                bw = anchors[i][0] * np.exp(tw) / image_shape[0]
                bh = anchors[i][1] * np.exp(th) / image_shape[1]

                # class score = objectness confidence * class probability
                b_confidence = sigmoid(cf)
                b_class_prob = sigmoid(cp)
                b_scores = b_confidence * b_class_prob
                b_class_score, b_class_index = getMaxClassScore(b_scores)

                if b_class_score > confidence_threshold:
                    box.append([bx, by, bw, bh, b_class_score, b_class_index])
    return box

Second Filtering (Non-Maximum Suppression, NMS)

For the principle and an implementation of NMS, see this reference.
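
The suppression criterion is the intersection-over-union (IoU) of two boxes:

$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}$$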

def donms(boxes, nms_threshold):
    # boxes hold (center x, center y, w, h) in normalized coordinates,
    # so convert to corner coordinates before computing overlaps
    x1 = boxes[:, 0] - boxes[:, 2] / 2
    y1 = boxes[:, 1] - boxes[:, 3] / 2
    x2 = boxes[:, 0] + boxes[:, 2] / 2
    y2 = boxes[:, 1] + boxes[:, 3] / 2
    scores = boxes[:, 4]
    areas = boxes[:, 2] * boxes[:, 3]
    order = scores.argsort()[::-1]
    keep = []  # indices of the boxes we keep
    while order.size > 0:
        i = order[0]
        keep.append(i)  # keep the highest-scoring remaining box
        # corners of the intersection rectangle
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        # intersection area; zero when the boxes do not overlap
        w = np.maximum(0.0, xx2 - xx1)
        h = np.maximum(0.0, yy2 - yy1)
        inter = w * h
        # union area = area1 + area2 - intersection
        union = areas[i] + areas[order[1:]] - inter
        # IoU = intersection / union
        IoU = inter / union
        # keep only boxes whose IoU with the current box is below the threshold
        inds = np.where(IoU <= nms_threshold)[0]
        # IoU is one element shorter than order, so shift the indices by one
        order = order[inds + 1]

    final_boxes = [boxes[i] for i in keep]
    return final_boxes
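
Note that donms applies NMS across all classes at once; the more common variant runs it per class (grouping boxes by b_class_index first), so that boxes of different classes do not suppress each other.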

Drawing the Predicted Boxes

def drawBox(boxes, img):
    for box in boxes:
        # convert normalized center/size to pixel corner coordinates on the 416x416 image
        x1 = int((box[0] - box[2] / 2) * 416)
        y1 = int((box[1] - box[3] / 2) * 416)
        x2 = int((box[0] + box[2] / 2) * 416)
        y2 = int((box[1] + box[3] / 2) * 416)
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
        # draw the label name and score near the top-left corner
        cv2.putText(img, label[int(box[5])] + ":" + str(round(box[4], 3)), (x1 + 5, y1 + 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)
    cv2.imshow('image', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Overall Pipeline

def getBoxes(prediction, confidence_threshold, nms_threshold):
    boxes = []
    for i in range(len(prediction)):
        # (1, 255, H, W) -> take the batch element and transpose to (W, H, 255)
        feature_map = prediction[i][0].transpose((2, 1, 0))
        box = getBBox(feature_map, anchors[i], [416, 416], confidence_threshold)
        boxes.extend(box)
    Boxes = donms(np.array(boxes), nms_threshold)
    return Boxes

def main():
    img, TestData = process_image("dog416.jpg")
    session = onnxruntime.InferenceSession("yolov3.onnx")
    inname = [inp.name for inp in session.get_inputs()][0]
    outname = [output.name for output in session.get_outputs()]

    print("inputs name:", inname, "outputs name:", outname)
    prediction = session.run(outname, {inname: TestData})
    boxes = getBoxes(prediction, 0.25, 0.6)
    drawBox(boxes, img)

if __name__ == "__main__":
    main()

Test Image Results

[Figures: detection results on the dog416.jpg test image]
