OpenCV: Some Study Notes on the DNN Module

1. What is a DNN?

DNN stands for deep neural network. It is the foundation of deep learning.

2. Commonly used DNN APIs in OpenCV

(1) APIs for loading a network model

Net cv::dnn::readNet (const String &model, const String &config="", const String &framework="")
Net cv::dnn::readNetFromCaffe (const String &prototxt, const String &caffeModel=String())
Net cv::dnn::readNetFromTensorflow (const String &model, const String &config=String())
Net cv::dnn::readNetFromTorch (const String &model, bool isBinary=true, bool evaluate=true)
Net cv::dnn::readNetFromDarknet (const String &cfgFile, const String &darknetModel=String())

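model is the binary file that contains the trained weights. Models from different frameworks use different file extensions.

config is the text file that contains the network configuration. Its extension also depends on the framework.

(1) Caffe: *.caffemodel (weights), *.prototxt (config);

(2) TensorFlow: *.pb (weights), *.pbtxt (config);

(3) Darknet: *.weights (weights), *.cfg (config);

(4) Torch: *.t7 (weights), no separate config file;

(5) DLDT: *.bin (weights), *.xml (config).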
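The extension list above can be turned into a small lookup. The following is a minimal sketch using only the standard library; guessFramework is a hypothetical helper, not an OpenCV function, and the returned strings are illustrative labels. It only mirrors the idea that readNet picks the loader from the file extension.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of OpenCV): guess the framework from the
// extension of the weights file, following the extension list in the notes.
std::string guessFramework(const std::string& modelPath) {
    const std::size_t dot = modelPath.find_last_of('.');
    if (dot == std::string::npos) return "";  // no extension at all

    // Lower-case the extension so "model.PB" and "model.pb" behave the same.
    std::string ext = modelPath.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (ext == "caffemodel") return "caffe";
    if (ext == "pb")         return "tensorflow";
    if (ext == "weights")    return "darknet";
    if (ext == "t7")         return "torch";
    if (ext == "bin")        return "dldt";
    return "";  // unknown extension
}
```

For example, guessFramework("MobileNetSSD_deploy.caffemodel") yields "caffe". In practice readNet does this dispatch itself, so a helper like this is only useful for logging or validating paths before loading.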

(2) Converting the input image into the model's standard input

Mat cv::dnn::blobFromImage (InputArray image, double scalefactor = 1.0, 
const Size & size = Size(), 
const Scalar & mean = Scalar(), 
bool swapRB = false, 
bool crop = false, 
int ddepth = CV_32F )

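Apart from the first parameter, which is the input image, the arguments are dictated by the chosen network model. scalefactor is the multiplier applied to the pixel values. size is the spatial size of the output blob, i.e. the input size the network expects. mean is the per-channel mean subtracted from the image; the subtraction happens before the multiplication by scalefactor. swapRB says whether the R and B channels should be swapped. crop says whether the image is cropped after resizing. ddepth is the depth of the output blob (CV_32F or CV_8U).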

(3) Setting the model's input

void cv::dnn::Net::setInput (InputArray blob, const String & name = "", 
double scalefactor = 1.0, const Scalar & mean = Scalar() )

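The first parameter is the model's standard input data, i.e. the result of blobFromImage. The second parameter is the name of the model's input layer, which has to be looked up in the model definition.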

(4) Running the forward pass and getting the model's output

Mat cv::dnn::Net::forward (const String & outputName = String())

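The parameter is the name of the output layer whose result is wanted; calling forward runs the network up to that layer and returns its output. For a detection network such as SSD the result is a 4-D blob: the first two dimensions are 1, the third is the number of detected boxes, and the fourth holds each box's class label, score, and coordinates. The box coordinates are floating-point ratios of the image size, so they must be converted to pixel coordinates before they can be drawn.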

3. Parsing the network output

(1) If the object detection network is SSD/RCNN/Faster-RCNN, the output has the form N*7, so it is parsed as follows

Mat detectionMat(out.size[2], out.size[3], CV_32F, out.ptr<float>());

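Here 7 means seven output columns: the first column is the index of the image within the batch, the second is the class label, the third is the confidence, and the fourth to seventh are the box coordinates (top-left x, top-left y, bottom-right x, bottom-right y).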
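The N*7 layout can also be read straight from the flat float buffer, since element (i, j) of the 1 x 1 x N x 7 blob sits at offset i * 7 + j. The sketch below uses only the standard library; Detection and parseDetections are hypothetical names for illustration, and the column order assumed is the one described above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: one detection in pixel coordinates.
struct Detection {
    int classId;
    float confidence;
    int left, top, right, bottom;
};

// Walk N rows of 7 floats: [imageId, classId, confidence, x1, y1, x2, y2],
// where the four coordinates are ratios of the image size.
std::vector<Detection> parseDetections(const float* data, int numRows,
                                       int imageWidth, int imageHeight,
                                       float threshold) {
    std::vector<Detection> result;
    for (int i = 0; i < numRows; ++i) {
        const float* row = data + static_cast<std::size_t>(i) * 7;
        if (row[2] <= threshold) continue;  // drop low-confidence rows

        Detection d;
        d.classId = static_cast<int>(row[1]);
        d.confidence = row[2];
        // Scale the ratios to pixels: x by the width, y by the height.
        d.left   = static_cast<int>(row[3] * imageWidth);
        d.top    = static_cast<int>(row[4] * imageHeight);
        d.right  = static_cast<int>(row[5] * imageWidth);
        d.bottom = static_cast<int>(row[6] * imageHeight);
        result.push_back(d);
    }
    return result;
}
```

With an OpenCV Mat named out, a call would look like parseDetections(out.ptr<float>(), out.size[2], frame.cols, frame.rows, 0.5f), assuming out holds contiguous CV_32F data.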

(2) If the object detection network is a Region-based YOLO network, the parsing becomes

Mat scores=outs[i].row(j).colRange(5,outs[i].cols);

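This takes columns 5 through cols-1 of row j of the i-th output layer, all of which are per-class scores. The first five columns are cx, cy, w, h, and the box confidence.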
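Decoding one such row amounts to finding the best class score and turning the centre-based box into a top-left box. A minimal standard-library sketch follows; YoloBox and decodeYoloRow are hypothetical names, and the code assumes cx, cy, w and h are ratios of the image size.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical result type: best class plus a top-left based pixel box.
struct YoloBox {
    int classId;   // -1 if the row is too short to contain class scores
    float score;
    int left, top, width, height;
};

// Decode one row laid out as [cx, cy, w, h, confidence, score_0, score_1, ...].
YoloBox decodeYoloRow(const std::vector<float>& row,
                      int imageWidth, int imageHeight) {
    YoloBox box = {-1, 0.0f, 0, 0, 0, 0};
    if (row.size() <= 5) return box;  // no class scores present

    // Columns 5..end are the class scores; pick the largest one.
    const auto first = row.begin() + 5;
    const auto best = std::max_element(first, row.end());
    box.classId = static_cast<int>(best - first);
    box.score = *best;

    // Convert the centre/size ratios to a pixel box anchored at the top-left.
    box.width  = static_cast<int>(row[2] * imageWidth);
    box.height = static_cast<int>(row[3] * imageHeight);
    box.left   = static_cast<int>(row[0] * imageWidth)  - box.width / 2;
    box.top    = static_cast<int>(row[1] * imageHeight) - box.height / 2;
    return box;
}
```

A full pipeline would still threshold the score and merge overlapping boxes afterwards; that part is left out here.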

4. Application: object detection with SSD

Code

#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include<iostream>
using namespace cv;
using namespace cv::dnn;
using namespace std;

const size_t width = 300;// the SSD model takes a 300*300*3 input
const size_t height = 300;
string label_file = "labelmap_det.txt";
string model_file = "MobileNetSSD_deploy.caffemodel";
string model_text_file = "MobileNetSSD_deploy.prototxt";
String objNames[] = { "background",
"aeroplane", "bicycle", "bird", "boat",
"bottle", "bus", "car", "cat", "chair",
"cow", "diningtable", "dog", "horse",
"motorbike", "person", "pottedplant",
"sheep", "sofa", "train", "tvmonitor" };

int main()
{
	Mat frame = imread("2.jpg");
	if (frame.empty())
	{
		cout << "could not read image..." << endl;
		return -1;
	}
	Net net = readNetFromCaffe(model_text_file, model_file);
	Mat blobImage = blobFromImage(frame, 0.007843, Size(width, height), Scalar(127.5, 127.5, 127.5), true, false);// convert the input image into the model's standard input
	cout << "blobImage width=" << blobImage.size[3] << ",height=" << blobImage.size[2] << endl;// the blob is 4-D (NCHW), so cols/rows are not meaningful
	net.setInput(blobImage, "data");// set the model input
	Mat detection = net.forward("detection_out");// run the forward pass; the output is 4-D with shape 1 x 1 x N x 7
	vector<double>layersTiming;// per-layer timings
	double freq = getTickFrequency() / 1000;// ticks per millisecond
	double time = net.getPerfProfile(layersTiming) / freq;
	cout << "execute time (ms):" << time << endl;
	
	Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());// view the detections as an N x 7 matrix
	float confidence_threshold = 0.5;
	for (int i = 0; i < detectionMat.rows; i++)
	{
		float confidence = detectionMat.at<float>(i, 2);// the third column is the confidence
		if(confidence>confidence_threshold)
		{
			size_t objectIdx = (size_t)detectionMat.at<float>(i, 1);// the second column is the class label
			float tl_x = detectionMat.at<float>(i, 3)*frame.cols;// the last four columns are the box corners
			float tl_y = detectionMat.at<float>(i, 4)*frame.rows;// the values are ratios, so scale them to pixel coordinates
			float br_x = detectionMat.at<float>(i, 5)*frame.cols;
			float br_y = detectionMat.at<float>(i, 6)*frame.rows;
			// Rect(x, y, w, h) expects a width and height, so build the box from its two corners instead
			Rect object_box(Point((int)tl_x, (int)tl_y), Point((int)br_x, (int)br_y));
			rectangle(frame, object_box, Scalar(0, 0, 255), 2, 8, 0);
			putText(frame, format("confidence %.2f,%s", confidence, objNames[objectIdx].c_str()), Point(tl_x - 10, tl_y - 5),
				FONT_HERSHEY_SIMPLEX, 0.7, Scalar(255, 0, 0), 2, 8);
			cout << "confidence:" << confidence << ",object name:" << objNames[objectIdx].c_str() << endl;
		}
	}
	imshow("frame", frame);
	waitKey(0);
	return 0;
}

Result

Model download: https://github.com/gloomyfish1998/opencv_tutorial/tree/master/data/models/ssd
