(OpenCV) HOG source code analysis

Original article: http://www.cnblogs.com/tornadomeet/archive/2012/08/15/2640754.html

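1. Some online references

In the post "Object detection study 1 (pedestrian detection with OpenCV's built-in HOG)" I already used OpenCV's detectMultiScale() function to detect pedestrians. That function is built on the HOG algorithm, so how is HOG actually implemented? This post takes a brief look at the HOG source code that ships with OpenCV.

Quite a few people have already analysed the HOG source in OpenCV, and their write-ups are well worth reading. For example:

http://blog.csdn.net/raocong2010/article/details/6239431

This post explains the concepts used in HOG, such as block and cell, with diagrams.

http://blog.csdn.net/pp5576155/article/details/7029699

This post is a repost; it contains annotations of the OpenCV source and is very helpful.

http://gz-ricky.blogbus.com/logs/85326280.html

This post explains how the length of the HOG descriptor is computed.

http://hi.baidu.com/susongzhi/item/3a3c758d7ff5cbdc5e0ec172

This post explains the trilinear interpolation used by the fast HOG algorithm in great detail.

http://blog.youtueye.com/work/opencv-hog-peopledetector-trainning.html

This post explains how to train and run detection with HOG.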

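2. A few notes on the source code

This post does not cover HOG theory, so some familiarity with the algorithm is assumed. The PhD thesis of HOG's author explains it in detail.

In the normal workflow, HOG pedestrian detection consists of a training stage and a detection stage. The main purpose of training is to obtain the SVM coefficients. The OpenCV source uses SVM coefficients that have already been trained, so it contains very little training code.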

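First, a quick summary of the fixed (default) parameters in the HOG source:

Detection window size: 128*64 (height*width);

Block size: 16*16;

Cell size: 8*8;

Block stride inside the detection window: 8*8;

The gradient histogram of one cell has 9 bins;

Stride of the sliding window over the test image: 8*8;

In the code, one HOG descriptor describes one detection window. A detection window contains 105 = ((128-16)/8+1)*((64-16)/8+1) blocks, each block contains 4 cells, and each cell contributes a 9-dimensional vector. The HOG vector of a detection window therefore has 105*4*9 = 3780 dimensions.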
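The arithmetic above can be checked with a small standalone sketch. `HogLayout`, `blocksPerWindow` and `descriptorSize` are hypothetical names used only for illustration; the formula follows getDescriptorSize() in the listing further down, and the defaults are the parameters listed above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain struct standing in for the size fields of HOGDescriptor.
struct HogLayout {
    int winW = 64, winH = 128;     // detection window (width x height)
    int blockW = 16, blockH = 16;  // block size
    int strideW = 8, strideH = 8;  // block stride inside the window
    int cellW = 8, cellH = 8;      // cell size
    int nbins = 9;                 // orientation bins per cell
};

// Number of (overlapping) block positions inside one detection window.
std::size_t blocksPerWindow(const HogLayout& p) {
    assert((p.winW - p.blockW) % p.strideW == 0);
    assert((p.winH - p.blockH) % p.strideH == 0);
    return static_cast<std::size_t>((p.winW - p.blockW) / p.strideW + 1) *
           static_cast<std::size_t>((p.winH - p.blockH) / p.strideH + 1);
}

// Length of the HOG descriptor of one detection window.
std::size_t descriptorSize(const HogLayout& p) {
    assert(p.blockW % p.cellW == 0 && p.blockH % p.cellH == 0);
    std::size_t cellsPerBlock =
        static_cast<std::size_t>(p.blockW / p.cellW) *
        static_cast<std::size_t>(p.blockH / p.cellH);
    return blocksPerWindow(p) * cellsPerBlock * static_cast<std::size_t>(p.nbins);
}
```

With the default layout, `blocksPerWindow(HogLayout{})` gives 105 and `descriptorSize(HogLayout{})` gives 3780.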

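3. A rough picture of the HOG training flow

Although the HOG source contains very little training code, knowing the training flow helps in understanding the detection process as a whole.

During training, all positive samples have the same size, 128*64, which is the size of the detection window. A sample image may contain one or more pedestrians. The HOG feature extracted from such an image has exactly 3780 dimensions, and each feature vector is trained together with a positive label. In practice we do not search Google for, or photograph, pedestrian images that happen to be exactly 128*64. Instead we collect arbitrary images containing pedestrians (preferably larger than 128*64) and annotate them by hand: draw a rectangle around each pedestrian, which amounts to storing the coordinates of two corner points, and save the rectangle. It is best to write a small program that reads each image, crops the rectangle and rescales it to 128*64. The HOG feature extracted from the processed image can then be used as a positive sample.

Negative samples do not need a uniform size. They only have to be larger than 128*64 and must not contain any pedestrian. Because negative images contain no target, no manual annotation is needed. The program can crop 128*64 patches at random positions and use their HOG features as negative samples.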

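4. The HOG pedestrian detection process

Detection uses the sliding-window method. In this code the sliding-window flow is as follows:

As the figure above shows, detection rescales the input image (usually shrinking it) and slides a fixed-size window (128*64) over the image at every pyramid level. A HOG feature is extracted for every window position and passed to the SVM classifier, which decides whether the window contains a target. If it does, the region is stored; if not, the window keeps sliding.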
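As a rough illustration of the pyramid part of this flow, the sketch below lists the scale factors at which the shrunken image can still hold one detection window. It is a simplified sketch and not the code of detectMultiScale(); `pyramidScales` and its parameters are hypothetical names, and `scale0` stands for the per-level scale step (a value slightly above 1 such as 1.05 is typical).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// List the scale factors for which the image, shrunk by that factor,
// is still at least as large as the detection window.
std::vector<double> pyramidScales(int imgW, int imgH, int winW, int winH,
                                  double scale0, int maxLevels) {
    assert(imgW > 0 && imgH > 0 && winW > 0 && winH > 0);
    std::vector<double> scales;
    double scale = 1.0;
    for (int level = 0; level < maxLevels; ++level) {
        // Stop once one window no longer fits into the scaled image.
        if (std::lround(imgW / scale) < winW || std::lround(imgH / scale) < winH)
            break;
        scales.push_back(scale);
        if (scale0 <= 1.0)  // no shrinking requested: a single level only
            break;
        scale *= scale0;
    }
    return scales;
}
```

For a 128x256 image, a 64x128 window and a step of 2.0 the function returns the two scales 1.0 and 2.0. At each returned scale the fixed-size window is then slid over the resized image.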

The function used for detection is detectMultiScale(). Its parameters are laid out as shown below:

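5. Computing the image gradient inside the detection window

If gamma correction is enabled, it is applied before the gradient is computed. Here gamma correction means taking the square root of every channel value, which maps the range 0~255 to 0~15.97 (the square root of 255). According to the author of HOG, results are better on the corrected image, and no Gaussian smoothing is needed before computing the gradient.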
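The square-root mapping is implemented with a 256-entry lookup table in computeGradient() (see the listing below). A minimal standalone sketch of that table, with the hypothetical name `buildGammaLut`:

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Build the pixel lookup table: square root when gamma correction is on,
// identity otherwise.
std::array<float, 256> buildGammaLut(bool gammaCorrection) {
    std::array<float, 256> lut{};
    for (std::size_t i = 0; i < lut.size(); ++i) {
        float v = static_cast<float>(i);
        lut[i] = gammaCorrection ? std::sqrt(v) : v;
    }
    return lut;
}
```

With correction enabled the entry for 255 is about 15.97; without it the table is the identity, so the same gradient code can be used in both cases.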

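The gradient is computed as a horizontal gradient image and a vertical gradient image, from which magnitude and phase are derived. The horizontal gradient kernel is [-1, 0, 1] (the code computes I(x+1) - I(x-1)).

The vertical gradient kernel is the transpose, [-1, 0, 1]^T (the code computes I(y+1) - I(y-1)).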
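A minimal sketch of these two centred differences on a single-channel float image stored row by row. `reflect101`, `Gradient` and `centralDifference` are hypothetical names; the border handling imitates the BORDER_REFLECT_101 mode that the listing below uses, but this is not the OpenCV implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Map an out-of-range coordinate back into [0, len) by reflection without
// repeating the edge pixel:  gfedcb|abcdefgh|gfedcba
int reflect101(int p, int len) {
    assert(len > 0);
    if (len == 1) return 0;
    while (p < 0 || p >= len)
        p = (p < 0) ? -p : 2 * (len - 1) - p;
    return p;
}

struct Gradient { float dx, dy; };

// Centred differences at pixel (x, y): the kernels [-1, 0, 1] and its transpose.
Gradient centralDifference(const std::vector<float>& img, int width, int height,
                           int x, int y) {
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    auto at = [&](int xx, int yy) {
        std::size_t row = static_cast<std::size_t>(reflect101(yy, height));
        std::size_t col = static_cast<std::size_t>(reflect101(xx, width));
        return img[row * static_cast<std::size_t>(width) + col];
    };
    return { at(x + 1, y) - at(x - 1, y), at(x, y + 1) - at(x, y - 1) };
}
```

On a 3x3 image holding the values 0..8 row by row, the centre pixel gets dx = 5 - 3 = 2 and dy = 7 - 1 = 6. At the corner (0, 0) both neighbours reflect onto the same pixel, so the differences are 0.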

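When reading the source, pay particular attention to how gradient magnitude and angle are stored. They are computed for the image inside one sliding window, so one would expect both to be vectors of 128*64 = 8192 elements. In practice both use 128*64*2 = 16384 elements. Why?

Because both magnitude and angle are stored in bilinearly interpolated form. The gradient angle at a point can be any value between 0 and 180 degrees, and the program quantizes it into 9 bins of 20 degrees each. When the angle of a pixel is quantized, it generally falls between 2 adjacent bins (if it lies exactly at the centre of a bin, that bin simply gets weight 1). The source shows that the gradient magnitude is the voting weight used when building the gradient histogram, so the magnitude of each pixel is split between the 2 bins adjacent to its angle, and the closer bin gets the larger share. That is why the magnitude image has 2 channels: each channel holds one component of the pixel's magnitude. For the same reason the angle image also has 2 channels, which store the indices of the 2 adjacent bins, with the lower neighbouring bin in the first channel.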

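The bilinear interpolation is illustrated below:

Assume the 3 radii are the centres of the quantized bins and the red dashed line is the gradient direction at pixel O (the pixel sits at the centre of the circle), with gradient magnitude A. The nearest lower bin is bin0, and a is the angle between the gradient direction and the centre of bin0, expressed as a fraction of the bin width (so 0 <= a < 1). The gradient magnitude stored for pixel O is then A*(1-a) in channel 1 and A*a in channel 2. The angle stored for pixel O is 0 in channel 1 (bin index 0) and 1 in channel 2 (bin index 1).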
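The vote splitting can be written as a small standalone function. `BinVote` and `quantizeOrientation` are hypothetical names; the arithmetic follows the non-IPP branch of computeGradient() in the listing below, under the assumption that the angle is first brought into [0, 2*pi) as cartToPolar() does.

```cpp
#include <cassert>
#include <cmath>

// The two neighbouring orientation bins of a gradient and the share of the
// magnitude that goes to each of them.
struct BinVote { int bin0, bin1; float w0, w1; };

BinVote quantizeOrientation(float dx, float dy, int nbins = 9) {
    assert(nbins > 0);
    const float pi = 3.14159265358979f;
    float mag = std::sqrt(dx * dx + dy * dy);
    float angle = std::atan2(dy, dx);           // (-pi, pi]
    if (angle < 0) angle += 2 * pi;             // [0, 2*pi)
    float pos = angle * (nbins / pi) - 0.5f;    // position measured in bin widths
    int h = static_cast<int>(std::floor(pos));  // lower neighbouring bin (unwrapped)
    float frac = pos - static_cast<float>(h);   // 0 <= frac < 1
    if (h < 0) h += nbins;                      // wrap into [0, nbins)
    else if (h >= nbins) h -= nbins;
    int h1 = (h + 1 < nbins) ? h + 1 : 0;       // upper neighbour, wrapping to bin 0
    return { h, h1, mag * (1.f - frac), mag * frac };
}
```

A horizontal gradient (dx = 1, dy = 0) lies halfway between the last bin and the first one, so it votes 0.5 into bin 8 and 0.5 into bin 0. The opposite direction (dx = -1) gives the same bins because orientations are unsigned.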

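Also, when the gradient and phase images are computed for a 3-channel image, the gradient is computed separately for each channel, and the channel with the largest gradient magnitude supplies the gradient at that point.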
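A short sketch of that channel selection, with the hypothetical names `Grad` and `strongestChannel`. As in the listing below, squared magnitudes are compared, the comparison starts from the third channel, and a later channel wins only if it is strictly larger.

```cpp
// Per-channel gradient of one pixel.
struct Grad { float dx, dy; };

// Return the gradient of the channel with the largest squared magnitude.
Grad strongestChannel(Grad c0, Grad c1, Grad c2) {
    Grad best = c2;
    float bestMag = best.dx * best.dx + best.dy * best.dy;
    const Grad others[2] = { c1, c0 };
    for (const Grad& g : others) {
        float mag = g.dx * g.dx + g.dy * g.dy;
        if (bestMag < mag) {
            best = g;
            bestMag = mag;
        }
    }
    return best;
}
```

For the channel gradients (1, 0), (0, 3) and (2, 2) the squared magnitudes are 1, 9 and 8, so the second channel is selected.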

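6. The HOG cache structure

The HOG cache is a memory optimization that the author of the program uses to speed up the HOG algorithm. Every input image is scanned at 4 nested levels: the image pyramid levels, the sliding windows in each level, the blocks in each window and the cells in each block (plus, in fact, the pixels in each cell). With that many levels, each of them two-dimensional, a naive implementation is very slow. The author's idea is to store the data computed for the sliding windows (ultimately the HOG descriptor vector of each block) in an in-memory lookup table. Many blocks overlap as the window slides, so the information of a block that has already been computed can be read straight from the table, which saves a lot of time.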
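The caching idea can be illustrated with a deliberately simplified sketch: block histograms are stored per block position and computed only on the first request. This is not the layout OpenCV uses (HOGCache keeps a rolling buffer with flags, see the listing below); `BlockHistogramCache` and `computedBlocks` are hypothetical names, and the 36-element dummy histogram stands for 4 cells * 9 bins.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Memoizes one histogram per block position of a grid.
class BlockHistogramCache {
public:
    using ComputeFn = std::function<std::vector<float>(int, int)>;

    BlockHistogramCache(int blocksX, int blocksY, ComputeFn compute)
        : blocksX_(blocksX),
          cached_(static_cast<std::size_t>(blocksX) * static_cast<std::size_t>(blocksY), false),
          hist_(cached_.size()),
          compute_(std::move(compute)) {}

    // Return the histogram of block (bx, by), computing it on the first request only.
    const std::vector<float>& get(int bx, int by) {
        std::size_t idx = static_cast<std::size_t>(by) * static_cast<std::size_t>(blocksX_)
                        + static_cast<std::size_t>(bx);
        assert(idx < cached_.size());
        if (!cached_[idx]) {
            hist_[idx] = compute_(bx, by);
            cached_[idx] = true;
            ++misses_;
        }
        return hist_[idx];
    }

    int misses() const { return misses_; }

private:
    int blocksX_;
    std::vector<bool> cached_;
    std::vector<std::vector<float>> hist_;
    ComputeFn compute_;
    int misses_ = 0;
};

// Slide nWindows windows horizontally, one block stride apart, and return how
// many block histograms actually had to be computed.
int computedBlocks(int blocksPerWinX, int blocksPerWinY, int nWindows) {
    BlockHistogramCache cache(blocksPerWinX + nWindows - 1, blocksPerWinY,
        [](int, int) { return std::vector<float>(36, 0.f); });  // dummy histogram
    for (int w = 0; w < nWindows; ++w)
        for (int by = 0; by < blocksPerWinY; ++by)
            for (int bx = 0; bx < blocksPerWinX; ++bx)
                cache.get(w + bx, by);
    return cache.misses();
}
```

Two neighbouring default windows (7 x 15 blocks each) request 210 block histograms in total, but only 8 * 15 = 120 of them are computed; the rest come from the cache.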

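This HOG cache structure computes the HOG descriptors inside the sliding window, which involves the relationship between sliding window, block and cell. That relationship is shown in the diagram below:

The outermost rectangle is the image to be tested. A sliding window moves over it to decide whether each window position contains a target. Every sliding window contains many overlapping blocks, and every block contains non-overlapping cells. In addition, because the pixels of a block contribute differently to the cells, the author of the program splits each cell into 4 regions, the smallest boxes drawn with blue dashed lines in the figure.

So how do the different pixels of a block affect its cells (with the default parameters one block has 4 cells)? See the diagram below.

As shown, the black frame is one block and the red solid lines separate its 4 cells. Each cell is divided by green dashed lines into what we call 4 regions, so the block has 16 regions in total, named A, B, C, ..., O, P.

The program divides these 16 regions into 4 groups:

Group 1: A, D, M, P. Pixels in this group contribute only to the histogram of the cell they lie in.

Group 2: B, C, N, O. Pixels in this group contribute to the cell they lie in and to its left or right neighbour.

Group 3: E, I, H, L. Pixels in this group contribute to the cell they lie in and to its upper or lower neighbour.

Group 4: F, G, J, K. Pixels in this group contribute to all 4 cells of the block.

How exactly do they contribute? Take region E as an example: its pixels contribute to cell0 and cell2. One block contributes a 36-dimensional vector to the sliding window, 9 dimensions per cell, in the order cell0, cell1, cell2, cell3. Because pixels in region E contribute to both cell0 and cell2, their gradient votes go not only to their own cell0 but also to cell2 below it. Each of the two votes carries a weight that depends on the distance between the pixel and the centres of cell0 and cell2. The exact relationship can be found in the source code.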
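The weights can be sketched as a standalone function that returns, for one pixel of a block, the cells it votes into and the spatial weight of each vote. `CellVote` and `cellVotes` are hypothetical names; the coordinates and bilinear weights follow the loop in HOGCache::init() in the listing below, and cells are addressed by (column, row) index rather than by the cell numbers used in the figure.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One vote of a pixel: target cell (column, row) inside the block and its weight.
struct CellVote { int cellX, cellY; float weight; };

// Cells that the pixel (px, py) of a block votes into. Neighbouring cells that
// fall outside the block are dropped, so the weights need not sum to 1.
std::vector<CellVote> cellVotes(int px, int py, int cellSize = 8, int cellsPerSide = 2) {
    assert(cellSize > 0 && cellsPerSide > 0);
    // Pixel position in units of cells, measured from the centre of cell 0.
    float cx = (static_cast<float>(px) + 0.5f) / static_cast<float>(cellSize) - 0.5f;
    float cy = (static_cast<float>(py) + 0.5f) / static_cast<float>(cellSize) - 0.5f;
    int x0 = static_cast<int>(std::floor(cx));
    int y0 = static_cast<int>(std::floor(cy));
    float fx = cx - static_cast<float>(x0);
    float fy = cy - static_cast<float>(y0);

    const int xs[2] = { x0, x0 + 1 };
    const int ys[2] = { y0, y0 + 1 };
    const float wx[2] = { 1.f - fx, fx };
    const float wy[2] = { 1.f - fy, fy };

    std::vector<CellVote> votes;
    for (int iy = 0; iy < 2; ++iy)
        for (int ix = 0; ix < 2; ++ix) {
            bool inside = xs[ix] >= 0 && xs[ix] < cellsPerSide &&
                          ys[iy] >= 0 && ys[iy] < cellsPerSide;
            if (inside)
                votes.push_back({ xs[ix], ys[iy], wx[ix] * wy[iy] });
        }
    return votes;
}
```

With the default 16*16 block, the corner pixel (0, 0) yields a single vote, the pixel (1, 5) from the left edge yields two votes for the two cells of the left column, and the central pixel (7, 7) yields four votes whose weights sum to 1.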

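The memory layout of the variables of this structure is shown below, which makes the code easier to follow:

When reading this part of the source, pay particular attention to the following points:

1) The structure BlockData has 2 members, histOfs and imgOffset. One BlockData holds the data of one block. histOfs is the start position, inside the HOG descriptor of the whole sliding window, of the part contributed by this block. imgOffset is the position of the block inside the sliding window (its top-left corner).

2) The structure PixData has 5 members. One PixData holds the data of one pixel of a block. gradOfs is the position of the pixel's gradient magnitude in the gradient-magnitude image of the window. qangleOfs is the position of the pixel's gradient angle in the gradient-angle image of the window. histOfs[] holds the start positions of the HOG sub-vectors of the 1, 2 or 4 cells that the pixel contributes to (rather abstract; it only becomes clear from the source). histWeights[] holds the weights of the pixel's contribution to those 1, 2 or 4 cells. gradWeight expresses that the contribution of a pixel to the gradient histogram also depends on its position inside the block; its value follows a 2-D Gaussian centred on the block centre.

3) count1, count2 and count4 in the program are the numbers of pixels of a block that contribute to 1 cell, 2 cells and 4 cells respectively.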
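The Gaussian behind gradWeight can be reproduced in a few lines. `gaussianBlockWeights` is a hypothetical name; the formula and the default sigma, (blockSize.width + blockSize.height)/8, follow getWinSigma() and HOGCache::init() in the listing below.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major table of Gaussian weights for the pixels of one block.
// A negative winSigma selects the default (blockW + blockH) / 8.
std::vector<float> gaussianBlockWeights(int blockW = 16, int blockH = 16,
                                        double winSigma = -1.0) {
    assert(blockW > 0 && blockH > 0);
    float sigma = static_cast<float>(winSigma >= 0 ? winSigma : (blockW + blockH) / 8.0);
    float scale = 1.f / (sigma * sigma * 2);
    std::vector<float> weights(static_cast<std::size_t>(blockW) * static_cast<std::size_t>(blockH));
    for (int i = 0; i < blockH; ++i)
        for (int j = 0; j < blockW; ++j) {
            float di = static_cast<float>(i) - static_cast<float>(blockH) * 0.5f;
            float dj = static_cast<float>(j) - static_cast<float>(blockW) * 0.5f;
            std::size_t idx = static_cast<std::size_t>(i) * static_cast<std::size_t>(blockW)
                            + static_cast<std::size_t>(j);
            weights[idx] = std::exp(-(di * di + dj * dj) * scale);
        }
    return weights;
}
```

For the default 16*16 block sigma is 4, so the weight is 1 at pixel (8, 8) and exp(-4), about 0.018, at the corner pixel (0, 0).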

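7. Some other functions

The program contains a few more functions.

getBlock() takes the position of a block inside the sliding window and the pointer to the HOG cache of the image, and returns the information needed to compute the HOG feature of that block.

normalizeBlockHistogram() normalizes the partial HOG descriptor obtained for a block. The normalization actually has 2 stages; see the code for details.

windowsInImage() takes the test image size and the window stride and returns how many sliding-window positions there are horizontally and vertically at the current level.

getWindow() returns the rectangle of one sliding window.

compute() is the function that actually computes the HOG descriptor. It is used in both testing and training.

detect() is the function used for detecting targets. It is called inside detectMultiScale().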
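As a sketch of what windowsInImage() and getWindow() compute, the functions below count the window positions for a given stride and turn a linear window index into a rectangle. `Size2` and `Rect4` are hypothetical stand-ins for cv::Size and cv::Rect, and the row-by-row ordering of the window indices is an assumption of this sketch rather than something taken from the listing.

```cpp
#include <cassert>

struct Size2 { int width, height; };
struct Rect4 { int x, y, width, height; };

// Number of window positions horizontally and vertically.
Size2 windowsInImage(Size2 image, Size2 win, Size2 stride) {
    assert(stride.width > 0 && stride.height > 0);
    return { (image.width - win.width) / stride.width + 1,
             (image.height - win.height) / stride.height + 1 };
}

// Rectangle of the idx-th window, counting row by row from the top-left corner.
Rect4 getWindow(Size2 image, Size2 win, Size2 stride, int idx) {
    int nx = windowsInImage(image, win, stride).width;
    assert(nx > 0 && idx >= 0);
    int y = idx / nx;
    int x = idx - nx * y;
    return { x * stride.width, y * stride.height, win.width, win.height };
}
```

For an 80x144 image, a 64x128 window and a stride of 8x8 there are 3 x 3 positions, and window 4 is the one at offset (8, 8).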

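8. Initializing HOG

A HOG object can be initialized by assigning its fields directly, or by reading them from a file node (in a fixed format; the file appears to be XML). Besides reading initial values, we can also set them in the program and write them to a file. The read, write, load and save functions in the source take care of this.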

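9. Annotated HOG source

The source uses Intel's IPP library to speed up the algorithm. When you come across code after #ifdef HAVE_IPP, you can skip it and read the code after #else instead. This does not affect the understanding of the original HOG algorithm.

First, here is the directory of header files used by the HOG source:

Below are my annotations of the HOG source. I have little experience with C++, so some of the comments cover basic C++ syntax; please bear with me. There are also details that I have not understood or may have annotated incorrectly, and comments on them are welcome. Many details only become clear when reading the source itself.

hog.cpp: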

/*M///////////////////////////////////////////////////////////////////////////////////////
//
//  IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
//  By downloading, copying, installing or using the software you agree to this license.
//  If you do not agree to this license, do not download, install,
//  copy or use the software.
//
//
//                           License Agreement
//                For Open Source Computer Vision Library
//
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2009, Willow Garage Inc., all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
//   * Redistribution's of source code must retain the above copyright notice,
//     this list of conditions and the following disclaimer.
//
//   * Redistribution's in binary form must reproduce the above copyright notice,
//     this list of conditions and the following disclaimer in the documentation
//     and/or other materials provided with the distribution.
//
//   * The name of the copyright holders may not be used to endorse or promote products
//     derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/

#include "precomp.hpp"
#include <iterator>
#ifdef HAVE_IPP
#include "ipp.h"
#endif
/****************************************************************************************\
      The code below is implementation of HOG (Histogram-of-Oriented Gradients)
      descriptor and object detection, introduced by Navneet Dalal and Bill Triggs.

      The computed feature vectors are compatible with the
      INRIA Object Detection and Localization Toolkit
      (http://pascal.inrialpes.fr/soft/olt/)
\****************************************************************************************/

namespace cv
{

size_t HOGDescriptor::getDescriptorSize() const
{
    // The two assertions make sure that a block holds a whole number of cells
    // and that the block can be moved a whole number of strides inside the window.
    CV_Assert(blockSize.width % cellSize.width == 0 &&
        blockSize.height % cellSize.height == 0);
    CV_Assert((winSize.width - blockSize.width) % blockStride.width == 0 &&
        (winSize.height - blockSize.height) % blockStride.height == 0 );
    // The returned value is the dimension of the HOG vector of one window:
    // nbins * cells per block * blocks per window (3780 with the default parameters).
    return (size_t)nbins*
        (blockSize.width/cellSize.width)*
        (blockSize.height/cellSize.height)*
        ((winSize.width - blockSize.width)/blockStride.width + 1)*
        ((winSize.height - blockSize.height)/blockStride.height + 1);
}

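// winSigma is the sigma of the Gaussian window that weights the pixels of a block
// (it is used in HOGCache::init() below). A negative value selects the default
// (blockSize.width + blockSize.height)/8.
double HOGDescriptor::getWinSigma() const
{
    return winSigma >= 0 ? winSigma : (blockSize.width + blockSize.height)/8.;
}
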
// svmDetector is a member of HOGDescriptor of type vector.
// It stores the coefficients used when the HOG feature is classified by the SVM.
// The function returns true if the detector is empty, has the same length as the
// HOG descriptor, or is longer by 1. The extra element holds the bias (offset)
// of the linear SVM.
bool HOGDescriptor::checkDetectorSize() const
{
    size_t detectorSize = svmDetector.size(), descriptorSize = getDescriptorSize();
    return detectorSize == 0 ||
        detectorSize == descriptorSize ||
        detectorSize == descriptorSize + 1;
}

void HOGDescriptor::setSVMDetector(InputArray _svmDetector)
{
    // convertTo() converts the element type (depth) of a Mat; it does not change
    // the number of channels. Here the input SVM coefficients are all converted
    // to 32-bit floats.
    _svmDetector.getMat().convertTo(svmDetector, CV_32F);
    CV_Assert( checkDetectorSize() );
}

#define CV_TYPE_NAME_HOG_DESCRIPTOR "opencv-object-detector-hog"

// FileNode is the file-storage node class from the OpenCV core module. A node
// holds one element read from a file, usually a file in XML or YAML format.
// The function reads the contents of the node into member variables of the class,
// so it cannot be declared const.
bool HOGDescriptor::read(FileNode& obj)
{
    // isMap() tells whether the node is a mapping, i.e. whether each of its
    // children is associated with a name. The node to be read must be a mapping.
    if( !obj.isMap() )
        return false;
    // obj["winSize"] returns the child node named winSize; this works because
    // the children of a mapping all have names.
    FileNodeIterator it = obj["winSize"].begin();
    // operator>> reads data from a node; here the values that it points to are
    // read into winSize.width and winSize.height in turn.
    // The following statements work the same way.
    it >> winSize.width >> winSize.height;
    it = obj["blockSize"].begin();
    it >> blockSize.width >> blockSize.height;
    it = obj["blockStride"].begin();
    it >> blockStride.width >> blockStride.height;
    it = obj["cellSize"].begin();
    it >> cellSize.width >> cellSize.height;
    obj["nbins"] >> nbins;
    obj["derivAperture"] >> derivAperture;
    obj["winSigma"] >> winSigma;
    obj["histogramNormType"] >> histogramNormType;
    obj["L2HysThreshold"] >> L2HysThreshold;
    obj["gammaCorrection"] >> gammaCorrection;
    obj["nlevels"] >> nlevels;

    // isSeq() tells whether the node is a sequence
    FileNode vecNode = obj["SVMDetector"];
    if( vecNode.isSeq() )
    {
        vecNode >> svmDetector;
        CV_Assert(checkDetectorSize());
    }
    // everything has been read: report success
    return true;
}

void HOGDescriptor::write(FileStorage& fs, const String& objName) const
{
    // write the name objName to the file storage fs
    if( !objName.empty() )
        fs << objName;

    // The following statements write the fields of the HOG descriptor to fs one
    // by one, each preceded by its name, so the resulting nodes form a mapping.
    fs << "{" CV_TYPE_NAME_HOG_DESCRIPTOR
    << "winSize" << winSize
    << "blockSize" << blockSize
    << "blockStride" << blockStride
    << "cellSize" << cellSize
    << "nbins" << nbins
    << "derivAperture" << derivAperture
    << "winSigma" << getWinSigma()
    << "histogramNormType" << histogramNormType
    << "L2HysThreshold" << L2HysThreshold
    << "gammaCorrection" << gammaCorrection
    << "nlevels" << nlevels;
    if( !svmDetector.empty() )
        // svmDetector is written as a sequence, also under its own name
        fs << "SVMDetector" << "[:" << svmDetector << "]";
    fs << "}";
}

// Read the parameters from the given file.
bool HOGDescriptor::load(const String& filename, const String& objname)
{
    FileStorage fs(filename, FileStorage::READ);
    // A file node has many children and therefore holds a lot of content; here it
    // holds the parameters that HOGDescriptor needs.
    FileNode obj = !objname.empty() ? fs[objname] : fs.getFirstTopLevelNode();
    return read(obj);
}

// Write the parameters of the class to a file as a file node.
void HOGDescriptor::save(const String& filename, const String& objName) const
{
    FileStorage fs(filename, FileStorage::WRITE);
    write(fs, !objName.empty() ? objName : FileStorage::getDefaultObjectName(filename));
}

// Copy this HOG descriptor into c.
void HOGDescriptor::copyTo(HOGDescriptor& c) const
{
    c.winSize = winSize;
    c.blockSize = blockSize;
    c.blockStride = blockStride;
    c.cellSize = cellSize;
    c.nbins = nbins;
    c.derivAperture = derivAperture;
    c.winSigma = winSigma;
    c.histogramNormType = histogramNormType;
    c.L2HysThreshold = L2HysThreshold;
    c.gammaCorrection = gammaCorrection;
    // a vector can also be copied with the assignment operator
    c.svmDetector = svmDetector;
    c.nlevels = nlevels;
}

// Compute the gradient-magnitude image grad and the quantized gradient-orientation
// image qangle of img.
// paddingTL is the amount of padding added at the top-left of img; paddingBR is
// the amount added at the bottom-right.
void HOGDescriptor::computeGradient(const Mat& img, Mat& grad, Mat& qangle,
                                    Size paddingTL, Size paddingBR) const
{
    // Only 8-bit single-channel or 3-channel images are supported.
    CV_Assert( img.type() == CV_8U || img.type() == CV_8UC3 );

    // The outputs are enlarged by the requested padding. This is not the border
    // needed for computing gradients at the image edge (that is handled further
    // down through xmap/ymap). The padding is supplied by the caller, presumably
    // so that windows overlapping the image border can also be evaluated.
    Size gradsize(img.cols + paddingTL.width + paddingBR.width,
                  img.rows + paddingTL.height + paddingBR.height);
    grad.create(gradsize, CV_32FC2);  // <magnitude*(1-alpha), magnitude*alpha>
    qangle.create(gradsize, CV_8UC2); // [0..nbins-1] - quantized gradient orientation
    Size wholeSize;
    Point roiofs;
    // If img is a region of some parent image, locateROI() returns the size of
    // the parent in wholeSize and the position of the top-left corner of img
    // inside the parent in roiofs.
    // For a positive sample the parent is img itself, so wholeSize equals
    // img.size(); for a negative sample the two differ. The relationship is a bit
    // hard to follow, so for now simply read wholeSize as the size of img and
    // roiofs as Point(0, 0).
    img.locateROI(wholeSize, roiofs);

    int i, x, y;
    int cn = img.channels();

    // _lut is a row vector used as a lookup table from pixel value to float
    Mat_<float> _lut(1, 256);
    const float* lut = &_lut(0,0);

    // Gamma correction takes the square root of every pixel value 0~255, which
    // compresses the range and makes the mapping non-linear.
    if( gammaCorrection )
        for( i = 0; i < 256; i++ )
            _lut(0,i) = std::sqrt((float)i);
    else
        for( i = 0; i < 256; i++ )
            _lut(0,i) = (float)i;

    // integer buffer of length gradsize.width + gradsize.height + 4
    AutoBuffer<int> mapbuf(gradsize.width + gradsize.height + 4);
    int* xmap = (int*)mapbuf + 1;
    int* ymap = xmap + gradsize.width + 2;

    // In other words borderType equals 4, because OpenCV defines:
    // #define IPL_BORDER_REFLECT_101    4
    // enum{...,BORDER_REFLECT_101=IPL_BORDER_REFLECT_101,...}
    // borderType selects how the pixels of the added border are filled in.
    /*
    Various border types, image boundaries are denoted with '|'

    * BORDER_REPLICATE:     aaaaaa|abcdefgh|hhhhhhh
    * BORDER_REFLECT:       fedcba|abcdefgh|hgfedcb
    * BORDER_REFLECT_101:   gfedcb|abcdefgh|gfedcba
    * BORDER_WRAP:          cdefgh|abcdefgh|abcdefg
    * BORDER_CONSTANT:      iiiiii|abcdefgh|iiiiiii  with some specified 'i'
    */
    const int borderType = (int)BORDER_REFLECT_101;

    /* int borderInterpolate(int p, int len, int borderType)
       p is a coordinate in the padded image along one axis, len is the length of
       the source image along that axis, and borderType is the border mode
       described above.
       The function maps a coordinate of the padded image to the coordinate of
       the corresponding pixel in the source image.
    */
    // What do xmap and ymap hold? xmap stores, for every column of the padded
    // image, the x coordinate of the corresponding pixel in the original image
    // img. Some entries of xmap are equal, because a padded pixel maps to a
    // position inside img that the pixel of img at that position maps to as well.
    // Likewise ymap stores, for every row of the padded image, the y coordinate
    // of the corresponding pixel in img.
    for( x = -1; x < gradsize.width + 1; x++ )
        xmap[x] = borderInterpolate(x - paddingTL.width + roiofs.x,
                        wholeSize.width, borderType) - roiofs.x;
    for( y = -1; y < gradsize.height + 1; y++ )
        ymap[y] = borderInterpolate(y - paddingTL.height + roiofs.y,
                        wholeSize.height, borderType) - roiofs.y;

    // x- & y- derivatives for the whole row
    int width = gradsize.width;
    AutoBuffer<float> _dbuf(width*4);
    float* dbuf = _dbuf;
    // Dx is the horizontal gradient, Dy the vertical gradient, Mag the gradient
    // magnitude and Angle the gradient angle, each for one image row.
    // The 4th constructor argument is the memory in which the Mat data lives,
    // so these 4 rows are stored contiguously in dbuf.
    Mat Dx(1, width, CV_32F, dbuf);
    Mat Dy(1, width, CV_32F, dbuf + width);
    Mat Mag(1, width, CV_32F, dbuf + width*2);
    Mat Angle(1, width, CV_32F, dbuf + width*3);

    int _nbins = nbins;
    // angleScale == 9/pi with the default nbins
    float angleScale = (float)(_nbins/CV_PI);
#ifdef HAVE_IPP
    Mat lutimg(img.rows,img.cols,CV_MAKETYPE(CV_32F,cn));
    Mat hidxs(1, width, CV_32F);
    Ipp32f* pHidxs  = (Ipp32f*)hidxs.data;
    Ipp32f* pAngles = (Ipp32f*)Angle.data;

    IppiSize roiSize;
    roiSize.width = img.cols;
    roiSize.height = img.rows;

    for( y = 0; y < roiSize.height; y++ )
    {
       const uchar* imgPtr = img.data + y*img.step;
       float* imglutPtr = (float*)(lutimg.data + y*lutimg.step);

       for( x = 0; x < roiSize.width*cn; x++ )
       {
          imglutPtr[x] = lut[imgPtr[x]];
       }
    }

#endif
    for( y = 0; y < gradsize.height; y++ )
    {
#ifdef HAVE_IPP
        const float* imgPtr  = (float*)(lutimg.data + lutimg.step*ymap[y]);
        const float* prevPtr = (float*)(lutimg.data + lutimg.step*ymap[y-1]);
        const float* nextPtr = (float*)(lutimg.data + lutimg.step*ymap[y+1]);
#else
        // imgPtr points to the start of the row of img that row y maps to;
        // prevPtr and nextPtr point to the rows that rows y-1 and y+1 map to.
        const uchar* imgPtr  = img.data + img.step*ymap[y];
        const uchar* prevPtr = img.data + img.step*ymap[y-1];
        const uchar* nextPtr = img.data + img.step*ymap[y+1];
#endif
        float* gradPtr = (float*)grad.ptr(y);
        uchar* qanglePtr = (uchar*)qangle.ptr(y);

        // the input image img has a single channel
        if( cn == 1 )
        {
            for( x = 0; x < width; x++ )
            {
                int x1 = xmap[x];
#ifdef HAVE_IPP
                dbuf[x] = (float)(imgPtr[xmap[x+1]] - imgPtr[xmap[x-1]]);
                dbuf[width + x] = (float)(nextPtr[x1] - prevPtr[x1]);
#else
                // these 2 statements compute Dx and Dy, whose memory lives in dbuf
                dbuf[x] = (float)(lut[imgPtr[xmap[x+1]]] - lut[imgPtr[xmap[x-1]]]);
                dbuf[width + x] = (float)(lut[nextPtr[x1]] - lut[prevPtr[x1]]);
#endif
            }
        }
        // cn == 3: the input image has 3 channels
        else
        {
            for( x = 0; x < width; x++ )
            {
                // x1 is the offset, within a row, of the first channel of source column xmap[x]
                int x1 = xmap[x]*3;
                float dx0, dy0, dx, dy, mag0, mag;
#ifdef HAVE_IPP
                const float* p2 = imgPtr + xmap[x+1]*3;
                const float* p0 = imgPtr + xmap[x-1]*3;

                dx0 = p2[2] - p0[2];
                dy0 = nextPtr[x1+2] - prevPtr[x1+2];
                mag0 = dx0*dx0 + dy0*dy0;

                dx = p2[1] - p0[1];
                dy = nextPtr[x1+1] - prevPtr[x1+1];
                mag = dx*dx + dy*dy;

                if( mag0 < mag )
                {
                    dx0 = dx;
                    dy0 = dy;
                    mag0 = mag;
                }

                dx = p2[0] - p0[0];
                dy = nextPtr[x1] - prevPtr[x1];
                mag = dx*dx + dy*dy;
#else
                // p2 is the address of the pixel at column x+1 of row y
                // p0 is the address of the pixel at column x-1 of row y
                const uchar* p2 = imgPtr + xmap[x+1]*3;
                const uchar* p0 = imgPtr + xmap[x-1]*3;

                // squared magnitude of channel 2
                dx0 = lut[p2[2]] - lut[p0[2]];
                dy0 = lut[nextPtr[x1+2]] - lut[prevPtr[x1+2]];
                mag0 = dx0*dx0 + dy0*dy0;

                // squared magnitude of channel 1
                dx = lut[p2[1]] - lut[p0[1]];
                dy = lut[nextPtr[x1+1]] - lut[prevPtr[x1+1]];
                mag = dx*dx + dy*dy;

                // keep the channel with the larger magnitude
                if( mag0 < mag )
                {
                    dx0 = dx;
                    dy0 = dy;
                    mag0 = mag;
                }

                // squared magnitude of channel 0
                dx = lut[p2[0]] - lut[p0[0]];
                dy = lut[nextPtr[x1]] - lut[prevPtr[x1]];
                mag = dx*dx + dy*dy;
#endif
                // keep the channel with the larger magnitude
                if( mag0 < mag )
                {
                    dx0 = dx;
                    dy0 = dy;
                    mag0 = mag;
                }

                // finally store the horizontal and vertical gradient
                dbuf[x] = dx0;
                dbuf[x+width] = dy0;
            }
        }
#ifdef HAVE_IPP
        ippsCartToPolar_32f((const Ipp32f*)Dx.data, (const Ipp32f*)Dy.data, (Ipp32f*)Mag.data, pAngles, width);
        for( x = 0; x < width; x++ )
        {
           if(pAngles[x] < 0.f)
             pAngles[x] += (Ipp32f)(CV_PI*2.);
        }

        ippsNormalize_32f(pAngles, pAngles, width, 0.5f/angleScale, 1.f/angleScale);
        ippsFloor_32f(pAngles,(Ipp32f*)hidxs.data,width);
        ippsSub_32f_I((Ipp32f*)hidxs.data,pAngles,width);
        ippsMul_32f_I((Ipp32f*)Mag.data,pAngles,width);

        ippsSub_32f_I(pAngles,(Ipp32f*)Mag.data,width);
        ippsRealToCplx_32f((Ipp32f*)Mag.data,pAngles,(Ipp32fc*)gradPtr,width);
#else
        // cartToPolar() computes the magnitude and angle of corresponding elements
        // of 2 matrices. The last argument selects degrees; false here means the
        // angle is returned in radians.
        // If only the magnitude is needed, magnitude() can be used instead.
        // The returned angle lies in [0, 2*pi).
        cartToPolar( Dx, Dy, Mag, Angle, false );
#endif
        for( x = 0; x < width; x++ )
        {
#ifdef HAVE_IPP
            int hidx = (int)pHidxs[x];
#else
            // with the default nbins = 9: -0.5 <= angle < 17.5 (measured in bin widths)
            float mag = dbuf[x+width*2], angle = dbuf[x+width*3]*angleScale - 0.5f;
            // cvFloor() returns the largest integer not greater than its argument,
            // so hidx is in {-1, 0, ..., 17} before the wrap-around below
            int hidx = cvFloor(angle);
            // now 0 <= angle < 1: the distance from the centre of the lower
            // neighbouring bin, as a fraction of the bin width (not in radians)
            angle -= hidx;
            // gradPtr points into the grad image
            // gradPtr[x*2] is the magnitude share of the lower neighbouring bin of the gradient direction at x;
            // gradPtr[x*2+1] is the magnitude share of the upper neighbouring bin
            gradPtr[x*2] = mag*(1.f - angle);
            gradPtr[x*2+1] = mag*angle;
#endif
            if( hidx < 0 )
                hidx += _nbins;
            else if( hidx >= _nbins )
                hidx -= _nbins;
            assert( (unsigned)hidx < (unsigned)_nbins );

            qanglePtr[x*2] = (uchar)hidx;
            hidx++;
            // -1 is 11111111 in two's complement, so AND with -1 leaves the value unchanged;
            // 0 is 00000000, so AND with 0 gives 0 (the bin index wraps around to 0).
            hidx &= hidx < _nbins ? -1 : 0;
            qanglePtr[x*2+1] = (uchar)hidx;
        }
    }
}


struct HOGCache
{
    struct BlockData
    {
        BlockData() : histOfs(0), imgOffset() {}
        int histOfs;
        Point imgOffset;
    };

    struct PixData
    {
        size_t gradOfs, qangleOfs;
        int histOfs[4];
        float histWeights[4];
        float gradWeight;
    };

    HOGCache();
    HOGCache(const HOGDescriptor* descriptor,
        const Mat& img, Size paddingTL, Size paddingBR,
        bool useCache, Size cacheStride);
    virtual ~HOGCache() {};
    virtual void init(const HOGDescriptor* descriptor,
        const Mat& img, Size paddingTL, Size paddingBR,
        bool useCache, Size cacheStride);

    Size windowsInImage(Size imageSize, Size winStride) const;
    Rect getWindow(Size imageSize, Size winStride, int idx) const;

    const float* getBlock(Point pt, float* buf);
    virtual void normalizeBlockHistogram(float* histogram) const;

    vector<PixData> pixData;
    vector<BlockData> blockData;

    bool useCache;
    vector<int> ymaxCached;
    Size winSize, cacheStride;
    Size nblocks, ncells;
    int blockHistogramSize;
    int count1, count2, count4;
    Point imgoffset;
    Mat_<float> blockCache;
    Mat_<uchar> blockCacheFlags;

    Mat grad, qangle;
    const HOGDescriptor* descriptor;
};

// Default constructor: no cache, block histogram size 0, and so on.
HOGCache::HOGCache()
{
    useCache = false;
    blockHistogramSize = count1 = count2 = count4 = 0;
    descriptor = 0;
}

// Constructor with arguments; it delegates the initialization to init().
HOGCache::HOGCache(const HOGDescriptor* _descriptor,
        const Mat& _img, Size _paddingTL, Size _paddingBR,
        bool _useCache, Size _cacheStride)
{
    init(_descriptor, _img, _paddingTL, _paddingBR, _useCache, _cacheStride);
}

// Initialization function of the HOGCache structure.
void HOGCache::init(const HOGDescriptor* _descriptor,
        const Mat& _img, Size _paddingTL, Size _paddingBR,
        bool _useCache, Size _cacheStride)
{
    descriptor = _descriptor;
    cacheStride = _cacheStride;
    useCache = _useCache;

    // First call computeGradient() to obtain the weighted gradient-magnitude
    // image and the quantized angle image of the input image.
    descriptor->computeGradient(_img, grad, qangle, _paddingTL, _paddingBR);
    // imgoffset is a Point and _paddingTL is a Size. The types differ, but both
    // are 2-D coordinates, and OpenCV allows this direct assignment.
    imgoffset = _paddingTL;

    winSize = descriptor->winSize;
    Size blockSize = descriptor->blockSize;
    Size blockStride = descriptor->blockStride;
    Size cellSize = descriptor->cellSize;
    int i, j, nbins = descriptor->nbins;
    // rawBlockSize is the number of pixels in a block
    int rawBlockSize = blockSize.width*blockSize.height;

    // nblocks is a Size whose width and height are the numbers of block positions
    // in one window horizontally and vertically (taking the block stride into account)
    nblocks = Size((winSize.width - blockSize.width)/blockStride.width + 1,
                   (winSize.height - blockSize.height)/blockStride.height + 1);
    // ncells is also a Size; its width and height are the numbers of cells that
    // fit into one block horizontally and vertically
    ncells = Size(blockSize.width/cellSize.width, blockSize.height/cellSize.height);
    // blockHistogramSize is the length of the part of the HOG descriptor
    // contributed by one block
    blockHistogramSize = ncells.width*ncells.height*nbins;

    if( useCache )
    {
        // cacheStride = _cacheStride is passed in as an argument and is the stride
        // of the sliding window.
        // The width and height of cacheSize are the numbers of cached block
        // positions horizontally and vertically in the padded image.
        Size cacheSize((grad.cols - blockSize.width)/cacheStride.width+1,
                       (winSize.height/cacheStride.height)+1);
        // blockCache is a float Mat; note the number of columns
        blockCache.create(cacheSize.height, cacheSize.width*blockHistogramSize);
        // blockCacheFlags is a uchar Mat
        blockCacheFlags.create(cacheSize);
        size_t cacheRows = blockCache.rows;
        // ymaxCached is a vector<int>, so this is the ordinary vector resize():
        // it sets the number of elements to cacheRows
        ymaxCached.resize(cacheRows);
        // initialize all elements of ymaxCached to -1
        for(size_t ii = 0; ii < cacheRows; ii++ )
            ymaxCached[ii] = -1;
    }

    // weights is a 2-D Gaussian table of size blockSize; the code below computes
    // the coefficients of the 2-D Gaussian
    Mat_<float> weights(blockSize);
    float sigma = (float)descriptor->getWinSigma();
    float scale = 1.f/(sigma*sigma*2);

    for(i = 0; i < blockSize.height; i++)
        for(j = 0; j < blockSize.width; j++)
        {
            float di = i - blockSize.height*0.5f;
            float dj = j - blockSize.width*0.5f;
            weights(i,j) = std::exp(-(di*di + dj*dj)*scale);
        }

    // vector<BlockData> blockData; BlockData is a structure nested in HOGCache.
    // nblocks.width*nblocks.height is the number of blocks in one detection window,
    // whereas cacheSize.width*cacheSize.height is the number of cached block
    // positions in the padded image.
    blockData.resize(nblocks.width*nblocks.height);
    // vector<PixData> pixData; PixData is likewise a structure nested in HOGCache.
    // rawBlockSize is the number of pixels in each block.
    // resize() sets the number of elements: 3 sections of rawBlockSize entries each.
    pixData.resize(rawBlockSize*3);

    // Initialize 2 lookup tables, pixData & blockData.
    // Here is why:
    //
    // The detection algorithm runs in 4 nested loops (at each pyramid layer):
    //  loop over the windows within the input image
    //    loop over the blocks within each window
    //      loop over the cells within each block
    //        loop over the pixels in each cell
    //
    // As each of the loops runs over a 2-dimensional array,
    // we could get 8(!) nested loops in total, which is very-very slow.
    //
    // To speed the things up, we do the following:
    //   1. loop over windows is unrolled in the HOGDescriptor::{compute|detect} methods;
    //         inside we compute the current search window using getWindow() method.
    //         Yes, it involves some overhead (function call + couple of divisions),
    //         but it's tiny in fact.
    //   2. loop over the blocks is also unrolled. Inside we use pre-computed blockData[j]
    //         to set up gradient and histogram pointers.
    //   3. loops over cells and pixels in each cell are merged
    //       (since there is no overlap between cells, each pixel in the block is processed once)
    //      and also unrolled. Inside we use PixData[k] to access the gradient values and
    //      update the histogram
    // count1, count2 and count4 are the numbers of pixels of a block that
    // contribute to 1 cell, 2 cells and 4 cells respectively.
    count1 = count2 = count4 = 0;
    for( j = 0; j < blockSize.width; j++ )
        for( i = 0; i < blockSize.height; i++ )
        {
            PixData* data = 0;
            // cellX and cellY are the fractional column and row index of the
            // cell in which the pixel lies inside the block.
            float cellX = (j+0.5f)/cellSize.width - 0.5f;
            float cellY = (i+0.5f)/cellSize.height - 0.5f;
            // cvRound returns the nearest integer; cvFloor the largest integer not
            // greater than the argument; cvCeil the smallest integer not less than it.
            // icellX0 and icellY0 are the integer indices of the lower of the two
            // cells neighbouring the pixel.
            // With the default parameters icellX0 and icellY0 can only be -1, 0 or 1:
            // -1 for i or j in 0..3, 0 for 4..11 and 1 for 12..15
            // (i and j start at 0 and go up to 15).
            int icellX0 = cvFloor(cellX);
            int icellY0 = cvFloor(cellY);
            int icellX1 = icellX0 + 1, icellY1 = icellY0 + 1;
            // From here on cellX and cellY are the differences between the
            // fractional index and the lower neighbouring cell index; they are
            // used below to compute the histogram weights of the same pixel
            // for the different cells.
            cellX -= icellX0;
            cellY -= icellY0;

            // This condition holds only when icellX0 is 0, i.e. when the x
            // coordinate inside the block lies in (3.5, 11.5)
            if( (unsigned)icellX0 < (unsigned)ncells.width &&
                (unsigned)icellX1 < (unsigned)ncells.width )
            {
                // This condition holds only when icellY0 is 0, i.e. when the y
                // coordinate inside the block lies in (3.5, 11.5)
                if( (unsigned)icellY0 < (unsigned)ncells.height &&
                    (unsigned)icellY1 < (unsigned)ncells.height )
                {
                    // Pixels satisfying both conditions contribute to all 4 cells.
                    // rawBlockSize is the number of pixels in one block, and
                    // pixData holds 3 times that many entries:
                    // pixData.resize(rawBlockSize*3);
                    // The first rawBlockSize entries are for pixels that contribute
                    // to only 1 cell of the block, the middle rawBlockSize entries
                    // for pixels that contribute to 2 cells, and the last
                    // rawBlockSize entries for pixels that contribute to all 4 cells.
                    data = &pixData[rawBlockSize*2 + (count4++)];
                    // the result of the next expression is 0
                    data->histOfs[0] = (icellX0*ncells.height + icellY0)*nbins;
                    // weight of this pixel for cell0
                    data->histWeights[0] = (1.f - cellX)*(1.f - cellY);
                    // the result of the next expression is 18
                    data->histOfs[1] = (icellX1*ncells.height + icellY0)*nbins;
                    data->histWeights[1] = cellX*(1.f - cellY);
                    // the result of the next expression is 9
                    data->histOfs[2] = (icellX0*ncells.height + icellY1)*nbins;
                    data->histWeights[2] = (1.f - cellX)*cellY;
                    // the result of the next expression is 27
                    data->histOfs[3] = (icellX1*ncells.height + icellY1)*nbins;
 692                     data->histWeights[3] = cellX*cellY;
 693                 }
 694                 else
 695                    //滿足這個else條件說明icellY0取-1或者1,也就是說block縱座標在(0, 3.5)
 696                 //和(11.5, 15)之間.
 697                 //此時的像素點對相鄰的2個cell有權重貢獻
 698                 {
 699                     data = &pixData[rawBlockSize + (count2++)];                    
 700                     if( (unsigned)icellY0 < (unsigned)ncells.height )
 701                     {
 702                         //(unsigned)-1等於127>2，所以此處滿足if條件時icellY0==1；
 703                         //icellY1==1;
 704                         icellY1 = icellY0;
 705                         cellY = 1.f - cellY;
 706                     }
 707                     //不滿足if條件時，icellY0==-1;icellY1==0;
 708                     //當然了，這2種情況下icellX0==0;icellX1==1;
 709                     data->histOfs[0] = (icellX0*ncells.height + icellY1)*nbins;
 710                     data->histWeights[0] = (1.f - cellX)*cellY;
 711                     data->histOfs[1] = (icellX1*ncells.height + icellY1)*nbins;
 712                     data->histWeights[1] = cellX*cellY;
 713                     data->histOfs[2] = data->histOfs[3] = 0;
 714                     data->histWeights[2] = data->histWeights[3] = 0;
 715                 }
 716             }
 717             //當block中橫座標滿足在(0, 3.5)和(11.5, 15)範圍內時，即
 718             //icellX0==-1或==1
 719             else
 720             {
 721                 
 722                 if( (unsigned)icellX0 < (unsigned)ncells.width )
 723                 {
 724                     //icellX1=icllX0=1;
 725                     icellX1 = icellX0;
 726                     cellX = 1.f - cellX;
 727                 }
 728                 //當icllY0=0時，此時對2個cell有貢獻
 729                 if( (unsigned)icellY0 < (unsigned)ncells.height &&
 730                     (unsigned)icellY1 < (unsigned)ncells.height )
 731                 {                    
 732                     data = &pixData[rawBlockSize + (count2++)];
 733                     data->histOfs[0] = (icellX1*ncells.height + icellY0)*nbins;
 734                     data->histWeights[0] = cellX*(1.f - cellY);
 735                     data->histOfs[1] = (icellX1*ncells.height + icellY1)*nbins;
 736                     data->histWeights[1] = cellX*cellY;
 737                     data->histOfs[2] = data->histOfs[3] = 0;
 738                     data->histWeights[2] = data->histWeights[3] = 0;
 739                 }
 740                 else
 741                 //此時只對自身的cell有貢獻
 742                 {
 743                     data = &pixData[count1++];
 744                     if( (unsigned)icellY0 < (unsigned)ncells.height )
 745                     {
 746                         icellY1 = icellY0;
 747                         cellY = 1.f - cellY;
 748                     }
 749                     data->histOfs[0] = (icellX1*ncells.height + icellY1)*nbins;
 750                     data->histWeights[0] = cellX*cellY;
 751                     data->histOfs[1] = data->histOfs[2] = data->histOfs[3] = 0;
 752                     data->histWeights[1] = data->histWeights[2] = data->histWeights[3] = 0;
 753                 }
 754             }
 755             //爲什麼每個block中i,j位置的gradOfs和qangleOfs都相同且是如下的計算公式呢？
 756             //那是因爲輸入的_img參數不是代表整幅圖片而是檢測窗口大小的圖片，所以每個
 757             //檢測窗口中關於block的信息可以看做是相同的
 758             data->gradOfs = (grad.cols*i + j)*2;
 759             data->qangleOfs = (qangle.cols*i + j)*2;
 760             //每個block中i，j位置的權重都是固定的
 761             data->gradWeight = weights(i,j);
 762         }
 763 
 764     //保證所有的點都被掃描了一遍
 765     assert( count1 + count2 + count4 == rawBlockSize );
 766     // defragment pixData
 767     //將pixData中按照內存排滿，這樣節省了2/3的內存
 768     for( j = 0; j < count2; j++ )
 769         pixData[j + count1] = pixData[j + rawBlockSize];
 770     for( j = 0; j < count4; j++ )
 771         pixData[j + count1 + count2] = pixData[j + rawBlockSize*2];
 772     //此時count2表示至多對2個cell有貢獻的所有像素點的個數
 773     count2 += count1;
 774     //此時count4表示至多對4個cell有貢獻的所有像素點的個數
 775     count4 += count2;
 776 
 777     //上面是初始化pixData,下面開始初始化blockData
 778     // initialize blockData
 779     for( j = 0; j < nblocks.width; j++ )
 780         for( i = 0; i < nblocks.height; i++ )
 781         {
 782             BlockData& data = blockData[j*nblocks.height + i];
 783             //histOfs表示該block對檢測窗口貢獻的hog描述變量起點在整個
 784             //變量中的座標
 785             data.histOfs = (j*nblocks.height + i)*blockHistogramSize;
 786             //imgOffset表示該block的左上角在檢測窗口中的座標
 787             data.imgOffset = Point(j*blockStride.width,i*blockStride.height);
 788         }
 789         //一個檢測窗口對應一個blockData內存，一個block對應一個pixData內存。
 790 }
 791 
 792 
// pt is the top-left corner of the block in image coordinates (imgoffset is added
// below); buf is the buffer that receives the block histogram when the cache is not used.
// The function returns a pointer to the descriptor (histogram) of the block.
const float* HOGCache::getBlock(Point pt, float* buf)
{
    float* blockHist = buf;
    assert(descriptor != 0);

    Size blockSize = descriptor->blockSize;
    pt += imgoffset;

    CV_Assert( (unsigned)pt.x <= (unsigned)(grad.cols - blockSize.width) &&
               (unsigned)pt.y <= (unsigned)(grad.rows - blockSize.height) );

    if( useCache )
    {
        // With the default parameters cacheStride equals blockStride.
        // The assertion guarantees that the requested block lies on the cache grid,
        // i.e. at a position that blocks actually visit while sliding.
        CV_Assert( pt.x % cacheStride.width == 0 &&
                   pt.y % cacheStride.height == 0 );
        // cacheIdx is the position of the block in the cache, counted in blocks.
        Point cacheIdx(pt.x/cacheStride.width,
                      (pt.y/cacheStride.height) % blockCache.rows);
        // ymaxCached has one entry per cache row and records which image row that
        // cache row currently holds. If it is a different row, the row is invalidated.
        if( pt.y != ymaxCached[cacheIdx.y] )
        {
            // Take row cacheIdx.y of blockCacheFlags and set all its flags to 0.
            Mat_<uchar> cacheRow = blockCacheFlags.row(cacheIdx.y);
            cacheRow = (uchar)0;
            ymaxCached[cacheIdx.y] = pt.y;
        }

        // blockHist points to the cached histogram of the block at this position;
        // it holds no valid data until the block has been computed.
        blockHist = &blockCache[cacheIdx.y][cacheIdx.x*blockHistogramSize];
        uchar& computedFlag = blockCacheFlags(cacheIdx.y, cacheIdx.x);
        if( computedFlag != 0 )
            return blockHist;
        computedFlag = (uchar)1; // set it at once, before actual computing
    }

    int k, C1 = count1, C2 = count2, C4 = count4;
    // Pointers to the gradient magnitudes and quantized angles at the block origin
    // (both matrices have 2 channels, hence pt.x*2).
    const float* gradPtr = (const float*)(grad.data + grad.step*pt.y) + pt.x*2;
    const uchar* qanglePtr = qangle.data + qangle.step*pt.y + pt.x*2;

    CV_Assert( blockHist != 0 );
#ifdef HAVE_IPP
    ippsZero_32f(blockHist,blockHistogramSize);
#else
    for( k = 0; k < blockHistogramSize; k++ )
        blockHist[k] = 0.f;
#endif

    const PixData* _pixData = &pixData[0];

    // The first C1 entries are the pixels that contribute only to their own cell.
    for( k = 0; k < C1; k++ )
    {
        const PixData& pk = _pixData[k];
        // a points to the gradient magnitudes
        const float* a = gradPtr + pk.gradOfs;
        float w = pk.gradWeight*pk.histWeights[0];
        // h points to the quantized orientations (bin indices)
        const uchar* h = qanglePtr + pk.qangleOfs;

        // The magnitude has 2 channels because each pixel's magnitude is split
        // between its two neighbouring bins; the orientation has 2 channels
        // because it stores the indices of those two bins.
        int h0 = h[0], h1 = h[1];
        float* hist = blockHist + pk.histOfs[0];
        float t0 = hist[h0] + a[0]*w;
        float t1 = hist[h1] + a[1]*w;
        // hist accumulates the weighted gradient magnitudes
        hist[h0] = t0; hist[h1] = t1;
    }

    for( ; k < C2; k++ )
    {
        const PixData& pk = _pixData[k];
        const float* a = gradPtr + pk.gradOfs;
        float w, t0, t1, a0 = a[0], a1 = a[1];
        const uchar* h = qanglePtr + pk.qangleOfs;
        int h0 = h[0], h1 = h[1];

        // These pixels contribute to 2 cells; this is the contribution to the first one.
        float* hist = blockHist + pk.histOfs[0];
        w = pk.gradWeight*pk.histWeights[0];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        // contribution to the second cell
        hist = blockHist + pk.histOfs[1];
        w = pk.gradWeight*pk.histWeights[1];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;
    }

    // Same as above, for the pixels that contribute to all 4 cells.
    for( ; k < C4; k++ )
    {
        const PixData& pk = _pixData[k];
        const float* a = gradPtr + pk.gradOfs;
        float w, t0, t1, a0 = a[0], a1 = a[1];
        const uchar* h = qanglePtr + pk.qangleOfs;
        int h0 = h[0], h1 = h[1];

        float* hist = blockHist + pk.histOfs[0];
        w = pk.gradWeight*pk.histWeights[0];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        hist = blockHist + pk.histOfs[1];
        w = pk.gradWeight*pk.histWeights[1];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        hist = blockHist + pk.histOfs[2];
        w = pk.gradWeight*pk.histWeights[2];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;

        hist = blockHist + pk.histOfs[3];
        w = pk.gradWeight*pk.histWeights[3];
        t0 = hist[h0] + a0*w;
        t1 = hist[h1] + a1*w;
        hist[h0] = t0; hist[h1] = t1;
    }

    normalizeBlockHistogram(blockHist);

    return blockHist;
}


void HOGCache::normalizeBlockHistogram(float* _hist) const
{
    float* hist = &_hist[0];
#ifdef HAVE_IPP
    size_t sz = blockHistogramSize;
#else
    size_t i, sz = blockHistogramSize;
#endif

    float sum = 0;
#ifdef HAVE_IPP
    ippsDotProd_32f(hist,hist,sz,&sum);
#else
    // First pass: compute the sum of squares.
    for( i = 0; i < sz; i++ )
        sum += hist[i]*hist[i];
#endif
    // The denominator is the square root of the sum of squares plus sz*0.1
    // (3.6 for the default 36-element block histogram), not plus 0.1.
    float scale = 1.f/(std::sqrt(sum)+sz*0.1f), thresh = (float)descriptor->L2HysThreshold;
#ifdef HAVE_IPP
    ippsMulC_32f_I(scale,hist,sz);
    ippsThreshold_32f_I( hist, sz, thresh, ippCmpGreater );
    ippsDotProd_32f(hist,hist,sz,&sum);
#else
    for( i = 0, sum = 0; i < sz; i++ )
    {
        // Scale each value, clip it at thresh, and accumulate the sum of squares
        // of the clipped values for the second normalization.
        hist[i] = std::min(hist[i]*scale, thresh);
        sum += hist[i]*hist[i];
    }
#endif

    scale = 1.f/(std::sqrt(sum)+1e-3f);
#ifdef HAVE_IPP
    ippsMulC_32f_I(scale,hist,sz);
#else
    // Second pass: the final normalized result.
    for( i = 0; i < sz; i++ )
        hist[i] *= scale;
#endif
}


// Returns how many detection windows fit into the test image horizontally and vertically.
Size HOGCache::windowsInImage(Size imageSize, Size winStride) const
{
    return Size((imageSize.width - winSize.width)/winStride.width + 1,
                (imageSize.height - winSize.height)/winStride.height + 1);
}


// Given the image size, the window stride and the index of a detection window in the
// test image, returns the rectangle (position and size) of that detection window.
Rect HOGCache::getWindow(Size imageSize, Size winStride, int idx) const
{
    int nwindowsX = (imageSize.width - winSize.width)/winStride.width + 1;
    int y = idx / nwindowsX;        // quotient: window row
    int x = idx - nwindowsX*y;      // remainder: window column
    return Rect( x*winStride.width, y*winStride.height, winSize.width, winSize.height );
}


void HOGDescriptor::compute(const Mat& img, vector<float>& descriptors,
                            Size winStride, Size padding,
                            const vector<Point>& locations) const
{
    // Size() has both width and height equal to 0.
    if( winStride == Size() )
        winStride = cellSize;
    // gcd is the greatest common divisor; with the default values
    // cacheStride equals both winStride and blockStride.
    Size cacheStride(gcd(winStride.width, blockStride.width),
                     gcd(winStride.height, blockStride.height));
    size_t nwindows = locations.size();
    // alignSize(m, n) returns the smallest multiple of n that is >= m.
    padding.width = (int)alignSize(std::max(padding.width, 0), cacheStride.width);
    padding.height = (int)alignSize(std::max(padding.height, 0), cacheStride.height);
    Size paddedImgSize(img.cols + padding.width*2, img.rows + padding.height*2);

    HOGCache cache(this, img, padding, padding, nwindows == 0, cacheStride);

    if( !nwindows )
        // Size::area() is width*height, here the total number of windows.
        nwindows = cache.windowsInImage(paddedImgSize, winStride).area();

    const HOGCache::BlockData* blockData = &cache.blockData[0];

    int nblocks = cache.nblocks.area();
    int blockHistogramSize = cache.blockHistogramSize;
    size_t dsize = getDescriptorSize(); // length of the descriptor of one window
    // vector::resize() sets the number of elements. Here the output is sized to
    // hold the descriptors of all windows in the image.
    descriptors.resize(dsize*nwindows);

    for( size_t i = 0; i < nwindows; i++ )
    {
        // descriptor points to the start of the descriptor of the i-th window.
        float* descriptor = &descriptors[i*dsize];

        Point pt0;
        // locations were supplied by the caller
        if( !locations.empty() )
        {
            pt0 = locations[i];
            // skip locations whose window falls outside the padded image
            if( pt0.x < -padding.width || pt0.x > img.cols + padding.width - winSize.width ||
                pt0.y < -padding.height || pt0.y > img.rows + padding.height - winSize.height )
                continue;
        }
        // locations is empty: scan all windows
        else
        {
            // pt0 is the top-left corner of the i-th window in the coordinates
            // of the original (unpadded) image.
            pt0 = cache.getWindow(paddedImgSize, winStride, (int)i).tl() - Point(padding);
            CV_Assert(pt0.x % cacheStride.width == 0 && pt0.y % cacheStride.height == 0);
        }

        for( int j = 0; j < nblocks; j++ )
        {
            const HOGCache::BlockData& bj = blockData[j];
            // pt is the top-left corner of the block in image coordinates.
            Point pt = pt0 + bj.imgOffset;

            // dst is where this block's histogram goes within the window descriptor.
            float* dst = descriptor + bj.histOfs;
            const float* src = cache.getBlock(pt, dst);
            if( src != dst )
#ifdef HAVE_IPP
               ippsCopy_32f(src,dst,blockHistogramSize);
#else
                for( int k = 0; k < blockHistogramSize; k++ )
                    dst[k] = src[k];
#endif
        }
    }
}


void HOGDescriptor::detect(const Mat& img,
    vector<Point>& hits, vector<double>& weights, double hitThreshold,
    Size winStride, Size padding, const vector<Point>& locations) const
{
    // hits receives the top-left corners of the windows in which a target was detected.
    hits.clear();
    if( svmDetector.empty() )
        return;

    if( winStride == Size() )
        winStride = cellSize;
    Size cacheStride(gcd(winStride.width, blockStride.width),
                     gcd(winStride.height, blockStride.height));
    size_t nwindows = locations.size();
    padding.width = (int)alignSize(std::max(padding.width, 0), cacheStride.width);
    padding.height = (int)alignSize(std::max(padding.height, 0), cacheStride.height);
    Size paddedImgSize(img.cols + padding.width*2, img.rows + padding.height*2);

    HOGCache cache(this, img, padding, padding, nwindows == 0, cacheStride);

    if( !nwindows )
        nwindows = cache.windowsInImage(paddedImgSize, winStride).area();

    const HOGCache::BlockData* blockData = &cache.blockData[0];

    int nblocks = cache.nblocks.area();
    int blockHistogramSize = cache.blockHistogramSize;
    size_t dsize = getDescriptorSize();

    // If the detector has one more coefficient than the descriptor, the extra one is the bias.
    double rho = svmDetector.size() > dsize ? svmDetector[dsize] : 0;
    vector<float> blockHist(blockHistogramSize);

    for( size_t i = 0; i < nwindows; i++ )
    {
        Point pt0;
        if( !locations.empty() )
        {
            pt0 = locations[i];
            if( pt0.x < -padding.width || pt0.x > img.cols + padding.width - winSize.width ||
                pt0.y < -padding.height || pt0.y > img.rows + padding.height - winSize.height )
                continue;
        }
        else
        {
            pt0 = cache.getWindow(paddedImgSize, winStride, (int)i).tl() - Point(padding);
            CV_Assert(pt0.x % cacheStride.width == 0 && pt0.y % cacheStride.height == 0);
        }
        double s = rho;
        // svmVec starts at the first element of svmDetector
        const float* svmVec = &svmDetector[0];
#ifdef HAVE_IPP
        int j;
#else
        int j, k;
#endif
        for( j = 0; j < nblocks; j++, svmVec += blockHistogramSize )
        {
            const HOGCache::BlockData& bj = blockData[j];
            Point pt = pt0 + bj.imgOffset;

            // vec points to the histogram of the block at pt in the test image.
            const float* vec = cache.getBlock(pt, &blockHist[0]);
#ifdef HAVE_IPP
            Ipp32f partSum;
            ippsDotProd_32f(vec,svmVec,blockHistogramSize,&partSum);
            s += (double)partSum;
#else
            // Dot product of the block histogram with the matching slice of the SVM
            // coefficients, unrolled by 4 (svmVec advances by one block per iteration).
            for( k = 0; k <= blockHistogramSize - 4; k += 4 )
                s += vec[k]*svmVec[k] + vec[k+1]*svmVec[k+1] +
                    vec[k+2]*svmVec[k+2] + vec[k+3]*svmVec[k+3];
            for( ; k < blockHistogramSize; k++ )
                s += vec[k]*svmVec[k];
#endif
        }
        if( s >= hitThreshold )
        {
            hits.push_back(pt0);
            weights.push_back(s);
        }
    }
}

// Overload that does not return the confidence (weight) of each detection.
void HOGDescriptor::detect(const Mat& img, vector<Point>& hits, double hitThreshold,
                           Size winStride, Size padding, const vector<Point>& locations) const
{
    vector<double> weightsV;
    detect(img, hits, weightsV, hitThreshold, winStride, padding, locations);
}

struct HOGInvoker
{
    HOGInvoker( const HOGDescriptor* _hog, const Mat& _img,
                double _hitThreshold, Size _winStride, Size _padding,
                const double* _levelScale, ConcurrentRectVector* _vec,
                ConcurrentDoubleVector* _weights=0, ConcurrentDoubleVector* _scales=0 )
    {
        hog = _hog;
        img = _img;
        hitThreshold = _hitThreshold;
        winStride = _winStride;
        padding = _padding;
        levelScale = _levelScale;
        vec = _vec;
        weights = _weights;
        scales = _scales;
    }

    void operator()( const BlockedRange& range ) const
    {
        int i, i1 = range.begin(), i2 = range.end();
        double minScale = i1 > 0 ? levelScale[i1] : i2 > 1 ? levelScale[i1+1] : std::max(img.cols, img.rows);
        // Size of the largest scaled image needed in this range; one buffer of
        // this size is allocated and reused for every level.
        Size maxSz(cvCeil(img.cols/minScale), cvCeil(img.rows/minScale));
        Mat smallerImgBuf(maxSz, img.type());
        vector<Point> locations;
        vector<double> hitsWeights;

        for( i = i1; i < i2; i++ )
        {
            double scale = levelScale[i];
            Size sz(cvRound(img.cols/scale), cvRound(img.rows/scale));
            // smallerImg is only a header over smallerImgBuf; no data is copied.
            Mat smallerImg(sz, img.type(), smallerImgBuf.data);
            // no scaling at this level: point at the original image data
            if( sz == img.size() )
                smallerImg = Mat(sz, img.type(), img.data, img.step);
            // otherwise resize the image into the buffer
            else
                resize(img, smallerImg, sz);
            // detect() returns its results in locations and hitsWeights;
            // locations holds the top-left corners of the detected windows.
            hog->detect(smallerImg, locations, hitsWeights, hitThreshold, winStride, padding);
            Size scaledWinSize = Size(cvRound(hog->winSize.width*scale), cvRound(hog->winSize.height*scale));
            for( size_t j = 0; j < locations.size(); j++ )
            {
                // store the detected region, mapped back to the original image scale
                vec->push_back(Rect(cvRound(locations[j].x*scale),
                                    cvRound(locations[j].y*scale),
                                    scaledWinSize.width, scaledWinSize.height));
                // store the scale of this level
                if (scales) {
                    scales->push_back(scale);
                }
            }
            // store the SVM scores
            if (weights && (!hitsWeights.empty()))
            {
                for (size_t j = 0; j < locations.size(); j++)
                {
                    weights->push_back(hitsWeights[j]);
                }
            }
        }
    }

    const HOGDescriptor* hog;
    Mat img;
    double hitThreshold;
    Size winStride;
    Size padding;
    const double* levelScale;
    //typedef tbb::concurrent_vector<Rect> ConcurrentRectVector;
    ConcurrentRectVector* vec;
    //typedef tbb::concurrent_vector<double> ConcurrentDoubleVector;
    ConcurrentDoubleVector* weights;
    ConcurrentDoubleVector* scales;
};


void HOGDescriptor::detectMultiScale(
    const Mat& img, vector<Rect>& foundLocations, vector<double>& foundWeights,
    double hitThreshold, Size winStride, Size padding,
    double scale0, double finalThreshold, bool useMeanshiftGrouping) const
{
    double scale = 1.;
    int levels = 0;

    vector<double> levelScale;

    // nlevels defaults to 64
    for( levels = 0; levels < nlevels; levels++ )
    {
        levelScale.push_back(scale);
        if( cvRound(img.cols/scale) < winSize.width ||
            cvRound(img.rows/scale) < winSize.height ||
            scale0 <= 1 )
            break;
        // only levels at which the scaled image is still at least as large
        // as the detection window are kept
        scale *= scale0;
    }
    levels = std::max(levels, 1);
    levelScale.resize(levels);

    ConcurrentRectVector allCandidates;
    ConcurrentDoubleVector tempScales;
    ConcurrentDoubleVector tempWeights;
    vector<double> foundScales;

    // process the levels in parallel with TBB
    parallel_for(BlockedRange(0, (int)levelScale.size()),
                 HOGInvoker(this, img, hitThreshold, winStride, padding, &levelScale[0], &allCandidates, &tempWeights, &tempScales));
    // copy tempScales into foundScales; back_inserter appends at the end of the container
    std::copy(tempScales.begin(), tempScales.end(), back_inserter(foundScales));
    // clear() removes all elements from the container
    foundLocations.clear();
    // store the candidate windows in foundLocations
    std::copy(allCandidates.begin(), allCandidates.end(), back_inserter(foundLocations));
    foundWeights.clear();
    // store the candidate confidences in foundWeights
    std::copy(tempWeights.begin(), tempWeights.end(), back_inserter(foundWeights));

    if ( useMeanshiftGrouping )
    {
        groupRectangles_meanshift(foundLocations, foundWeights, foundScales, finalThreshold, winSize);
    }
    else
    {
        // cluster the candidate rectangles
        groupRectangles(foundLocations, (int)finalThreshold, 0.2);
    }
}

// Overload that does not return the confidence of each detection.
void HOGDescriptor::detectMultiScale(const Mat& img, vector<Rect>& foundLocations,
                                     double hitThreshold, Size winStride, Size padding,
                                     double scale0, double finalThreshold, bool useMeanshiftGrouping) const
{
    vector<double> foundWeights;
    detectMultiScale(img, foundLocations, foundWeights, hitThreshold, winStride,
                     padding, scale0, finalThreshold, useMeanshiftGrouping);
}

typedef RTTIImpl<HOGDescriptor> HOGRTTI;

CvType hog_type( CV_TYPE_NAME_HOG_DESCRIPTOR, HOGRTTI::isInstance,
                 HOGRTTI::release, HOGRTTI::read, HOGRTTI::write, HOGRTTI::clone);

vector<float> HOGDescriptor::getDefaultPeopleDetector()
{
    static const float detector[] = {
       0.05359386f, -0.14721455f, -0.05532170f, 0.05077307f,
       0.11547081f, -0.04268804f, 0.04635834f, ........
  };
    // return a vector built from the whole detector array
    return vector<float>(detector, detector + sizeof(detector)/sizeof(detector[0]));
}
//This function returns 1981 SVM coeffs obtained from daimler's base.
//To use these coeffs the detection window size should be (48,96)
vector<float> HOGDescriptor::getDaimlerPeopleDetector()
{
    static const float detector[] = {
        0.294350f, -0.098796f, -0.129522f, 0.078753f,
        0.387527f, 0.261529f, 0.145939f, 0.061520f,
      ........
        };
        // return a vector built from the whole detector array
        return vector<float>(detector, detector + sizeof(detector)/sizeof(detector[0]));
}

}
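
To make the count1/count2/count4 bookkeeping in HOGCache::init more concrete, here is a minimal standalone sketch (not OpenCV code) that classifies every pixel of a block by the number of cells it votes into, using the same cell-index arithmetic as the listing. The names `Counts`, `hasTwoCells` and `countCellContributions` are hypothetical, and the defaults assume a 16x16 block made of 8x8 cells.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type for illustration; not part of OpenCV.
struct Counts { int c1 = 0, c2 = 0, c4 = 0; };

// True when both neighbouring cell indices along one axis lie inside [0, ncells).
static bool hasTwoCells(int pos, int cellSize, int ncells)
{
    float c = (pos + 0.5f) / cellSize - 0.5f;   // fractional cell coordinate
    int i0 = static_cast<int>(std::floor(c));   // lower neighbouring cell (may be -1)
    int i1 = i0 + 1;                            // upper neighbouring cell (may be ncells)
    return i0 >= 0 && i1 < ncells;
}

// Counts the pixels of a square block that contribute to 1, 2 and 4 cells.
Counts countCellContributions(int blockSize = 16, int cellSize = 8)
{
    assert(blockSize % cellSize == 0);
    const int ncells = blockSize / cellSize;
    Counts n;
    for (int j = 0; j < blockSize; ++j)
        for (int i = 0; i < blockSize; ++i)
        {
            bool x2 = hasTwoCells(j, cellSize, ncells);
            bool y2 = hasTwoCells(i, cellSize, ncells);
            if (x2 && y2)      ++n.c4;   // interpolated in both directions
            else if (x2 || y2) ++n.c2;   // interpolated in one direction only
            else               ++n.c1;   // corner region: own cell only
        }
    return n;
}
```

With the default sizes this yields 64, 128 and 64 pixels, which add up to the 256 pixels of the block, in line with the assertion `count1 + count2 + count4 == rawBlockSize` in the listing.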

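The two-pass normalization performed by normalizeBlockHistogram (L2-Hys) may be easier to follow outside the IPP/non-IPP branches. The following is a minimal sketch on a `std::vector<float>`, not the OpenCV implementation itself; the function name `l2HysNormalize` is hypothetical, and the default threshold of 0.2 is taken from the `L2HysThreshold` default of HOGDescriptor.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Standalone sketch of L2-Hys normalization; the name is hypothetical.
void l2HysNormalize(std::vector<float>& hist, float thresh = 0.2f)
{
    assert(!hist.empty());
    const float sz = static_cast<float>(hist.size());

    // Pass 1: L2 norm, with sz*0.1 added to the denominator as in the listing.
    float sum = 0.f;
    for (float v : hist) sum += v * v;
    float scale = 1.f / (std::sqrt(sum) + sz * 0.1f);

    // Scale, clip at thresh, and accumulate the sum of squares of the clipped values.
    sum = 0.f;
    for (float& v : hist)
    {
        v = std::min(v * scale, thresh);
        sum += v * v;
    }

    // Pass 2: renormalize the clipped histogram.
    scale = 1.f / (std::sqrt(sum) + 1e-3f);
    for (float& v : hist) v *= scale;
}
```

The clipping step limits the influence of a single very strong gradient: a histogram with one dominant bin ends up with that bin capped before the final renormalization.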
The HOG-related part of objdetect.hpp:

//////////////// HOG (Histogram-of-Oriented-Gradients) Descriptor and Object Detector //////////////

struct CV_EXPORTS_W HOGDescriptor
{
public:
    enum { L2Hys=0 };
    enum { DEFAULT_NLEVELS=64 };

    CV_WRAP HOGDescriptor() : winSize(64,128), blockSize(16,16), blockStride(8,8),
        cellSize(8,8), nbins(9), derivAperture(1), winSigma(-1),
        histogramNormType(HOGDescriptor::L2Hys), L2HysThreshold(0.2), gammaCorrection(true),
        nlevels(HOGDescriptor::DEFAULT_NLEVELS)
    {}

    // The constructor arguments are passed to the members through the initializer
    // list after the colon, so each member is initialized directly when it is
    // constructed instead of being assigned later in the constructor body.
    CV_WRAP HOGDescriptor(Size _winSize, Size _blockSize, Size _blockStride,
                  Size _cellSize, int _nbins, int _derivAperture=1, double _winSigma=-1,
                  int _histogramNormType=HOGDescriptor::L2Hys,
                  double _L2HysThreshold=0.2, bool _gammaCorrection=false,
                  int _nlevels=HOGDescriptor::DEFAULT_NLEVELS)
    : winSize(_winSize), blockSize(_blockSize), blockStride(_blockStride), cellSize(_cellSize),
    nbins(_nbins), derivAperture(_derivAperture), winSigma(_winSigma),
    histogramNormType(_histogramNormType), L2HysThreshold(_L2HysThreshold),
    gammaCorrection(_gammaCorrection), nlevels(_nlevels)
    {}

    // The descriptor can also be initialized from a file.
    CV_WRAP HOGDescriptor(const String& filename)
    {
        load(filename);
    }

    HOGDescriptor(const HOGDescriptor& d)
    {
        d.copyTo(*this);
    }

    virtual ~HOGDescriptor() {}

    // size_t is an unsigned integer type (unsigned long on many platforms).
    CV_WRAP size_t getDescriptorSize() const;
    CV_WRAP bool checkDetectorSize() const;
    CV_WRAP double getWinSigma() const;

    // virtual functions are dispatched polymorphically when called through
    // a pointer or a reference.
    CV_WRAP virtual void setSVMDetector(InputArray _svmdetector);

    virtual bool read(FileNode& fn);
    virtual void write(FileStorage& fs, const String& objname) const;

    CV_WRAP virtual bool load(const String& filename, const String& objname=String());
    CV_WRAP virtual void save(const String& filename, const String& objname=String()) const;
    virtual void copyTo(HOGDescriptor& c) const;

    CV_WRAP virtual void compute(const Mat& img,
                         CV_OUT vector<float>& descriptors,
                         Size winStride=Size(), Size padding=Size(),
                         const vector<Point>& locations=vector<Point>()) const;
    //with found weights output
    CV_WRAP virtual void detect(const Mat& img, CV_OUT vector<Point>& foundLocations,
                        CV_OUT vector<double>& weights,
                        double hitThreshold=0, Size winStride=Size(),
                        Size padding=Size(),
                        const vector<Point>& searchLocations=vector<Point>()) const;
    //without found weights output
    virtual void detect(const Mat& img, CV_OUT vector<Point>& foundLocations,
                        double hitThreshold=0, Size winStride=Size(),
                        Size padding=Size(),
                        const vector<Point>& searchLocations=vector<Point>()) const;
    //with result weights output
    CV_WRAP virtual void detectMultiScale(const Mat& img, CV_OUT vector<Rect>& foundLocations,
                                  CV_OUT vector<double>& foundWeights, double hitThreshold=0,
                                  Size winStride=Size(), Size padding=Size(), double scale=1.05,
                                  double finalThreshold=2.0,bool useMeanshiftGrouping = false) const;
    //without found weights output
    virtual void detectMultiScale(const Mat& img, CV_OUT vector<Rect>& foundLocations,
                                  double hitThreshold=0, Size winStride=Size(),
                                  Size padding=Size(), double scale=1.05,
                                  double finalThreshold=2.0, bool useMeanshiftGrouping = false) const;

    CV_WRAP virtual void computeGradient(const Mat& img, CV_OUT Mat& grad, CV_OUT Mat& angleOfs,
                                 Size paddingTL=Size(), Size paddingBR=Size()) const;

    CV_WRAP static vector<float> getDefaultPeopleDetector();
    CV_WRAP static vector<float> getDaimlerPeopleDetector();

    CV_PROP Size winSize;
    CV_PROP Size blockSize;
    CV_PROP Size blockStride;
    CV_PROP Size cellSize;
    CV_PROP int nbins;
    CV_PROP int derivAperture;
    CV_PROP double winSigma;
    CV_PROP int histogramNormType;
    CV_PROP double L2HysThreshold;
    CV_PROP bool gammaCorrection;
    CV_PROP vector<float> svmDetector;
    CV_PROP int nlevels;
};
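
The members declared above determine the length of the descriptor of one detection window. As a rough check of the numbers quoted in this article, the sketch below repeats the arithmetic (bins per cell, times cells per block, times blocks per window) in standalone form. `Sz` and `hogDescriptorSize` are hypothetical names used only for illustration, and the Daimler case assumes the same 16x16 block, 8x8 stride, 8x8 cell and 9 bins as the defaults.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cv::Size, for illustration only.
struct Sz { int width, height; };

// Descriptor length of one window: nbins * cells per block * blocks per window.
std::size_t hogDescriptorSize(Sz win, Sz block, Sz stride, Sz cell, int nbins)
{
    // The block grid is assumed to tile the window exactly.
    assert((win.width - block.width) % stride.width == 0);
    assert((win.height - block.height) % stride.height == 0);
    std::size_t cellsPerBlock =
        static_cast<std::size_t>((block.width / cell.width) * (block.height / cell.height));
    std::size_t blocksX =
        static_cast<std::size_t>((win.width - block.width) / stride.width + 1);
    std::size_t blocksY =
        static_cast<std::size_t>((win.height - block.height) / stride.height + 1);
    return static_cast<std::size_t>(nbins) * cellsPerBlock * blocksX * blocksY;
}
```

For the default 64x128 window this gives 7 x 15 = 105 blocks and 105 x 4 x 9 = 3780 values. For a 48x96 window it gives 1980, which is consistent with the 1981 Daimler coefficients if the last one is the bias that detect() reads from `svmDetector[dsize]`.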

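10. Summary

The authors of this source code use several optimizations to speed up the algorithm, such as the cached lookup tables discussed above (the precomputed pixData and blockData arrays and the block cache). The program also uses Intel's multithreading library, TBB, to run the detection in parallel.

Reading the source took some time, but it was well worth it, and I have a lot of respect for the people who wrote this code.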
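
The TBB parallelism mentioned above runs over the levels of the image pyramid, so it helps to see how many levels there are for a given image. The sketch below reproduces the scale loop of detectMultiScale in standalone form; `pyramidScales` is a hypothetical name, and `std::lround` is used here as a stand-in for `cvRound` (the two may differ for exact .5 values).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of the scale-level loop in detectMultiScale; not OpenCV code.
std::vector<double> pyramidScales(int imgW, int imgH, int winW, int winH,
                                  double scale0, int nlevels = 64)
{
    assert(nlevels > 0);
    std::vector<double> levelScale;
    double scale = 1.0;
    int levels = 0;
    for (levels = 0; levels < nlevels; ++levels)
    {
        levelScale.push_back(scale);
        // Stop once the scaled image is smaller than the window,
        // or when scale0 <= 1 (single-scale detection).
        if (std::lround(imgW / scale) < winW ||
            std::lround(imgH / scale) < winH ||
            scale0 <= 1)
            break;
        scale *= scale0;
    }
    // The scale that triggered the break is dropped, but at least one level is kept.
    levels = std::max(levels, 1);
    levelScale.resize(levels);
    return levelScale;
}
```

For example, a 128x256 image with a 64x128 window and scale0 = 2 yields the two scales 1 and 2, while any scale0 <= 1 yields the single scale 1.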
