Articles in the KAZE series:
1. OpenCV Study Notes (27): KAZE Algorithm Principles and Source Code Analysis, Part 1: Nonlinear Diffusion Filtering
2. OpenCV Study Notes (28): KAZE Algorithm Principles and Source Code Analysis, Part 2: Building the Nonlinear Scale Space
3. OpenCV Study Notes (29): KAZE Algorithm Principles and Source Code Analysis, Part 3: Feature Detection and Description
4. OpenCV Study Notes (30): KAZE Algorithm Principles and Source Code Analysis, Part 4: Performance Analysis and Comparison of KAZE Features
5. OpenCV Study Notes (31): KAZE Algorithm Principles and Source Code Analysis, Part 5: Performance Optimization of KAZE and Comparison with SIFT
==================================================================================================
KAZE resources:
1. Paper: http://www.robesafe.com/personal/pablo.alcantarilla/papers/Alcantarilla12eccv.pdf
2. Project page: http://www.robesafe.com/personal/pablo.alcantarilla/kaze.html
3. Author's code: http://www.robesafe.com/personal/pablo.alcantarilla/code/kaze_features_1_4.tar
(requires the Boost library; its timing functions are also rather cumbersome to use and can be replaced with OpenCV's cv::getTickCount)
4. Benchmark by Computer Vision Talks: http://computer-vision-talks.com/2013/03/porting-kaze-features-to-opencv/
5. Ievgen Khvedchenia, the author of Computer Vision Talks, integrated KAZE into OpenCV's cv::Feature2D class, but his version requires rebuilding OpenCV and supports neither tuning the algorithm parameters nor filtering keypoints by mask: https://github.com/BloodAxe/opencv/tree/kaze-features
6. I extracted the KAZE code from Ievgen's repository and wrapped it in a class derived from cv::Feature2D; it requires no OpenCV rebuild and supports parameter tuning and mask filtering: https://github.com/yuhuazou/kaze_opencv (updated 2013-03-28 with an optimized KAZE implementation)
7. A Matlab interface wrapping version 1.0 of the KAZE code: https://github.com/vlfeat/vlbenchmarks/blob/unstable/%2BlocalFeatures/Kaze.m
==================================================================================================
2.2 KAZE Feature Detection and Description
The KAZE feature detection pipeline is roughly as follows:
1) Build the nonlinear scale space using the AOS scheme and Variable Conductance Diffusion ([4, 5]).
2) Detect interest points as the local maxima (over a 3×3 neighbourhood) of the scale-normalized determinant of the Hessian, computed across the nonlinear scale space.
3) Compute the dominant orientation of each keypoint and extract a scale- and rotation-invariant descriptor from the first-order image derivatives.
2.2.1 Construction of the Nonlinear Scale Space
The construction of the KAZE scale space is similar to SIFT's. Scale levels increase logarithmically; there are O octaves, each containing S sub-levels. Unlike SIFT, where each new octave is obtained by downsampling, every KAZE level keeps the same resolution as the original image. Octaves and sub-levels are indexed by o and s, and correspond to the scale parameter σ as follows:

σi(o, s) = σ0 · 2^(o + s/S),  o ∈ [0, ..., O−1],  s ∈ [0, ..., S−1],  i ∈ [0, ..., N]
where σ0 is the base value of the scale parameter and N = O×S is the total number of images in the scale space. As explained in the previous article, the nonlinear diffusion model evolves in time units, so the pixel-unit scale parameters σi must be converted to time units. In the Gaussian case, convolving an image with a Gaussian kernel of standard deviation σ (in pixels) is equivalent to filtering the image for a time t = σ²/2. This gives the mapping from each σi to its time unit:

ti = σi² / 2
ti is called the evolution time. Note that this mapping is used only to obtain a set of evolution times from which the nonlinear scale space is built. In general, the filtered image at time ti in the nonlinear scale space does not correspond to convolving the original image with a Gaussian of standard deviation σi. However, if the conductivity function g is held constant at 1, the nonlinear scale space reduces exactly to the Gaussian scale space. Moreover, as the scale level increases, the conductivity value at most pixels tends towards a constant, except at the edge pixels lying on object contours.
Given an input image, KAZE first smooths it with a Gaussian filter; it then computes the image gradient histogram to obtain the contrast parameter k; finally, given the set of evolution times, the AOS scheme yields all images of the nonlinear scale space:

L^(i+1) = [ I − (t_(i+1) − t_i) · Σ_l A_l(L^i) ]^(−1) · L^i
Implementation details
In the implementation, the following struct, tevolution, holds the data associated with each level of the scale space:
typedef struct
{
    cv::Mat Lx, Ly;         // First-order spatial derivatives
    cv::Mat Lxx, Lxy, Lyy;  // Second-order spatial derivatives
    cv::Mat Lflow;          // Diffusivity image
    cv::Mat Lt;             // Evolution image
    cv::Mat Lsmooth;        // Smoothed image
    cv::Mat Lstep;          // Evolution step update (!! never actually used !!)
    cv::Mat Ldet;           // Detector response
    float etime;            // Evolution time
    float esigma;           // Evolution sigma (for linear diffusion, t = sigma^2 / 2)
    float octave;           // Image octave
    float sublevel;         // Image sublevel within the octave
    int sigma_size;         // Integer-rounded esigma, used to compute the feature detector responses
} tevolution;
The structure is initialized as follows:
//*******************************************************************************
//*******************************************************************************
/**
* @brief This method allocates the memory for the nonlinear diffusion evolution
*/
void KAZE::Allocate_Memory_Evolution(void)
{
    // Allocate the dimension of the matrices for the evolution
    for( int i = 0; i <= omax-1; i++ )
    {
        for( int j = 0; j <= nsublevels-1; j++ )
        {
            tevolution aux;
            aux.Lx = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Ly = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Lxx = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Lxy = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Lyy = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Lflow = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Lt = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Lsmooth = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Lstep = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.Ldet = cv::Mat::zeros(img_height,img_width,CV_32F);
            aux.esigma = soffset*pow((float)2.0,(float)(j)/(float)(nsublevels) + i);
            aux.etime = 0.5*(aux.esigma*aux.esigma);
            aux.sigma_size = fRound(aux.esigma);
            aux.octave = i;
            aux.sublevel = j;
            evolution.push_back(aux);
        }
    }

    // Allocate memory for the auxiliary variables that are used in the AOS scheme
    Ltx = cv::Mat::zeros(img_width,img_height,CV_32F);  // note the transposed dimensions
    Lty = cv::Mat::zeros(img_height,img_width,CV_32F);
    px = cv::Mat::zeros(img_height,img_width,CV_32F);
    py = cv::Mat::zeros(img_height,img_width,CV_32F);
    ax = cv::Mat::zeros(img_height,img_width,CV_32F);
    ay = cv::Mat::zeros(img_height,img_width,CV_32F);
    bx = cv::Mat::zeros(img_height-1,img_width,CV_32F);
    by = cv::Mat::zeros(img_height-1,img_width,CV_32F);
    qr = cv::Mat::zeros(img_height-1,img_width,CV_32F);
    qc = cv::Mat::zeros(img_height,img_width-1,CV_32F);
}
Note that among the fields above, esigma, etime, sigma_size, octave and sublevel are fixed at initialization and never change afterwards. Once initialization is complete, the nonlinear scale space is constructed by the following function:
//*******************************************************************************
//*******************************************************************************
/**
* @brief This method creates the nonlinear scale space for a given image
* @param img Input image for which the nonlinear scale space needs to be created
* @return 0 if the nonlinear scale space was created successfully. -1 otherwise
*/
int KAZE::Create_Nonlinear_Scale_Space(const cv::Mat &img)
{
    if( verbosity == true )
    {
        std::cout << "\n> Creating nonlinear scale space." << std::endl;
    }

    double t2 = 0.0, t1 = 0.0;

    if( evolution.size() == 0 )
    {
        std::cout << "---> Error generating the nonlinear scale space!!" << std::endl;
        std::cout << "---> Firstly you need to call KAZE::Allocate_Memory_Evolution()" << std::endl;
        return -1;
    }

    int64 start_t1 = cv::getTickCount();

    // Copy the original image to the first level of the evolution
    if( verbosity == true )
    {
        std::cout << "-> Perform the Gaussian smoothing." << std::endl;
    }
    img.copyTo(evolution[0].Lt);
    Gaussian_2D_Convolution(evolution[0].Lt,evolution[0].Lt,0,0,soffset);
    Gaussian_2D_Convolution(evolution[0].Lt,evolution[0].Lsmooth,0,0,sderivatives);

    // Firstly compute the kcontrast factor
    Compute_KContrast(evolution[0].Lt,KCONTRAST_PERCENTILE);
    t2 = cv::getTickCount();
    tkcontrast = 1000.0 * (t2 - start_t1) / cv::getTickFrequency();

    if( verbosity == true )
    {
        std::cout << "-> Computed K-contrast factor. Execution time (ms): " << tkcontrast << std::endl;
        std::cout << "-> Now computing the nonlinear scale space!!" << std::endl;
    }

    // Now generate the rest of evolution levels
    for( unsigned int i = 1; i < evolution.size(); i++ )
    {
        Gaussian_2D_Convolution(evolution[i-1].Lt,evolution[i].Lsmooth,0,0,sderivatives);

        // Compute the Gaussian derivatives Lx and Ly
        Image_Derivatives_Scharr(evolution[i].Lsmooth,evolution[i].Lx,1,0);
        Image_Derivatives_Scharr(evolution[i].Lsmooth,evolution[i].Ly,0,1);

        // Compute the conductivity equation
        if( diffusivity == 0 )
        {
            PM_G1(evolution[i].Lsmooth,evolution[i].Lflow,evolution[i].Lx,evolution[i].Ly,kcontrast);
        }
        else if( diffusivity == 1 )
        {
            PM_G2(evolution[i].Lsmooth,evolution[i].Lflow,evolution[i].Lx,evolution[i].Ly,kcontrast);
        }
        else if( diffusivity == 2 )
        {
            Weickert_Diffusivity(evolution[i].Lsmooth,evolution[i].Lflow,evolution[i].Lx,evolution[i].Ly,kcontrast);
        }

        // Perform the evolution step with AOS
#if HAVE_THREADING_SUPPORT
        AOS_Step_Scalar_Parallel(evolution[i].Lt,evolution[i-1].Lt,evolution[i].Lflow,evolution[i].etime-evolution[i-1].etime);
#else
        AOS_Step_Scalar(evolution[i].Lt,evolution[i-1].Lt,evolution[i].Lflow,evolution[i].etime-evolution[i-1].etime);
#endif

        if( verbosity == true )
        {
            std::cout << "--> Computed image evolution step " << i << " Evolution time: " << evolution[i].etime <<
                         " Sigma: " << evolution[i].esigma << std::endl;
        }
    }

    t2 = cv::getTickCount();
    tnlscale = 1000.0*(t2-start_t1) / cv::getTickFrequency();

    if( verbosity == true )
    {
        std::cout << "> Computed the nonlinear scale space. Execution time (ms): " << tnlscale << std::endl;
    }

    return 0;
}
The computation of the contrast factor k, the conductivity functions g, and the AOS solver used above were covered in the previous article, "Nonlinear Diffusion Filtering". Image derivatives/gradients are computed with the Scharr filter, which approximates rotation invariance better than the Sobel filter. The convolution and differentiation functions involved are:
//*************************************************************************************
//*************************************************************************************
/**
* @brief This function smoothes an image with a Gaussian kernel
* @param src Input image
* @param dst Output image
* @param ksize_x Kernel size in X-direction (horizontal)
* @param ksize_y Kernel size in Y-direction (vertical)
* @param sigma Kernel standard deviation
*/
void Gaussian_2D_Convolution(const cv::Mat &src, cv::Mat &dst, unsigned int ksize_x,
                             unsigned int ksize_y, float sigma)
{
    // Compute an appropriate kernel size according to the specified sigma
    if( sigma > ksize_x || sigma > ksize_y || ksize_x == 0 || ksize_y == 0 )
    {
        ksize_x = ceil(2.0*(1.0 + (sigma-0.8)/(0.3)));
        ksize_y = ksize_x;
    }

    // The kernel size must be an odd number
    if( (ksize_x % 2) == 0 )
    {
        ksize_x += 1;
    }
    if( (ksize_y % 2) == 0 )
    {
        ksize_y += 1;
    }

    // Perform the Gaussian smoothing with border replication
    cv::GaussianBlur(src,dst,cv::Size(ksize_x,ksize_y),sigma,sigma,cv::BORDER_REPLICATE);
}
//*************************************************************************************
//*************************************************************************************
/**
* @brief This function computes image derivatives with Scharr kernel
* @param src Input image
* @param dst Output image
* @param xorder Derivative order in X-direction (horizontal)
* @param yorder Derivative order in Y-direction (vertical)
* @note Scharr operator approximates better rotation invariance than
* other stencils such as Sobel. See Weickert and Scharr,
* A Scheme for Coherence-Enhancing Diffusion Filtering with Optimized Rotation Invariance,
* Journal of Visual Communication and Image Representation 2002
*/
void Image_Derivatives_Scharr(const cv::Mat &src, cv::Mat &dst, unsigned int xorder, unsigned int yorder)
{
    // Compute Scharr filter
    cv::Scharr(src,dst,CV_32F,xorder,yorder,1,0,cv::BORDER_DEFAULT);
}
This article covered the principles and implementation of the nonlinear scale space construction; the next part covers KAZE keypoint detection and description.
To be continued...
Ref:
[4] http://wenku.baidu.com/view/d9dffc34f111f18583d05a6f.html
[5] http://erie.nlm.nih.gov/~yoo/pubs/94-058.pdf