SIFT

SIFT features

Scale Invariant Feature Transform (SIFT) is an approach for detecting and extracting local feature descriptors that are reasonably invariant to changes in illumination, image noise, rotation, scaling, and small changes in viewpoint.

Detection stages for SIFT features:

·         Scale-space extrema detection

·         Keypoint localization

·         Orientation assignment

·         Generation of keypoint descriptors
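
All four stages are implemented end to end by common libraries. As a minimal sketch (assuming opencv-python 4.4 or newer, where SIFT is part of the main module; the image path is only a placeholder), detection and description are a single call:

```python
import cv2

# Load a grayscale image (the path is a placeholder).
img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

# OpenCV runs all four stages internally: scale-space extrema detection,
# keypoint localization, orientation assignment, descriptor generation.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# Each keypoint carries a 2D position (pt), scale (size) and orientation (angle);
# each descriptor row is a 128-element vector.
print(len(keypoints), None if descriptors is None else descriptors.shape)
```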


Scale-space extrema detection

Gaussian (difference-of-Gaussian) scale space

(Note: my understanding here may be wrong and needs to be confirmed later; see the follow-up post "SIFT進階" for details.)

 


1.       Local extrema detection: the pixel marked ‘x’ is compared against its 26 neighbours in a 3×3×3 neighbourhood that spans the adjacent DoG images, i.e. 8 neighbours in the same DoG image plus 9 in each of the two adjacent images (from Lowe, 2004).


2.       If the pixel is a local maximum or minimum, it is selected as a candidate keypoint.
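
A rough sketch of steps 1 and 2 in NumPy is shown below; the blur levels and the coarse sampling stride are arbitrary illustration choices, not Lowe's full octave/scale layout:

```python
import cv2
import numpy as np

def dog_stack(gray, sigmas=(1.6, 2.3, 3.2, 4.5)):
    """Blur at increasing sigmas and subtract adjacent levels (difference of Gaussians)."""
    blurred = [cv2.GaussianBlur(gray, (0, 0), s) for s in sigmas]
    return np.stack([blurred[i + 1] - blurred[i] for i in range(len(blurred) - 1)])

def is_extremum(dog, s, y, x):
    """True if dog[s, y, x] is a strict max or min of its 3x3x3 (26-neighbour) cube."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    centre = dog[s, y, x]
    unique = (cube == centre).sum() == 1      # centre value occurs only once in the cube
    return unique and (centre == cube.max() or centre == cube.min())

gray = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32)
dog = dog_stack(gray)
candidates = [(s, y, x)
              for s in range(1, dog.shape[0] - 1)
              for y in range(1, dog.shape[1] - 1, 4)   # coarse stride to keep the sketch fast
              for x in range(1, dog.shape[2] - 1, 4)
              if is_extremum(dog, s, y, x)]
print(len(candidates), "candidate keypoints")
```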


For each candidate keypoint:

·         Interpolation of nearby data is used to accurately determine its position.

·         Keypoints with low contrast are removed

·         Responses along edges are eliminated

·         The keypoint is assigned an orientation
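
The low-contrast and edge-rejection steps can be made concrete with a short sketch (the sub-pixel interpolation step is skipped here; the contrast threshold 0.03 and curvature ratio r = 10 are the values reported in Lowe, 2004, and `dog` and `candidates` are assumed to come from the previous sketch, with intensities rescaled to [0, 1]):

```python
import numpy as np

def keep_candidate(dog, s, y, x, contrast_thresh=0.03, edge_ratio=10.0):
    """Reject low-contrast candidates and edge-like responses."""
    d = dog[s]
    # Low-contrast test on the DoG value (intensities assumed in [0, 1]).
    if abs(d[y, x]) < contrast_thresh:
        return False
    # 2x2 Hessian of the DoG image by finite differences.
    dxx = d[y, x + 1] - 2.0 * d[y, x] + d[y, x - 1]
    dyy = d[y + 1, x] - 2.0 * d[y, x] + d[y - 1, x]
    dxy = (d[y + 1, x + 1] - d[y + 1, x - 1]
           - d[y - 1, x + 1] + d[y - 1, x - 1]) / 4.0
    tr, det = dxx + dyy, dxx * dyy - dxy * dxy
    if det <= 0:                       # principal curvatures of opposite sign: not a blob
        return False
    # Edge test: ratio of principal curvatures via Tr^2 / Det < (r + 1)^2 / r.
    return tr * tr / det < (edge_ratio + 1.0) ** 2 / edge_ratio

# Usage, with `dog` and `candidates` from the previous sketch:
# keypoint_candidates = [c for c in candidates if keep_candidate(dog, *c)]
```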


To determine the keypoint orientation, a gradient orientation histogram is computed in the neighbourhood of the keypoint.


Peaks in the histogram correspond to dominant orientations. A separate keypoint is created for the direction corresponding to the histogram maximum, and for any other direction within 80% of the maximum value.


All the properties of the keypoint are measured relative to the keypoint orientation; this provides invariance to rotation.
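
A sketch of the orientation assignment is given below; it assumes a square patch already cut out of the Gaussian image at the keypoint's scale, and the Gaussian window width used here is a simplification of the σ = 1.5 × scale weighting:

```python
import numpy as np

def dominant_orientations(patch, num_bins=36, peak_ratio=0.8):
    """Dominant gradient orientations (in degrees) of a square patch around a keypoint."""
    dy, dx = np.gradient(patch.astype(np.float32))      # image gradients
    mag = np.hypot(dx, dy)                               # gradient magnitude
    ang = np.degrees(np.arctan2(dy, dx)) % 360.0         # gradient orientation

    # Gaussian weighting centred on the keypoint (sigma = a quarter of the patch size here;
    # Lowe uses sigma equal to 1.5 times the keypoint scale).
    h, w = patch.shape
    yy, xx = np.mgrid[0:h, 0:w]
    sigma = 0.25 * h
    weight = np.exp(-((yy - h / 2.0) ** 2 + (xx - w / 2.0) ** 2) / (2.0 * sigma ** 2))

    # 36-bin histogram of orientations, weighted by magnitude and the Gaussian window.
    hist, _ = np.histogram(ang, bins=num_bins, range=(0.0, 360.0), weights=mag * weight)

    # The highest peak and every other peak within 80% of it define keypoint orientations.
    centres = (np.arange(num_bins) + 0.5) * (360.0 / num_bins)
    return centres[hist >= peak_ratio * hist.max()]
```

Each returned angle yields a separate keypoint with the same location and scale.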


SIFT feature representation


Once a keypoint orientation has been selected, the feature descriptor is computed as a set of orientation histograms on 4×4 pixel neighbourhoods. The orientation histograms are relative to the keypoint orientation, and the orientation data comes from the Gaussian image closest in scale to the keypoint’s scale.


Just like before, the contribution of each pixel is weighted by its gradient magnitude and by a Gaussian with σ equal to 1.5 times the scale of the keypoint.


Histograms contain 8 bins each, and each descriptor contains an array of 4×4 = 16 histograms around the keypoint. This leads to a SIFT feature vector with 4×4×8 = 128 elements. This vector is normalized to enhance invariance to changes in illumination.
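
A simplified sketch of this layout is shown below: it takes a 16×16 patch already rotated to the keypoint orientation, builds one 8-bin histogram per 4×4-pixel cell, and normalises the resulting 128-element vector (the trilinear interpolation, Gaussian weighting, and 0.2 clipping of the full algorithm are omitted):

```python
import numpy as np

def sift_descriptor(patch16):
    """128-element descriptor from a 16x16 patch aligned to the keypoint orientation."""
    dy, dx = np.gradient(patch16.astype(np.float32))
    mag = np.hypot(dx, dy)
    ang = np.degrees(np.arctan2(dy, dx)) % 360.0

    desc = []
    for cy in range(4):                                  # 4x4 grid of cells ...
        for cx in range(4):
            cell = (slice(4 * cy, 4 * cy + 4), slice(4 * cx, 4 * cx + 4))
            hist, _ = np.histogram(ang[cell], bins=8, range=(0.0, 360.0),
                                   weights=mag[cell])    # ... each an 8-bin histogram
            desc.extend(hist)

    desc = np.asarray(desc, dtype=np.float32)            # 4 * 4 * 8 = 128 elements
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc             # normalise for illumination invariance
```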


SIFT feature matching

·         Find nearest neighbour in a database of SIFT features from training images.

·         For robustness, use the ratio of the distance to the nearest neighbour to the distance to the second nearest neighbour

·         Neighbour with minimum Euclidean distance -> expensive search

·         Use an approximate, fast method to find nearest neighbour with high probability.
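
A common way to apply the ratio test is OpenCV's matcher interface, as in the sketch below; the 0.75 threshold is a typical choice (Lowe reports 0.8), and des1/des2 are assumed to be descriptor arrays returned by detectAndCompute:

```python
import cv2

def ratio_test_matches(des1, des2, ratio=0.75):
    """Keep matches whose nearest neighbour clearly beats the second nearest."""
    # Brute-force matcher with Euclidean (L2) distance; for large databases an
    # approximate matcher such as cv2.FlannBasedMatcher is used instead.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return good
```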


Recognition using SIFT features

·         Compute SIFT features on the input image

·         Match these features to the SIFT feature database

·         Each keypoint specifies 4 parameters: 2D location, scale, and orientation.

·         To increase recognition robustness: Hough transform to identify clusters of matches that vote for the same object pose.

·         Each keypoint votes for the set of object poses that are consistent with the keypoint’s location, scale, and orientation.

·         Locations in the Hough accumulator that accumulate at least 3 votes are selected as candidate object/pose matches.

·         A verification step matches the training image for the hypothesized object/pose to the image using a least-squares fit to the hypothesized location, scale, and orientation of the object.
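
A coarse sketch of the pose clustering and verification is given below. The bin widths and the use of cv2.estimateAffine2D (a RANSAC-based affine estimator standing in for the plain least-squares solve) are illustrative choices, not Lowe's exact procedure:

```python
from collections import defaultdict

import cv2
import numpy as np

def hough_pose_clusters(matches, kp_model, kp_scene, loc_bin=32.0):
    """Group matches that vote for a similar object pose (coarse Hough accumulator)."""
    accumulator = defaultdict(list)
    for m in matches:
        km, ks = kp_model[m.queryIdx], kp_scene[m.trainIdx]
        d_theta = int(((ks.angle - km.angle) % 360.0) // 30.0)   # orientation bin (30 degrees)
        d_scale = int(round(np.log2(ks.size / km.size)))          # scale bin (factor of 2)
        tx = int(round((ks.pt[0] - km.pt[0]) / loc_bin))          # translation bins
        ty = int(round((ks.pt[1] - km.pt[1]) / loc_bin))
        accumulator[(d_theta, d_scale, tx, ty)].append(m)
    # Candidate object/pose hypotheses: bins with at least 3 consistent votes.
    return [v for v in accumulator.values() if len(v) >= 3]

def verify_cluster(cluster, kp_model, kp_scene):
    """Fit an affine transform between model and scene keypoint locations."""
    src = np.float32([kp_model[m.queryIdx].pt for m in cluster])
    dst = np.float32([kp_scene[m.trainIdx].pt for m in cluster])
    affine, inlier_mask = cv2.estimateAffine2D(src, dst)
    return affine, inlier_mask
```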

