【Translation: OpenCV-Python Tutorials】Meanshift and Continuously Adaptive Meanshift

⚠️ This article was translated from the 4.1.0 documentation.

⚠️ Apart from the version, everything else is as usual. Original: Meanshift and Camshift

Goal

In this chapter,

  • We will learn about the meanshift and camshift algorithms, used to find and track objects in videos.

Meanshift

The intuition behind meanshift is simple. Suppose you have a set of points. (It can be a pixel distribution such as a histogram back-projection.) You are given a small window (perhaps a circle), and you have to move that window to the area of maximum point density (that is, where the window encloses the most points). This is illustrated in the image below:

meanshift_basics.jpg

The initial window is shown as the blue circle "C1". Its original center is marked by the blue square "C1_o". But if you find the centroid of the points inside that window, you get the point "C1_r" (marked by a small blue circle), which is the real centroid of the window. Clearly, the center and the centroid don't match. So move the window such that the circle of the new window matches the previous centroid. Again find the new centroid. Most probably, they still won't match. So move the window again, and continue the iterations until the center of the window and its centroid fall on the same point (or within some small desired error). What you finally get is a window with maximum pixel distribution, marked by the green circle "C2". As you can see in the image, it encloses the most points. The whole process is demonstrated on a static image below:

meanshift_face.gif

So we normally pass the histogram back-projected image and an initial target location. When the object moves, the movement is obviously reflected in the histogram back-projected image. As a result, the meanshift algorithm moves the window to the new location of maximum density.
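The window-shifting loop described above can be sketched in plain NumPy. This is a toy illustration of the idea, not OpenCV's implementation: the window repeatedly jumps to the centroid of the points it currently encloses, and the helper name `mean_shift_window` is made up for this example.

```python
import numpy as np

def mean_shift_window(points, center, radius, max_iter=10, eps=1e-3):
    """Move a circular window to a local density maximum of `points`."""
    center = np.asarray(center, dtype=float)
    for _ in range(max_iter):
        # points currently inside the circular window
        inside = points[np.linalg.norm(points - center, axis=1) <= radius]
        if len(inside) == 0:
            break
        centroid = inside.mean(axis=0)
        if np.linalg.norm(centroid - center) < eps:  # converged
            break
        center = centroid  # shift the window onto its own centroid
    return center

# a dense cluster around (5, 5) plus sparse uniform noise
rng = np.random.default_rng(0)
cluster = rng.normal([5, 5], 0.5, size=(200, 2))
noise = rng.uniform(0, 10, size=(50, 2))
pts = np.vstack([cluster, noise])

# starting off-center, the window converges near the dense cluster at (5, 5)
print(mean_shift_window(pts, center=(4, 4), radius=2.0))
```

In the OpenCV version below, the "points" are back-projection probabilities rather than discrete samples, so the centroid is a weighted one, but the iteration is the same.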

Meanshift in OpenCV

To use meanshift in OpenCV, first we need to set up the target and find its histogram, so that we can back-project the target onto each frame for the meanshift calculation. We also need to provide the initial location of the window. For the histogram, only the hue is considered here. Also, to avoid false values due to low light, low-light values are discarded using the cv.inRange() function.

import numpy as np
import cv2 as cv
cap = cv.VideoCapture('slow.flv')
# take first frame of the video
ret,frame = cap.read()
# setup initial location of window
r,h,c,w = 250,90,400,125  # simply hardcoded the values
track_window = (c,r,w,h)
# set up the ROI for tracking
roi = frame[r:r+h, c:c+w]
hsv_roi =  cv.cvtColor(roi, cv.COLOR_BGR2HSV)
mask = cv.inRange(hsv_roi, np.array((0., 60.,32.)), np.array((180.,255.,255.)))
roi_hist = cv.calcHist([hsv_roi],[0],mask,[180],[0,180])
cv.normalize(roi_hist,roi_hist,0,255,cv.NORM_MINMAX)
# Setup the termination criteria, either 10 iterations or move by at least 1 pt
term_crit = ( cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 1 )
while(1):
    ret ,frame = cap.read()
    if ret == True:
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        dst = cv.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        # apply meanshift to get the new location
        ret, track_window = cv.meanShift(dst, track_window, term_crit)
        # Draw it on image
        x,y,w,h = track_window
        img2 = cv.rectangle(frame, (x,y), (x+w,y+h), 255,2)
        cv.imshow('img2',img2)
        k = cv.waitKey(60) & 0xff
        if k == 27:
            break
        else:
            cv.imwrite(chr(k)+".jpg",img2)
    else:
        break
cv.destroyAllWindows()
cap.release()

Three frames from the video I used are shown below:

meanshift_result.jpg

Continuously Adaptive Meanshift (Camshift)

Did you closely watch the last result? There is a problem. Our window always has the same size, whether the car is far away or very close to the camera. That is not good. We need to adapt the window size to the size and rotation of the target. Once again, the solution came from "OpenCV Labs": it is called CAMShift (Continuously Adaptive Meanshift), published by Gary Bradski in his 1998 paper "Computer Vision Face Tracking for Use in a Perceptual User Interface".

It applies meanshift first. Once meanshift converges, it updates the size of the window as s = 2 × √(M00/256), where M00 is the zeroth moment of the back-projection inside the window. It also calculates the orientation of the best-fitting ellipse. Then it applies meanshift again with the newly scaled search window at the previous window location. The process continues until the required accuracy is met.
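The size update can be reproduced directly: the zeroth moment M00 is just the sum of the back-projection values inside the window, so the new side length follows from s = 2 × √(M00/256). A small self-contained sketch (the toy back-projection image is made up for illustration; CamShift performs this step internally):

```python
import numpy as np

# toy 8-bit back-projection: a bright 40x40 blob on a dark background
dst = np.zeros((200, 200), np.uint8)
dst[80:120, 80:120] = 255

x, y, w, h = 60, 60, 100, 100                 # current search window
m00 = dst[y:y+h, x:x+w].astype(float).sum()   # zeroth moment = sum of values
s = 2 * np.sqrt(m00 / 256)                    # CamShift's window-side update
print(round(s, 1))  # ~79.8, roughly twice the square root of the blob area
```

For a saturated (value 255) blob, M00/256 approximates the blob's pixel area, so the window side is set to about twice the square root of that area.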

camshift_face.gif


Camshift in OpenCV

It is almost the same as meanshift, but it returns a rotated rectangle (which is our result) and the box parameters (to be passed as the search window in the next iteration). See the code below:

import numpy as np
import cv2 as cv
cap = cv.VideoCapture('slow.flv')
# take first frame of the video
ret,frame = cap.read()
# setup initial location of window
r,h,c,w = 250,90,400,125  # simply hardcoded the values
track_window = (c,r,w,h)
# set up the ROI for tracking
roi = frame[r:r+h, c:c+w]
hsv_roi =  cv.cvtColor(roi, cv.COLOR_BGR2HSV)
mask = cv.inRange(hsv_roi, np.array((0., 60.,32.)), np.array((180.,255.,255.)))
roi_hist = cv.calcHist([hsv_roi],[0],mask,[180],[0,180])
cv.normalize(roi_hist,roi_hist,0,255,cv.NORM_MINMAX)
# Setup the termination criteria, either 10 iterations or move by at least 1 pt
term_crit = ( cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 1 )
while(1):
    ret, frame = cap.read()
    if ret == True:
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        dst = cv.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        # apply camshift to get the new location
        ret, track_window = cv.CamShift(dst, track_window, term_crit)
        # Draw it on image
        pts = cv.boxPoints(ret)
        pts = np.int0(pts)
        img2 = cv.polylines(frame,[pts],True, 255,2)
        cv.imshow('img2',img2)
        k = cv.waitKey(60) & 0xff
        if k == 27:
            break
        else:
            cv.imwrite(chr(k)+".jpg",img2)
    else:
        break
cv.destroyAllWindows()
cap.release()

Three frames of the result are shown below:

camshift_result.jpg


Additional Resources

  1. French Wikipedia page on Camshift. (The two animations are taken from here)
  2. Bradski, G.R., "Real time face and object tracking as a component of a perceptual user interface," Applications of Computer Vision, 1998. WACV '98. Proceedings., Fourth IEEE Workshop on , vol., no., pp.214,219, 19-21 Oct 1998

Exercises

  1. OpenCV comes with a Python sample on interactive demo of camshift. Use it, hack it, understand it.

Previous: 【Translation: OpenCV-Python Tutorials】Feature Matching + Homography to find Objects

Next: 【Translation: OpenCV-Python Tutorials】Image Pyramids
