Image Segmentation (4): Using OpenCV's GrabCut Function and Reading Its Source Code

http://blog.csdn.net/zouxy09
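The previous post gave an overview of GrabCut. The GrabCut algorithm in OpenCV is implemented from the paper "GrabCut" - Interactive Foreground Extraction using Iterated Graph Cuts. I have now annotated the source code so that we can understand the algorithm in more depth. I have always felt that papers and code differ considerably: in my experience, reading a paper without its code gets you to about 70% understanding at most, another 20% or more has to come from reading the code, and the remaining 10% depends on each reader's background and prior knowledge.
I have only worked with this for a short time, so if there are mistakes, I would be grateful for corrections. Some notes on the original paper are in the previous post: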
http://blog.csdn.net/zouxy09/article/details/8534954
1. Using the GrabCut function
The samples folder of the OpenCV source tree contains a usage example for grabCut; see:
opencv\samples\cpp\grabcut.cpp
The API of the grabCut function is as follows:
void cv::grabCut( InputArray _img, InputOutputArray _mask, Rect rect,
InputOutputArray _bgdModel, InputOutputArray _fgdModel,
int iterCount, int mode )
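Parameters:
img: the source image to segment. It must be an 8-bit, 3-channel (CV_8UC3) image and is not modified during processing;
mask: the mask image. When initializing with a mask, mask holds the initial labels; the foreground and background set through user interaction can also be stored in mask before it is passed to grabCut; after processing, mask holds the result. Each element of mask can take only one of the following four values:
GC_BGD (=0), background;
GC_FGD (=1), foreground;
GC_PR_BGD (=2), probably background;
GC_PR_FGD (=3), probably foreground.
If no pixel is hand-marked as GC_BGD or GC_FGD, the labels computed by the segmentation are only GC_PR_BGD or GC_PR_FGD;
rect: the rectangle that bounds the region to segment. It is used when mode is GC_INIT_WITH_RECT: everything outside it is marked as background (GC_BGD), and only pixels inside it can end up as foreground;
bgdModel: the background model. If it is empty, the function creates it internally; otherwise it must be a single-channel double-precision (CV_64FC1) matrix with exactly 1 row and 13x5 = 65 columns;
fgdModel: the foreground model. If it is empty, the function creates it internally; otherwise it must be a single-channel double-precision (CV_64FC1) matrix with exactly 1 row and 13x5 = 65 columns;
iterCount: the number of iterations. It should be greater than 0; if it is not, the function returns right after initialization without segmenting;
mode: selects what grabCut does. The possible values are:
GC_INIT_WITH_RECT (=0), initialize GrabCut from the rectangle;
GC_INIT_WITH_MASK (=1), initialize GrabCut from the mask image;
GC_EVAL (=2), run the segmentation from the existing mask and models.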
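2. Reading the GrabCut source code
The source includes gcgraph.hpp, the file that implements graph construction and the max-flow/min-cut algorithm. I have not annotated that file yet and will update this post later.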

/*M/////////////////////////////////////////////////////////////////////////////////////// 

// 

//  IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING. 

// 

//  By downloading, copying, installing or using the software you agree to this license. 

//  If you do not agree to this license, do not download, install, 

//  copy or use the software. 

// 

// 

//                        Intel License Agreement 

//                For Open Source Computer Vision Library 

// 

// Copyright (C) 2000, Intel Corporation, all rights reserved. 

// Third party copyrights are property of their respective owners. 

// 

// Redistribution and use in source and binary forms, with or without modification, 

// are permitted provided that the following conditions are met: 

// 

//   * Redistribution's of source code must retain the above copyright notice, 

//     this list of conditions and the following disclaimer. 

// 

//   * Redistribution's in binary form must reproduce the above copyright notice, 

//     this list of conditions and the following disclaimer in the documentation 

//     and/or other materials provided with the distribution. 

// 

//   * The name of Intel Corporation may not be used to endorse or promote products 

//     derived from this software without specific prior written permission. 

// 

// This software is provided by the copyright holders and contributors "as is" and 

// any express or implied warranties, including, but not limited to, the implied 

// warranties of merchantability and fitness for a particular purpose are disclaimed. 

// In no event shall the Intel Corporation or contributors be liable for any direct, 

// indirect, incidental, special, exemplary, or consequential damages 

// (including, but not limited to, procurement of substitute goods or services; 

// loss of use, data, or profits; or business interruption) however caused 

// and on any theory of liability, whether in contract, strict liability, 

// or tort (including negligence or otherwise) arising in any way out of 

// the use of this software, even if advised of the possibility of such damage. 

// 

//M*/  

#include "precomp.hpp"  

#include "gcgraph.hpp"  

#include <limits>  

using namespace cv;  

/* 

This is implementation of image segmentation algorithm GrabCut described in 

"GrabCut — Interactive Foreground Extraction using Iterated Graph Cuts". 

Carsten Rother, Vladimir Kolmogorov, Andrew Blake. 

 */  

/* 

 GMM - Gaussian Mixture Model 

*/  

class GMM  

{  

public:  

    static const int componentsCount = 5;  

    GMM( Mat& _model );  

    double operator()( const Vec3d color ) const;  

    double operator()( int ci, const Vec3d color ) const;  

    int whichComponent( const Vec3d color ) const;  

    void initLearning();  

    void addSample( int ci, const Vec3d color );  

    void endLearning();  

private:  

    void calcInverseCovAndDeterm( int ci );  

    Mat model;  

    double* coefs;  

    double* mean;  

    double* cov;  

    double inverseCovs[componentsCount][3][3]; // inverse of each covariance matrix

    double covDeterms[componentsCount];  // determinant of each covariance matrix

    double sums[componentsCount][3];  

    double prods[componentsCount][3][3];  

    int sampleCounts[componentsCount];  

    int totalSampleCount;  

};  

// The background and the foreground each have their own GMM (Gaussian mixture model)

GMM::GMM( Mat& _model )  

{  

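    // Number of parameters of a single Gaussian component (each pixel is assigned to exactly one component).
    // A pixel has three channel values, so there are 3 means, 3*3 covariance entries, and 1 weight.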

    const int modelSize = 3/*mean*/ + 9/*covariance*/ + 1/*component weight*/;  

    if( _model.empty() )  

    {  

        // One GMM has componentsCount Gaussian components, each with modelSize parameters

        _model.create( 1, modelSize*componentsCount, CV_64FC1 );  

        _model.setTo(Scalar(0));  

    }  

    else if( (_model.type() != CV_64FC1) || (_model.rows != 1) || (_model.cols != modelSize*componentsCount) )  

        CV_Error( CV_StsBadArg, "_model must have CV_64FC1 type, rows == 1 and cols == 13*componentsCount" );  

    model = _model;  

    // Note the storage layout of the parameters: first the componentsCount weights (coefs),
    // then the 3*componentsCount means, then the 3*3*componentsCount covariance entries.

    coefs = model.ptr<double>(0);  // start of the component weights

    mean = coefs + componentsCount; // start of the means

    cov = mean + 3*componentsCount;  // start of the covariances

    for( int ci = 0; ci < componentsCount; ci++ )  

        if( coefs[ci] > 0 )  

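             // Compute the inverse and the determinant of the covariance of component ci.
             // They are needed later to evaluate the probability that a pixel belongs to
             // this component (i.e. the data term of the energy).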

             calcInverseCovAndDeterm( ci );   

}  

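// Compute the probability that a pixel (a 3-vector of doubles, color = (B,G,R)) belongs to
// this GMM: the probabilities under each of the componentsCount Gaussian components are
// multiplied by the corresponding weights and summed. See Eq. (10) of the paper.
// The result is returned in res. Its negative logarithm gives the first (data) term of the Gibbs energy.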

double GMM::operator()( const Vec3d color ) const  

{  

    double res = 0;  

    for( int ci = 0; ci < componentsCount; ci++ )  

        res += coefs[ci] * (*this)(ci, color );  

    return res;  

}  

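// Compute the probability that a pixel (a 3-vector of doubles, color = (B,G,R)) belongs to
// the ci-th Gaussian component, using the multivariate Gaussian density (the constant factor
// (2*pi)^(-3/2) is omitted); see Eq. (10) of the paper. The result is returned in res.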

double GMM::operator()( int ci, const Vec3d color ) const  

{  

    double res = 0;  

    if( coefs[ci] > 0 )  

    {  

        CV_Assert( covDeterms[ci] > std::numeric_limits<double>::epsilon() );  

        Vec3d diff = color;  

        double* m = mean + 3*ci;  

        diff[0] -= m[0]; diff[1] -= m[1]; diff[2] -= m[2];  

        double mult = diff[0]*(diff[0]*inverseCovs[ci][0][0] + diff[1]*inverseCovs[ci][1][0] + diff[2]*inverseCovs[ci][2][0])  

                   + diff[1]*(diff[0]*inverseCovs[ci][0][1] + diff[1]*inverseCovs[ci][1][1] + diff[2]*inverseCovs[ci][2][1])  

                   + diff[2]*(diff[0]*inverseCovs[ci][0][2] + diff[1]*inverseCovs[ci][1][2] + diff[2]*inverseCovs[ci][2][2]);  

        res = 1.0f/sqrt(covDeterms[ci]) * exp(-0.5f*mult);  

    }  

    return res;  

}  

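// Return the index of the Gaussian component of this GMM that the pixel most likely belongs to (the one with the highest probability)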

int GMM::whichComponent( const Vec3d color ) const  

{  

    int k = 0;  

    double max = 0;  

    for( int ci = 0; ci < componentsCount; ci++ )  

    {  

        double p = (*this)( ci, color );  

        if( p > max )  

        {  

            k = ci;  // keep the component with the largest value so far

            max = p;  

        }  

    }  

    return k;  

}  

// Initialization before learning the GMM parameters: zero the accumulators

void GMM::initLearning()  

{  

    for( int ci = 0; ci < componentsCount; ci++)  

    {  

        sums[ci][0] = sums[ci][1] = sums[ci][2] = 0;  

        prods[ci][0][0] = prods[ci][0][1] = prods[ci][0][2] = 0;  

        prods[ci][1][0] = prods[ci][1][1] = prods[ci][1][2] = 0;  

        prods[ci][2][0] = prods[ci][2][1] = prods[ci][2][2] = 0;  

        sampleCounts[ci] = 0;  

    }  

    totalSampleCount = 0;  

}  

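// Add a sample: add the pixel color to the sample set of the ci-th Gaussian component of the
// foreground or background GMM (this set is used to estimate the component's parameters).
// Update the per-channel sums (sums, used for the mean) and the sums of products (prods, used
// for the covariance), and count the samples of this component and of the whole GMM (used for
// the component weight).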

void GMM::addSample( int ci, const Vec3d color )  

{  

    sums[ci][0] += color[0]; sums[ci][1] += color[1]; sums[ci][2] += color[2];  

    prods[ci][0][0] += color[0]*color[0]; prods[ci][0][1] += color[0]*color[1]; prods[ci][0][2] += color[0]*color[2];  

    prods[ci][1][0] += color[1]*color[0]; prods[ci][1][1] += color[1]*color[1]; prods[ci][1][2] += color[1]*color[2];  

    prods[ci][2][0] += color[2]*color[0]; prods[ci][2][1] += color[2]*color[1]; prods[ci][2][2] += color[2]*color[2];  

    sampleCounts[ci]++;  

    totalSampleCount++;  

}  

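// Learn the GMM parameters from the image data: the weight, mean and covariance matrix of each
// Gaussian component. This corresponds to step 2 of "Iterative minimisation" in the paper.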

void GMM::endLearning()  

{  

    const double variance = 0.01;  

    for( int ci = 0; ci < componentsCount; ci++ )  

    {  

        int n = sampleCounts[ci]; // number of sample pixels of the ci-th Gaussian component

        if( n == 0 )  

            coefs[ci] = 0;  

        else  

        {  

            // Weight of the ci-th Gaussian component

            coefs[ci] = (double)n/totalSampleCount;   

            // Mean of the ci-th Gaussian component

            double* m = mean + 3*ci;  

            m[0] = sums[ci][0]/n; m[1] = sums[ci][1]/n; m[2] = sums[ci][2]/n;  

            // Covariance of the ci-th Gaussian component

            double* c = cov + 9*ci;  

            c[0] = prods[ci][0][0]/n - m[0]*m[0]; c[1] = prods[ci][0][1]/n - m[0]*m[1]; c[2] = prods[ci][0][2]/n - m[0]*m[2];  

            c[3] = prods[ci][1][0]/n - m[1]*m[0]; c[4] = prods[ci][1][1]/n - m[1]*m[1]; c[5] = prods[ci][1][2]/n - m[1]*m[2];  

            c[6] = prods[ci][2][0]/n - m[2]*m[0]; c[7] = prods[ci][2][1]/n - m[2]*m[1]; c[8] = prods[ci][2][2]/n - m[2]*m[2];  

            // Determinant of the covariance of the ci-th Gaussian component

            double dtrm = c[0]*(c[4]*c[8]-c[5]*c[7]) - c[1]*(c[3]*c[8]-c[5]*c[6]) + c[2]*(c[3]*c[7]-c[4]*c[6]);  

            if( dtrm <= std::numeric_limits<double>::epsilon() )  

            {  

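                // If the determinant is not larger than epsilon (numerically zero or negative),
                // add white noise to the diagonal so that the covariance matrix does not become
                // degenerate (rank-deficient, no inverse); the inverse is needed later.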

                // Adds the white noise to avoid singular covariance matrix.  

                c[0] += variance;  

                c[4] += variance;  

                c[8] += variance;  

            }  

            // Compute the inverse and the determinant of the covariance of the ci-th Gaussian component

            calcInverseCovAndDeterm(ci);  

        }  

    }  

}  

// Compute the inverse and the determinant of the covariance matrix

void GMM::calcInverseCovAndDeterm( int ci )  

{  

    if( coefs[ci] > 0 )  

    {  

        // Pointer to the start of the covariance of the ci-th Gaussian component

        double *c = cov + 9*ci;  

        double dtrm =  

              covDeterms[ci] = c[0]*(c[4]*c[8]-c[5]*c[7]) - c[1]*(c[3]*c[8]-c[5]*c[6])   

                                + c[2]*(c[3]*c[7]-c[4]*c[6]);  

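        // In C++ every built-in numeric type has its own numerical properties, which the <limits>
        // header exposes. Floating-point results are rounded, so comparisons that hold
        // mathematically can fail numerically (for example 0.1 + 0.2 == 0.3 is false for double).
        // The constant epsilon is the difference between 1 and the smallest representable value
        // greater than 1 for the given type. If dtrm is not larger than epsilon, it is treated
        // as zero. The assertion below therefore guarantees dtrm > 0, as expected for a symmetric
        // positive-definite covariance matrix, so the divisions that follow are safe.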

        CV_Assert( dtrm > std::numeric_limits<double>::epsilon() );  

        // Inverse of the 3x3 matrix

        inverseCovs[ci][0][0] =  (c[4]*c[8] - c[5]*c[7]) / dtrm;  

        inverseCovs[ci][1][0] = -(c[3]*c[8] - c[5]*c[6]) / dtrm;  

        inverseCovs[ci][2][0] =  (c[3]*c[7] - c[4]*c[6]) / dtrm;  

        inverseCovs[ci][0][1] = -(c[1]*c[8] - c[2]*c[7]) / dtrm;  

        inverseCovs[ci][1][1] =  (c[0]*c[8] - c[2]*c[6]) / dtrm;  

        inverseCovs[ci][2][1] = -(c[0]*c[7] - c[1]*c[6]) / dtrm;  

        inverseCovs[ci][0][2] =  (c[1]*c[5] - c[2]*c[4]) / dtrm;  

        inverseCovs[ci][1][2] = -(c[0]*c[5] - c[2]*c[3]) / dtrm;  

        inverseCovs[ci][2][2] =  (c[0]*c[4] - c[1]*c[3]) / dtrm;  

    }  

}  

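// Compute beta, the coefficient in the exponent of the second (smoothness) term of the Gibbs
// energy. It adapts the influence of the difference between two neighboring pixels to the
// contrast of the image: in a low-contrast image the differences between neighbors tend to be
// small, so they are multiplied by a larger beta to amplify them; in a high-contrast image the
// already large differences are scaled down. beta is therefore determined from the contrast of
// the whole image; see Eq. (5) of the paper.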

/* 

  Calculate beta - parameter of GrabCut algorithm. 

  beta = 1/(2*avg(sqr(||color[i] - color[j]||))) 

*/  

static double calcBeta( const Mat& img )  

{  

    double beta = 0;  

    for( int y = 0; y < img.rows; y++ )  

    {  

        for( int x = 0; x < img.cols; x++ )  

        {  

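            // Accumulate the squared Euclidean distance (squared L2 norm) to the neighbors in
            // four directions (once every pixel has been visited, all 8-neighborhood pairs are covered)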

            Vec3d color = img.at<Vec3b>(y,x);  

            if( x>0 ) // left; the x>0 check avoids reading outside the image at the border

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y,x-1);  

                beta += diff.dot(diff);  // dot product of diff with itself, i.e. the sum of squared components

            }  

            if( y>0 && x>0 ) // upleft  

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x-1);  

                beta += diff.dot(diff);  

            }  

            if( y>0 ) // up  

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x);  

                beta += diff.dot(diff);  

            }  

            if( y>0 && x<img.cols-1) // upright  

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x+1);  

                beta += diff.dot(diff);  

            }  

        }  

    }  

    if( beta <= std::numeric_limits<double>::epsilon() )  

        beta = 0;  

    else  

        beta = 1.f / (2 * beta/(4*img.cols*img.rows - 3*img.cols - 3*img.rows + 2) ); // Eq. (5) of the paper

    return beta;  

}  

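// Compute the weights of the edges between each non-terminal vertex (every pixel is a vertex of
// the graph; the source s and the sink t are excluded) and its neighbors. The graph is undirected
// and uses the 8-neighborhood, so for each vertex only four directions are computed; the other
// four are covered when the neighboring vertices are processed. After the whole image has been
// visited, the weights of all 8-neighborhood edges are known.
// This corresponds to the second (smoothness) term of the Gibbs energy; see Eq. (4) of the paper.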

/* 

  Calculate weights of noterminal vertices of graph. 

  beta and gamma - parameters of GrabCut algorithm. 

 */  

static void calcNWeights( const Mat& img, Mat& leftW, Mat& upleftW, Mat& upW,   

                            Mat& uprightW, double beta, double gamma )  

{  

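    // gammaDivSqrt2 corresponds to gamma * dis(i,j)^(-1) in Eq. (4) for diagonal neighbors:
    // dis(i,j) = 1 when i and j are horizontal or vertical neighbors, and dis(i,j) = sqrt(2.0f)
    // when they are diagonal neighbors. See how it is used below.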

    const double gammaDivSqrt2 = gamma / std::sqrt(2.0f);  

    // The edge weights of each direction are stored in a Mat of the same size as the image

    leftW.create( img.rows, img.cols, CV_64FC1 );  

    upleftW.create( img.rows, img.cols, CV_64FC1 );  

    upW.create( img.rows, img.cols, CV_64FC1 );  

    uprightW.create( img.rows, img.cols, CV_64FC1 );  

    for( int y = 0; y < img.rows; y++ )  

    {  

        for( int x = 0; x < img.cols; x++ )  

        {  

            Vec3d color = img.at<Vec3b>(y,x);  

            if( x-1>=0 ) // left; skip at the image border

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y,x-1);  

                leftW.at<double>(y,x) = gamma * exp(-beta*diff.dot(diff));  

            }  

            else  

                leftW.at<double>(y,x) = 0;  

            if( x-1>=0 && y-1>=0 ) // upleft  

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x-1);  

                upleftW.at<double>(y,x) = gammaDivSqrt2 * exp(-beta*diff.dot(diff));  

            }  

            else  

                upleftW.at<double>(y,x) = 0;  

            if( y-1>=0 ) // up  

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x);  

                upW.at<double>(y,x) = gamma * exp(-beta*diff.dot(diff));  

            }  

            else  

                upW.at<double>(y,x) = 0;  

            if( x+1<img.cols && y-1>=0 ) // upright  

            {  

                Vec3d diff = color - (Vec3d)img.at<Vec3b>(y-1,x+1);  

                uprightW.at<double>(y,x) = gammaDivSqrt2 * exp(-beta*diff.dot(diff));  

            }  

            else  

                uprightW.at<double>(y,x) = 0;  

        }  

    }  

}  

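// Check the validity of mask. The mask is set through user interaction or by the program; it is a
// single-channel 8-bit image of the same size as the input image, and each element may only take
// one of the four values GC_BGD, GC_FGD, GC_PR_BGD or GC_PR_FGD, meaning that the pixel (as
// specified by the user or the program) is background, foreground, probably background or
// probably foreground. For details see:
// ICCV 2001, "Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images",
// Yuri Y. Boykov, Marie-Pierre Jolly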

/* 

  Check size, type and element values of mask matrix. 

 */  

static void checkMask( const Mat& img, const Mat& mask )  

{  

    if( mask.empty() )  

        CV_Error( CV_StsBadArg, "mask is empty" );  

    if( mask.type() != CV_8UC1 )  

        CV_Error( CV_StsBadArg, "mask must have CV_8UC1 type" );  

    if( mask.cols != img.cols || mask.rows != img.rows )  

        CV_Error( CV_StsBadArg, "mask must have as many rows and cols as img" );  

    for( int y = 0; y < mask.rows; y++ )  

    {  

        for( int x = 0; x < mask.cols; x++ )  

        {  

            uchar val = mask.at<uchar>(y,x);  

            if( val!=GC_BGD && val!=GC_FGD && val!=GC_PR_BGD && val!=GC_PR_FGD )  

                CV_Error( CV_StsBadArg, "mask element value must be equel"  

                    "GC_BGD or GC_FGD or GC_PR_BGD or GC_PR_FGD" );  

        }  

    }  

}  

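// Create the mask from the rectangle rect selected by the user: everything outside rect is
// background (GC_BGD), everything inside rect is set to GC_PR_FGD (probably foreground)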

/* 

  Initialize mask using rectangular. 

*/  

static void initMaskWithRect( Mat& mask, Size imgSize, Rect rect )  

{  

    mask.create( imgSize, CV_8UC1 );  

    mask.setTo( GC_BGD );  

    rect.x = max(0, rect.x);  

    rect.y = max(0, rect.y);  

    rect.width = min(rect.width, imgSize.width-rect.x);  

    rect.height = min(rect.height, imgSize.height-rect.y);  

    (mask(rect)).setTo( Scalar(GC_PR_FGD) );  

}  

// Initialize the background GMM and the foreground GMM with the k-means algorithm

/* 

  Initialize GMM background and foreground models using kmeans algorithm. 

*/  

static void initGMMs( const Mat& img, const Mat& mask, GMM& bgdGMM, GMM& fgdGMM )  

{  

    const int kMeansItCount = 10;  // number of k-means iterations

    const int kMeansType = KMEANS_PP_CENTERS; //Use kmeans++ center initialization by Arthur and Vassilvitskii  

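    Mat bgdLabels, fgdLabels; // for each background/foreground sample, the index of the Gaussian component it is assigned to (k_n in the paper)

    vector<Vec3f> bgdSamples, fgdSamples; // background and foreground pixel sample sets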

    Point p;  

    for( p.y = 0; p.y < img.rows; p.y++ )  

    {  

        for( p.x = 0; p.x < img.cols; p.x++ )  

        {  

            // Pixels marked GC_BGD or GC_PR_BGD in mask are used as background samples

            if( mask.at<uchar>(p) == GC_BGD || mask.at<uchar>(p) == GC_PR_BGD )  

                bgdSamples.push_back( (Vec3f)img.at<Vec3b>(p) );  

            else // GC_FGD | GC_PR_FGD  

                fgdSamples.push_back( (Vec3f)img.at<Vec3b>(p) );  

        }  

    }  

    CV_Assert( !bgdSamples.empty() && !fgdSamples.empty() );  

    // kmeans expects one sample per row in _bgdSamples.
    // Its output bgdLabels holds the cluster label of every input sample (after clustering into componentsCount classes)

    Mat _bgdSamples( (int)bgdSamples.size(), 3, CV_32FC1, &bgdSamples[0][0] );  

    kmeans( _bgdSamples, GMM::componentsCount, bgdLabels,  

            TermCriteria( CV_TERMCRIT_ITER, kMeansItCount, 0.0), 0, kMeansType );  

    Mat _fgdSamples( (int)fgdSamples.size(), 3, CV_32FC1, &fgdSamples[0][0] );  

    kmeans( _fgdSamples, GMM::componentsCount, fgdLabels,  

            TermCriteria( CV_TERMCRIT_ITER, kMeansItCount, 0.0), 0, kMeansType );  

    // After the steps above, the component of every pixel is known, so the parameters of each Gaussian component of the GMMs can be estimated.

    bgdGMM.initLearning();  

    for( int i = 0; i < (int)bgdSamples.size(); i++ )  

        bgdGMM.addSample( bgdLabels.at<int>(i,0), bgdSamples[i] );  

    bgdGMM.endLearning();  

    fgdGMM.initLearning();  

    for( int i = 0; i < (int)fgdSamples.size(); i++ )  

        fgdGMM.addSample( fgdLabels.at<int>(i,0), fgdSamples[i] );  

    fgdGMM.endLearning();  

}  

// Iterative minimisation in the paper, step 1: assign each pixel to a Gaussian component of its GMM; k_n is stored in Mat compIdxs

/* 

  Assign GMMs components for each pixel. 

*/  

static void assignGMMsComponents( const Mat& img, const Mat& mask, const GMM& bgdGMM,   

                                    const GMM& fgdGMM, Mat& compIdxs )  

{  

    Point p;  

    for( p.y = 0; p.y < img.rows; p.y++ )  

    {  

        for( p.x = 0; p.x < img.cols; p.x++ )  

        {  

            Vec3d color = img.at<Vec3b>(p);  

            // Use mask to decide whether the pixel is background or foreground, then find the Gaussian component of the corresponding GMM it belongs to

            compIdxs.at<int>(p) = mask.at<uchar>(p) == GC_BGD || mask.at<uchar>(p) == GC_PR_BGD ?  

                bgdGMM.whichComponent(color) : fgdGMM.whichComponent(color);  

        }  

    }  

}  

// Iterative minimisation in the paper, step 2: learn the parameters of each Gaussian component from its pixel sample set

/* 

  Learn GMMs parameters. 

*/  

static void learnGMMs( const Mat& img, const Mat& mask, const Mat& compIdxs, GMM& bgdGMM, GMM& fgdGMM )  

{  

    bgdGMM.initLearning();  

    fgdGMM.initLearning();  

    Point p;  

    for( int ci = 0; ci < GMM::componentsCount; ci++ )  

    {  

        for( p.y = 0; p.y < img.rows; p.y++ )  

        {  

            for( p.x = 0; p.x < img.cols; p.x++ )  

            {  

                if( compIdxs.at<int>(p) == ci )  

                {  

                    if( mask.at<uchar>(p) == GC_BGD || mask.at<uchar>(p) == GC_PR_BGD )  

                        bgdGMM.addSample( ci, img.at<Vec3b>(p) );  

                    else  

                        fgdGMM.addSample( ci, img.at<Vec3b>(p) );  

                }  

            }  

        }  

    }  

    bgdGMM.endLearning();  

    fgdGMM.endLearning();  

}  

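// Build the graph from the computed energy terms. The vertices of the graph are the pixels,
// and there are two kinds of edges.
// The first kind connects each vertex to the sink t (background) and to the source s
// (foreground); their weights come from the first term of the Gibbs energy.
// The second kind connects each vertex to its neighboring vertices; their weights come from
// the second term of the Gibbs energy.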

/* 

  Construct GCGraph 

*/  

static void constructGCGraph( const Mat& img, const Mat& mask, const GMM& bgdGMM, const GMM& fgdGMM, double lambda,  

                       const Mat& leftW, const Mat& upleftW, const Mat& upW, const Mat& uprightW,  

                       GCGraph<double>& graph )  

{  

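    int vtxCount = img.cols*img.rows;  // number of vertices: one per pixel

    int edgeCount = 2*(4*vtxCount - 3*(img.cols + img.rows) + 2);  // number of edges; edges missing at the image border are accounted for

    // Create the graph from the vertex and edge counts. For the type declarations and function definitions see gcgraph.hpp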

    graph.create(vtxCount, edgeCount);  

    Point p;  

    for( p.y = 0; p.y < img.rows; p.y++ )  

    {  

        for( p.x = 0; p.x < img.cols; p.x++)  

        {  

            // add node  

            int vtxIdx = graph.addVtx();  // returns the index of this vertex in the graph

            Vec3b color = img.at<Vec3b>(p);  

            // set t-weights              

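            // Compute the weights of the edges connecting each vertex to the sink t (background)
            // and to the source s (foreground), i.e. the first term of the Gibbs energy
            // (for the pixel being a background pixel or a foreground pixel)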

            double fromSource, toSink;  

            if( mask.at<uchar>(p) == GC_PR_BGD || mask.at<uchar>(p) == GC_PR_FGD )  

            {  

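                // The data term of the pixel under the background model becomes the weight of its link from the source s; the data term under the foreground model becomes the weight of its link to the sink t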

                fromSource = -log( bgdGMM(color) );  

                toSink = -log( fgdGMM(color) );  

            }  

            else if( mask.at<uchar>(p) == GC_BGD )  

            {  

                // A pixel known to be background gets weight 0 on its link from the source (foreground) and weight lambda on its link to the sink

                fromSource = 0;  

                toSink = lambda;  

            }  

            else // GC_FGD  

            {  

                fromSource = lambda;  

                toSink = 0;  

            }  

            // Set the weights of the links between vertex vtxIdx and the source and the sink

            graph.addTermWeights( vtxIdx, fromSource, toSink );  

            // set n-weights  n-links  

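            // Add the edges between neighboring vertices, using the weights precomputed by
            // calcNWeights, i.e. the second (smoothness) term of the Gibbs energy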

            if( p.x>0 )  

            {  

                double w = leftW.at<double>(p);  

                graph.addEdges( vtxIdx, vtxIdx-1, w, w );  

            }  

            if( p.x>0 && p.y>0 )  

            {  

                double w = upleftW.at<double>(p);  

                graph.addEdges( vtxIdx, vtxIdx-img.cols-1, w, w );  

            }  

            if( p.y>0 )  

            {  

                double w = upW.at<double>(p);  

                graph.addEdges( vtxIdx, vtxIdx-img.cols, w, w );  

            }  

            if( p.x<img.cols-1 && p.y>0 )  

            {  

                double w = uprightW.at<double>(p);  

                graph.addEdges( vtxIdx, vtxIdx-img.cols+1, w, w );  

            }  

        }  

    }  

}  

// Iterative minimisation in the paper, step 3: estimate the segmentation with the min-cut / max-flow algorithm

/* 

  Estimate segmentation using MaxFlow algorithm 

*/  

static void estimateSegmentation( GCGraph<double>& graph, Mat& mask )  

{  

    // Find the minimum cut of the graph with the max-flow algorithm, which completes the segmentation

    graph.maxFlow();  

    Point p;  

    for( p.y = 0; p.y < mask.rows; p.y++ )  

    {  

        for( p.x = 0; p.x < mask.cols; p.x++ )  

        {  

            // Update mask, the final segmentation result, from the cut. Note that pixels
            // the user marked as definite background or foreground are never changed

            if( mask.at<uchar>(p) == GC_PR_BGD || mask.at<uchar>(p) == GC_PR_FGD )  

            {  

                if( graph.inSourceSegment( p.y*mask.cols+p.x /*vertex index*/ ) )  

                    mask.at<uchar>(p) = GC_PR_FGD;  

                else  

                    mask.at<uchar>(p) = GC_PR_BGD;  

            }  

        }  

    }  

}  

// The end result: the public API offered to users, grabCut

/* 

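Parameters:
    img: the source image to segment; must be 8-bit, 3-channel (CV_8UC3); it is not modified.
    mask: the mask image. With mask initialization it holds the initial labels; user-marked
        foreground and background can also be stored in it before calling grabCut; on return
        it holds the result. Allowed values:
        GC_BGD (=0), background;
        GC_FGD (=1), foreground;
        GC_PR_BGD (=2), probably background;
        GC_PR_FGD (=3), probably foreground.
        If no pixel is hand-marked as GC_BGD or GC_FGD, the computed labels are only
        GC_PR_BGD or GC_PR_FGD.
    rect: the rectangle that bounds the region to segment; with GC_INIT_WITH_RECT everything
        outside it is marked GC_BGD.
    bgdModel: the background model; created internally if empty, otherwise it must be a
        single-channel CV_64FC1 matrix with 1 row and 13x5 = 65 columns.
    fgdModel: the foreground model; created internally if empty, otherwise it must be a
        single-channel CV_64FC1 matrix with 1 row and 13x5 = 65 columns.
    iterCount: the number of iterations; if it is not greater than 0 the function returns
        right after initialization.
    mode: selects what grabCut does:
        GC_INIT_WITH_RECT (=0), initialize GrabCut from the rectangle;
        GC_INIT_WITH_MASK (=1), initialize GrabCut from the mask image;
        GC_EVAL (=2), run the segmentation from the existing mask and models.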

*/  

void cv::grabCut( InputArray _img, InputOutputArray _mask, Rect rect,  

                  InputOutputArray _bgdModel, InputOutputArray _fgdModel,  

                  int iterCount, int mode )  

{  

    Mat img = _img.getMat();  

    Mat& mask = _mask.getMatRef();  

    Mat& bgdModel = _bgdModel.getMatRef();  

    Mat& fgdModel = _fgdModel.getMatRef();  

    if( img.empty() )  

        CV_Error( CV_StsBadArg, "image is empty" );  

    if( img.type() != CV_8UC3 )  

        CV_Error( CV_StsBadArg, "image mush have CV_8UC3 type" );  

    GMM bgdGMM( bgdModel ), fgdGMM( fgdModel );  

    Mat compIdxs( img.size(), CV_32SC1 );  

    if( mode == GC_INIT_WITH_RECT || mode == GC_INIT_WITH_MASK )  

    {  

        if( mode == GC_INIT_WITH_RECT )  

            initMaskWithRect( mask, img.size(), rect );  

        else // flag == GC_INIT_WITH_MASK  

            checkMask( img, mask );  

        initGMMs( img, mask, bgdGMM, fgdGMM );  

    }  

    if( iterCount <= 0)  

        return;  

    if( mode == GC_EVAL )  

        checkMask( img, mask );  

    const double gamma = 50;  

    const double lambda = 9*gamma;  

    const double beta = calcBeta( img );  

    Mat leftW, upleftW, upW, uprightW;  

    calcNWeights( img, leftW, upleftW, upW, uprightW, beta, gamma );  

    for( int i = 0; i < iterCount; i++ )  

    {  

        GCGraph<double> graph;  

        assignGMMsComponents( img, mask, bgdGMM, fgdGMM, compIdxs );  

        learnGMMs( img, mask, compIdxs, bgdGMM, fgdGMM );  

        constructGCGraph(img, mask, bgdGMM, fgdGMM, lambda, leftW, upleftW, upW, uprightW, graph );  

        estimateSegmentation( graph, mask );  

    }  

}