Drawing ROC and PR Curves for Object Detection in Python

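When evaluating a detection model, you usually need to plot its ROC curve or PR curve. This post implements both in Python: the draw_curves function reads the .txt files, draws both curves in one go, and outputs the AUC and mAP values. It targets object detection tasks such as face detection. The code is available via the GitHub link.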

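1 Workflow

The workflow for drawing the curves for an object detection task is as follows:
1. Take each bounding box in the detection results as the unit (let M be the number of detected bounding boxes) and match it against every ground-truth bounding box in the same image. Compute the intersection over union (IoU) for each pair, keep the largest value, maxIoU, and record the confidence score. This gives an array, maxIoU_confidence, with M rows and 2 columns, which is then sorted by confidence in descending order.
2. Set a threshold, typically 0.5. When maxIoU is greater than the threshold, mark it 1, a true positive; otherwise mark it 0, a false positive. This gives tf_confidence, which has the same shape as maxIoU_confidence.
3. Slice the first 1, 2, 3, ..., M rows of tf_confidence from the top. Each slice is a sub-array in which the number of 1s is tp and the number of 0s is fp. Precision = tp / (tp + fp), and recall (which is also the TPR) = tp / (number of ground-truth bounding boxes). Each slice yields one point, so there are M points in total. Plot the ROC curve with fp on the x-axis and TPR on the y-axis; plot the PR curve with recall on the x-axis and precision on the y-axis.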
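
The three steps can be sketched on a handful of made-up numbers. This is a minimal pure-Python illustration, not the program's own code: the function name curve_points and the toy values are assumptions introduced here, and the max IoU of each detection is taken as already computed.

```python
def curve_points(max_ious, confidences, num_groundtruth, threshold=0.5):
    """Return (fp, recall, precision) lists, one entry per confidence cut-off."""
    # Step 1: order the detections by confidence, highest first.
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    # Step 2: a detection is a true positive when its max IoU exceeds the threshold.
    flags = [max_ious[i] > threshold for i in order]
    # Step 3: sweep over the top 1, 2, ..., M detections and count tp and fp.
    fps, recalls, precisions = [], [], []
    tp = fp = 0
    for is_tp in flags:
        if is_tp:
            tp += 1
        else:
            fp += 1
        fps.append(fp)
        recalls.append(tp / num_groundtruth)
        precisions.append(tp / (tp + fp))
    return fps, recalls, precisions


# Hypothetical input: four detections, three ground-truth boxes.
fps, recalls, precisions = curve_points(
    max_ious=[0.8, 0.3, 0.6, 0.1],
    confidences=[0.9, 0.8, 0.7, 0.4],
    num_groundtruth=3,
)
print(fps)         # false-positive counts for the ROC x-axis
print(recalls)     # ROC y-axis and PR x-axis
print(precisions)  # PR y-axis
```

Each index in the three lists is one point on the curves: (fps[i], recalls[i]) for ROC and (recalls[i], precisions[i]) for PR.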

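2 Input

The program reads two .txt files, one holding the detection results and one holding the ground truth. The record format follows the FDDB requirements: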
...
image name i
number of faces in this image =im
face i1
face i2
...
face im
...
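When the detection box is a rectangle, each line face i1 ... face im has the form: left_x top_y width height score

(When the detection box is an ellipse, the format must be: major_axis_radius minor_axis_radius angle center_x center_y score)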
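
To make the layout concrete, here is a hypothetical record set in the rectangle format together with a minimal parser. The image names, coordinates, and scores are invented for illustration, and parse_records is a simplified stand-in written for this example, not the program's own loader.

```python
# Hypothetical content of a results file: two images, the second with no faces.
SAMPLE = """\
img_0001
2
10 20 50 60 0.98
100 40 30 35 0.51
img_0002
0
"""


def parse_records(text):
    """Parse FDDB-style records into a list of (image_name, boxes) tuples."""
    lines = text.splitlines()
    records = []
    i = 0
    while i < len(lines):
        name = lines[i].strip()
        count = int(lines[i + 1])  # number of boxes that follow
        boxes = [[float(v) for v in lines[i + 2 + k].split()] for k in range(count)]
        records.append((name, boxes))
        i += count + 2  # jump to the next image name
    return records


records = parse_records(SAMPLE)
for name, boxes in records:
    print(name, len(boxes), boxes)
```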

3 Implementation

The draw_curves function draws the curves. It takes the two .txt files and an optional IoU threshold as input:

def draw_curves(resultsfile, groundtruthfile, show_images=False, threshold=0.5):
    """
    Read the two .txt files with the detection results and the ground truth, then draw the ROC and PR curves.
    :param resultsfile: .txt file with the detection results
    :param groundtruthfile: .txt file with the ground truth
    :param show_images: whether to display the images; for visualization, edit the code in Calculate.match so it points to the image directory
    :param threshold: IoU threshold
    """
    maxiou_confidence, num_detectedbox, num_groundtruthbox = match(resultsfile, groundtruthfile, show_images)
    tf_confidence = thres(maxiou_confidence, threshold)
    plot(tf_confidence, num_groundtruthbox)


draw_curves("results.txt", "ellipseList.txt")

The functions match, thres, and plot correspond to steps 1, 2, and 3 of the workflow; their code is given in 3.1 to 3.3. Sections 3.4 to 3.6 cover the remaining helper functions.

3.1 Matching

import cv2  # only used when show_images is True
import numpy as np


def match(resultsfile, groundtruthfile, show_images):
    """
    Match detected boxes against ground-truth boxes and keep the maximum IoU for each detected box.
    :param resultsfile: .txt file with the detection results
    :param groundtruthfile: .txt file with the ground truth
    :param show_images: whether to display the images
    :return maxiou_confidence: np.array, maximum IoU and confidence for every detected box
    :return num_detectedbox: int, total number of detected boxes
    :return num_groundtruthbox: int, total number of ground-truth boxes
    """
    results, num_detectedbox = load(resultsfile)
    groundtruth, num_groundtruthbox = load(groundtruthfile)

    # Images are paired by position, so both files must list them in the same order.
    assert len(results) == len(groundtruth), "Count mismatch: ground truth has %d images, detection results have %d" % (
        len(groundtruth), len(results))

    maxiou_confidence = []

    for i in range(len(results)):

        print(results[i][0])

        if show_images:  # visualization only
            fname = './' + results[i][0] + '.jpg'  # change this to the directory that holds the images
            image = cv2.imread(fname)

        for j in range(2, len(results[i])):  # every detected box in this image

            iou_list = []
            detectedbox = results[i][j]
            confidence = detectedbox[-1]

            if show_images:  # draw the detected box
                x_min, y_min = int(detectedbox[0]), int(detectedbox[1])
                x_max = int(detectedbox[0] + detectedbox[2])
                y_max = int(detectedbox[1] + detectedbox[3])
                cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (255, 0, 0), 2)

            for k in range(2, len(groundtruth[i])):  # compare with every ground-truth box in this image
                groundtruthbox = groundtruth[i][k]
                iou = cal_IoU(detectedbox, groundtruthbox)
                iou_list.append(iou)  # collect the IoU values

                if show_images:  # draw the ground-truth box
                    x_min, y_min = int(groundtruthbox[0]), int(groundtruthbox[1])
                    x_max = int(groundtruthbox[0] + groundtruthbox[2])
                    y_max = int(groundtruthbox[1] + groundtruthbox[3])
                    cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)

            # Maximum IoU; 0 when the image has no ground-truth box (np.max would fail on an empty array)
            maxiou = max(iou_list) if iou_list else 0
            maxiou_confidence.append([maxiou, confidence])

        if show_images:
            cv2.imshow("Image", image)
            cv2.waitKey()

    maxiou_confidence = np.array(maxiou_confidence, dtype=float).reshape(-1, 2)
    maxiou_confidence = maxiou_confidence[np.argsort(-maxiou_confidence[:, 1])]  # sort by confidence, descending

    return maxiou_confidence, num_detectedbox, num_groundtruthbox
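
One property of keeping only the maximum IoU per detected box is that several detections may all match the same ground-truth box and all count as true positives. A common alternative convention, used for example in PASCAL VOC style evaluation, lets each ground-truth box be claimed at most once and counts later duplicates as false positives. The following is a minimal sketch of that idea for a single image; match_one_to_one and iou_xywh are hypothetical names introduced here, and this is not what the program above does.

```python
def iou_xywh(a, b):
    """IoU of two axis-aligned boxes given as [left_x, top_y, width, height, ...]."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def match_one_to_one(detections, groundtruths, threshold=0.5):
    """Return [(is_true_positive, confidence), ...] sorted by confidence, descending."""
    taken = [False] * len(groundtruths)  # which ground-truth boxes are already claimed
    results = []
    for det in sorted(detections, key=lambda d: -d[-1]):
        best_iou, best_k = 0.0, -1
        for k, gt in enumerate(groundtruths):
            iou = iou_xywh(det, gt)
            if iou > best_iou:
                best_iou, best_k = iou, k
        # True positive only if the best match is good enough and still unclaimed.
        is_tp = best_k >= 0 and best_iou > threshold and not taken[best_k]
        if is_tp:
            taken[best_k] = True
        results.append((is_tp, det[-1]))
    return results


# Two detections on the same face: only the more confident one is a true positive.
dets = [[1, 1, 10, 10, 0.8], [0, 0, 10, 10, 0.9]]
gts = [[0, 0, 10, 10, 1]]
matched = match_one_to_one(dets, gts)
print(matched)
```

With this convention tp can never exceed the number of ground-truth boxes, so recall stays within [0, 1].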

3.2 Thresholding

import numpy as np


def thres(maxiou_confidence, threshold=0.5):
    """
    Mark every maximum IoU above the threshold as 1 and all others as 0.
    :param maxiou_confidence: np.array, maximum IoU and confidence for every detected box
    :param threshold: IoU threshold
    :return tf_confidence: np.array, tp/fp flag (1 or 0) and confidence for every detected box
    """
    maxious = maxiou_confidence[:, 0]
    confidences = maxiou_confidence[:, 1]
    true_or_false = (maxious > threshold)
    tf_confidence = np.array([true_or_false, confidences])  # booleans are stored as 1.0 / 0.0
    tf_confidence = tf_confidence.T
    tf_confidence = tf_confidence[np.argsort(-tf_confidence[:, 1])]  # keep the descending confidence order
    return tf_confidence

3.3 Plotting

import matplotlib.pyplot as plt
import numpy as np


def plot(tf_confidence, num_groundtruthbox):
    """
    Sweep tf_confidence from the top, compute the curve points, and plot them.
    :param tf_confidence: np.array, tp/fp flag and confidence for every detected box
    :param num_groundtruthbox: int, total number of ground-truth boxes
    """
    flags = tf_confidence[:, 0]
    tp = np.cumsum(flags)       # true positives among the top 1, 2, ..., M rows
    fp = np.cumsum(flags == 0)  # false positives among the top 1, 2, ..., M rows
    fp_list = fp
    recall_list = tp / num_groundtruthbox
    precision_list = tp / (tp + fp)

    # Summary values: averages over the M points, used as a rough stand-in for the
    # areas under the curves. With a single class, "mAP" here is simply the AP.
    auc = np.mean(recall_list)
    mAP = np.sum(precision_list) * np.max(recall_list) / len(recall_list)

    plt.figure()
    plt.title('ROC')
    plt.xlabel('False Positives')
    plt.ylabel('True Positive rate')
    plt.ylim(0, 1)
    plt.plot(fp_list, recall_list, label='AUC: ' + str(auc))
    plt.legend()

    plt.figure()
    plt.title('Precision-Recall')
    plt.xlabel('Recall')
    plt.ylabel('Precision')
    plt.axis([0, 1, 0, 1])
    plt.plot(recall_list, precision_list, label='mAP: ' + str(mAP))
    plt.legend()

    plt.show()
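
The AUC and mAP figures in plot are averages over the M curve points, which matches the true area only when the points are evenly spaced along the x-axis. If an actual area is wanted, one option is the trapezoidal rule, sketched below in pure Python. The helper trapezoid_area and the toy curve values are assumptions made for this example, and benchmarks differ in how they define AP (some use interpolated precision), so treat this as one possible choice rather than the definition.

```python
def trapezoid_area(xs, ys):
    """Area under the piecewise-linear curve through the points (xs[i], ys[i])."""
    area = 0.0
    for i in range(1, len(xs)):
        area += (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
    return area


# Toy curve points: four detections, three ground-truth boxes.
fps = [0, 1, 1, 2]
recalls = [1 / 3, 1 / 3, 2 / 3, 2 / 3]
precisions = [1.0, 0.5, 2 / 3, 0.5]

# PR curve: start at recall 0 with the first precision value.
ap = trapezoid_area([0.0] + recalls, [precisions[0]] + precisions)

# ROC curve: start at the origin and normalize by the largest fp count
# so that the result lies in [0, 1].
roc_area = trapezoid_area([0] + fps, [0.0] + recalls) / max(fps)

print(ap, roc_area)
```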

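3.4 Converting elliptical boxes to rectangles

Detection boxes commonly come as either ellipses or rectangles, and the FDDB ground truth is annotated with ellipses. To compute the IoU, the function below converts an ellipse to a rectangle. This is only an approximation; a better conversion, or a way to compute the IoU of an ellipse and a rectangle directly, is still to be found.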

def ellipse_to_rect(ellipse):
    """
    Convert an elliptical box to an axis-aligned rectangle.
    :param ellipse: list, [major_axis_radius, minor_axis_radius, angle, center_x, center_y, score]
    :return rect: list, [leftx, topy, width, height, score]
    """
    major_axis_radius, minor_axis_radius, angle, center_x, center_y, score = ellipse
    # Approximation: the angle is ignored and the major axis is treated as vertical.
    leftx = center_x - minor_axis_radius
    topy = center_y - major_axis_radius
    width = 2 * minor_axis_radius
    height = 2 * major_axis_radius
    rect = [leftx, topy, width, height, score]
    return rect
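
One candidate for a tighter conversion is the exact axis-aligned bounding box of the rotated ellipse, which follows from the parametric form of an ellipse. The sketch below assumes the angle is given in radians and measured between the major axis and the horizontal axis; verify that convention against the annotation files before relying on it. The function name ellipse_to_tight_rect is introduced here for illustration.

```python
import math


def ellipse_to_tight_rect(ellipse):
    """Axis-aligned bounding box of a rotated ellipse, as [leftx, topy, width, height, score]."""
    major, minor, angle, center_x, center_y, score = ellipse
    # Half-extents of a rotated ellipse along the x and y axes.
    half_w = math.sqrt((major * math.cos(angle)) ** 2 + (minor * math.sin(angle)) ** 2)
    half_h = math.sqrt((major * math.sin(angle)) ** 2 + (minor * math.cos(angle)) ** 2)
    return [center_x - half_w, center_y - half_h, 2 * half_w, 2 * half_h, score]


# Major axis along the horizontal axis: the box is 40 wide and 20 tall.
flat = ellipse_to_tight_rect([20, 10, 0.0, 50, 60, 1])
# Major axis vertical: this reproduces the simple upright approximation.
upright = ellipse_to_tight_rect([20, 10, math.pi / 2, 50, 60, 1])
print(flat, upright)
```

For a nearly upright face the result is close to the approximation used in ellipse_to_rect; the two differ more as the ellipse tilts.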

3.5 Computing IoU

def cal_IoU(detectedbox, groundtruthbox):
    """
    Compute the IoU of two axis-aligned rectangles.
    :param detectedbox: list, [leftx_det, topy_det, width_det, height_det, confidence]
    :param groundtruthbox: list, [leftx_gt, topy_gt, width_gt, height_gt, 1]
    :return iou: intersection over union
    """
    leftx_det, topy_det, width_det, height_det, _ = detectedbox
    leftx_gt, topy_gt, width_gt, height_gt, _ = groundtruthbox

    # Width and height of the overlap region. Using the box edges rather than the
    # center distance also gives the right answer when one box lies inside the other.
    overlap_w = min(leftx_det + width_det, leftx_gt + width_gt) - max(leftx_det, leftx_gt)
    overlap_h = min(topy_det + height_det, topy_gt + height_gt) - max(topy_det, topy_gt)

    if overlap_w <= 0 or overlap_h <= 0:  # the boxes do not intersect
        return 0

    intersection = overlap_w * overlap_h
    union = width_det * height_det + width_gt * height_gt - intersection
    iou = intersection / union
    return iou
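
A quick sanity check for any axis-aligned IoU routine is to run it on three cases with known answers: partial overlap, one box fully inside the other, and no overlap. Containment is the case where a formula built only from the center distance and the summed half-widths would overestimate the intersection. The sketch below uses a stand-alone helper, iou_xywh, written for this example with made-up boxes.

```python
def iou_xywh(a, b):
    """IoU of two axis-aligned boxes given as [left_x, top_y, width, height]."""
    overlap_w = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    overlap_h = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0
    intersection = overlap_w * overlap_h
    union = a[2] * a[3] + b[2] * b[3] - intersection
    return intersection / union


# Partial overlap: intersection 5 * 5 = 25, union 100 + 100 - 25 = 175.
partial = iou_xywh([0, 0, 10, 10], [5, 5, 10, 10])
# Containment: intersection is the small box (area 4), union is the large box (area 100).
contained = iou_xywh([0, 0, 10, 10], [4, 4, 2, 2])
# Disjoint boxes share no area.
disjoint = iou_xywh([0, 0, 10, 10], [20, 20, 5, 5])
print(partial, contained, disjoint)
```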

3.6 Reading the .txt files

def load(txtfile):
    '''
    Read a detection-results or ground-truth file; ellipse coordinates are converted to rectangle coordinates.
    :param txtfile: the .txt file to read, in the same format as FDDB
    :return imagelist: list with one entry per image: image name first, number of faces second, then one list per box holding 4 rectangle coordinates and 1 score
    :return num_allboxes: int, total number of boxes
    '''
    imagelist = []  # information for all images

    with open(txtfile, 'r') as f:
        lines = f.readlines()  # read everything at once into a list

    num_allboxes = 0
    i = 0
    while i < len(lines):  # walk through lines once
        image = []  # information for one image
        image.append(lines[i].strip())  # strip whitespace and the newline, then store the image name
        num_faces = int(lines[i + 1])
        num_allboxes = num_allboxes + num_faces
        image.append(num_faces)  # store the number of faces

        for num in range(num_faces):
            boundingbox = lines[i + 2 + num].strip()  # strip whitespace and the newline
            boundingbox = boundingbox.split()  # split on whitespace
            boundingbox = list(map(float, boundingbox))  # convert to a list of floats

            if len(boundingbox) == 6:  # ellipse coordinates
                boundingbox = ellipse_to_rect(boundingbox)  # convert to rectangle coordinates

            image.append(boundingbox)  # store the rectangle coordinates and the score

        imagelist.append(image)  # store this image's information

        i = i + num_faces + 2  # advance to the line where the next image starts

    return imagelist, num_allboxes