FSAF Model Source Code Walkthrough (TensorFlow + Keras)

FSAF Core Idea

    FSAF stands for Feature Selective Anchor-Free Module for Single-Shot Object Detection.

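    FSAF's contribution is the feature selective anchor-free (FSAF) module: an anchor-free branch whose training lets the model pick, for each instance, the most suitable level of the feature pyramid for regression and classification. Most existing models instead fix the anchor sizes of each level according to its feature map size and then perform regression and classification on that basis. The FSAF module can be added to any single-shot detector with an FPN-style structure, such as RetinaNet or YOLOv3. The overall network structure is shown in Figure 4 of the paper: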

Network structure

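    Online feature selection in FSAF works as follows. Each instance is first assigned to every pyramid level, and the focal loss and IoU loss are computed on each level. The level with the smallest sum of the two losses is selected, and the classification and regression losses used for training are computed on that level. For the loss computation the authors define an effective region and an ignoring region; the per-instance losses are computed over the effective region only. See Figure 6 of the paper.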
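The selection rule described above can be sketched in a few lines of plain Python. This is only an illustration of the argmin step, not the repository's LevelSelect layer: the function name select_level and the loss values are made up for the example, and each per-level loss is assumed to be already averaged over the instance's effective region.

```python
def select_level(cls_losses, reg_losses):
    """Return the index of the pyramid level with the smallest combined loss.

    cls_losses[l] and reg_losses[l] are assumed to be the focal loss and the
    IoU loss of ONE instance on level l, each already averaged over that
    instance's effective region. (Hypothetical helper, not from the repo.)
    """
    if len(cls_losses) != len(reg_losses):
        raise ValueError("need one loss pair per pyramid level")
    # Sum the two losses per level, then take the argmin over levels.
    totals = [c + r for c, r in zip(cls_losses, reg_losses)]
    return min(range(len(totals)), key=totals.__getitem__)


# Made-up per-level losses for one instance on P3..P7.
focal = [0.90, 0.42, 0.31, 0.55, 0.80]
iou = [0.70, 0.35, 0.40, 0.60, 0.95]
best = select_level(focal, iou)
print(f"selected level: P{best + 3}")
```

Only the selected level then receives a training signal for this instance; the other levels treat the instance's location as background.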


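    I find this design more logically sound. The authors also provide a comparison: in most cases the automatically selected level is the same as the level chosen by the size-based heuristic, but there are cases where the two differ. See the comparison in Fig. 8 of the paper.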
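For reference, the size-based heuristic that online selection is compared against can be written as l' = floor(l0 + log2(sqrt(w*h) / 224)). The sketch below assumes l0 = 5 and clamps the result to P3..P7; the function name and the clamping range are assumptions made for illustration, not code from the repository.

```python
import math


def heuristic_level(w, h, l0=5, canonical=224, min_level=3, max_level=7):
    """Size-based level assignment: floor(l0 + log2(sqrt(w*h) / canonical)).

    l0 is the level that a canonical 224x224 instance maps to (assumed 5
    here). The result is clamped to the available pyramid levels, which are
    assumed to be P3..P7.
    """
    level = math.floor(l0 + math.log2(math.sqrt(w * h) / canonical))
    return max(min_level, min(max_level, level))


for size in (32, 112, 224, 1000):
    print(size, "->", f"P{heuristic_level(size, size)}")
```

Unlike the online rule, this assignment depends only on the box size and never looks at how well each level actually fits the instance.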


Source Code Walkthrough

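   This post records my reading of the FSAF source code, both to deepen my understanding of the paper and to improve my own coding by studying well-written code. The source uses tensorflow + keras and adds the FSAF module on top of RetinaNet and YOLOv3; both networks are built with keras, see the source for details. The walkthrough focuses on the FSAF module: the feature selection layer, the loss computation layer, and how bounding boxes are computed at inference.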
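As a preview of the inference step, here is a minimal sketch of how a box can be decoded from the anchor-free regression output, following the description in the paper rather than the repository code. It assumes the branch predicts four offsets (top, left, bottom, right) at each feature-map location, that offsets are normalized by a constant S = 4.0, and that level l has stride 2^l. The function name decode_box is hypothetical, and the repository may differ in details such as pixel-center handling.

```python
def decode_box(x, y, offsets, level, norm=4.0):
    """Decode one predicted box in image coordinates.

    x, y    : location on the feature map of the given pyramid level
    offsets : predicted (top, left, bottom, right) offsets, assumed to be
              distances to the box sides divided by `norm`
    level   : pyramid level l; the feature-map stride is assumed to be 2**l
    """
    o_t, o_l, o_b, o_r = offsets
    stride = 2 ** level
    # Undo the normalization to get distances on the feature map, build the
    # projected box, then scale it back to the image plane.
    x1 = (x - norm * o_l) * stride
    y1 = (y - norm * o_t) * stride
    x2 = (x + norm * o_r) * stride
    y2 = (y + norm * o_b) * stride
    return x1, y1, x2, y2


print(decode_box(10, 20, (0.5, 0.25, 0.5, 0.25), level=3))
```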

Source code: https://github.com/xuannianz/FSAF.git


def fsaf(
        inputs,
        backbone_layers,
        num_classes,
        create_pyramid_features=__create_pyramid_features,
        name='fsaf'
):
    """
    Construct an FSAF training model on top of a backbone.

    The classification and regression losses are computed inside the graph and returned as model outputs.

    Args
        inputs: list of keras.layers.Input: [image, gt_boxes, feature_shapes].
        backbone_layers: The backbone feature maps (C3, C4, C5).
        num_classes: Number of classes to classify.
        create_pyramid_features : Functor for creating pyramid features given the features C3, C4, C5 from the backbone.
        name: Name of the model.

    Returns
        A keras.models.Model which takes the inputs above and outputs
        [cls_loss, regr_loss, batch_cls_pred, batch_regr_pred].
    """
    image_input = inputs[0]
    gt_boxes_input = inputs[1]
    feature_shapes_input = inputs[2]
    submodels = default_fsaf_submodels(num_classes)

    C3, C4, C5 = backbone_layers

    # compute pyramid features as per https://arxiv.org/abs/1708.02002
    # [P3, P4, P5, P6, P7]
    features = create_pyramid_features(C3, C4, C5)
    # for all pyramid levels, run available submodels
    # [(b, sum(fh*fw), 4), (b, sum(fh*fw), num_classes)]
    batch_regr_pred, batch_cls_pred = __build_fsaf_pyramid(submodels, features)
    batch_gt_box_levels = LevelSelect(name='level_select')(
        [batch_cls_pred, batch_regr_pred, feature_shapes_input, gt_boxes_input])
    batch_cls_target, batch_cls_mask, batch_cls_num_pos, batch_regr_target, batch_regr_mask = FSAFTarget(
        num_classes=num_classes,
        name='fsaf_target')(
        [batch_gt_box_levels, feature_shapes_input, gt_boxes_input])
    focal_loss_graph = focal_with_mask()
    iou_loss_graph = iou_with_mask()
    cls_loss = keras.layers.Lambda(focal_loss_graph,
                                   output_shape=(1,),
                                   name="cls_loss")(
        [batch_cls_target, batch_cls_pred, batch_cls_mask, batch_cls_num_pos])
    regr_loss = keras.layers.Lambda(iou_loss_graph,
                                    output_shape=(1,),
                                    name="regr_loss")([batch_regr_target, batch_regr_pred, batch_regr_mask])
    return keras.models.Model(inputs=inputs,
                              outputs=[cls_loss, regr_loss, batch_cls_pred, batch_regr_pred],
                              name=name)

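This function adds the FSAF module on top of the backbone.

1. features = create_pyramid_features(C3, C4, C5) builds the feature pyramid levels.

2. batch_regr_pred, batch_cls_pred = __build_fsaf_pyramid(submodels, features) attaches the bounding box regression and classification submodels to the pyramid levels.

3. batch_gt_box_levels = LevelSelect(name='level_select') builds the online feature selection layer.

4. batch_cls_target, batch_cls_mask, batch_cls_num_pos, batch_regr_target, batch_regr_mask = FSAFTarget(
        num_classes=num_classes,
        name='fsaf_target')

builds the supervision signals for training from the selected levels: on each selected level it generates the targets, including the effective region, from which the losses are computed.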
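To make the effective and ignoring regions concrete, the sketch below labels the cells of one feature map for a single ground-truth box. It is an illustration under stated assumptions, not the repository's FSAFTarget layer: the box is projected to level l by dividing by 2^l, the effective and ignoring boxes are the projected box shrunk around its center by 0.2 and 0.5 (the values used in the paper), and the floor/ceil rounding to cell indices is my own choice. All function names are hypothetical.

```python
import math


def shrink_box(box, scale):
    """Scale a box (x1, y1, x2, y2) around its center by `scale`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    hw, hh = (x2 - x1) * scale / 2, (y2 - y1) * scale / 2
    return cx - hw, cy - hh, cx + hw, cy + hh


def region_labels(gt_box, level, fh, fw, eps_e=0.2, eps_i=0.5):
    """Label an fh x fw feature map: 1 effective, -1 ignored, 0 negative."""
    stride = 2 ** level  # assumed stride of pyramid level `level`
    projected = tuple(v / stride for v in gt_box)
    labels = [[0] * fw for _ in range(fh)]
    # Paint the larger ignoring box first, then overwrite its core with
    # the smaller effective box.
    for value, eps in ((-1, eps_i), (1, eps_e)):
        x1, y1, x2, y2 = shrink_box(projected, eps)
        xs, xe = max(0, math.floor(x1)), min(fw, math.ceil(x2))
        ys, ye = max(0, math.floor(y1)), min(fh, math.ceil(y2))
        for y in range(ys, ye):
            for x in range(xs, xe):
                labels[y][x] = value
    return labels


labels = region_labels((64, 64, 192, 192), level=3, fh=32, fw=32)
flat = [v for row in labels for v in row]
print("effective:", flat.count(1), "ignored:", flat.count(-1))
```

A loss mask then follows directly from these labels: cells labeled -1 are excluded from the loss, and regression is supervised only where the label is 1.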

 

 
