Image Affine Transformations: Principles and Implementation

Original

mazinkaiser1991

2019-10-26 04:53

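This post covers five basic image affine transformations: rotation, translation, shear, scaling, and flip. It analyzes them through the implementation in keras-retinanet, which is fairly typical and easy to follow.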

Official keras-retinanet repository: https://github.com/fizyr/keras-retinanet.git

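All five transformations live in utils/transform.py. In the code they are used for data augmentation in the object detection task. (PS: in my view only translation, scaling, and flip are really suitable for object detection. After rotation or shear, the axis-aligned bounding box that encloses the object can grow and fit the object less tightly, and I suspect this makes bounding box regression less accurate.)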
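To make that concern concrete, here is a minimal sketch that pushes the four corners of a box through the rotation matrix presented in section 1 and takes the axis-aligned box around the result. The helpers transform_box and box_area and the box coordinates are hypothetical names and values chosen for illustration; they are not taken from the library.

```python
import numpy as np


def rotation(angle):
    """Homogeneous 2D rotation matrix (angle in radians)."""
    return np.array([
        [np.cos(angle), -np.sin(angle), 0],
        [np.sin(angle),  np.cos(angle), 0],
        [0, 0, 1],
    ])


def transform_box(matrix, box):
    """Hypothetical helper: map an (x1, y1, x2, y2) box through a 3x3 matrix
    and return the axis-aligned box enclosing the four transformed corners."""
    x1, y1, x2, y2 = box
    corners = np.array([
        [x1, x2, x1, x2],
        [y1, y1, y2, y2],
        [1, 1, 1, 1],
    ])
    moved = matrix @ corners
    xs, ys = moved[0], moved[1]
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()])


def box_area(box):
    """Area of an (x1, y1, x2, y2) box."""
    return (box[2] - box[0]) * (box[3] - box[1])


box = (0.0, 0.0, 100.0, 50.0)                          # illustrative box
rotated_box = transform_box(rotation(np.pi / 4), box)  # rotate by 45 degrees

print(box_area(box), box_area(rotated_box))
```

In this example the enclosing box after a 45 degree rotation has more than twice the area of the original, even though the object inside it has not changed size.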

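1. Rotation

Compared with the image rotation implemented in OpenCV, the rotation that ships with retinanet is simpler, most likely for the sake of efficiency. Because it is implemented in numpy, it can handle multiple sets of images at once. OpenCV's rotation is more elaborate: besides rotating about the image center, it can rotate about an arbitrary point in the image and adjust the scale at the same time (see cv2.getRotationMatrix2D).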

import numpy as np

def rotation(angle):
    """ Construct a homogeneous 2D rotation matrix.
    Args
        angle: the angle in radians
    Returns
        the rotation matrix as 3 by 3 numpy array
    """
    return np.array([
        [np.cos(angle), -np.sin(angle), 0],
        [np.sin(angle),  np.cos(angle), 0],
        [0, 0, 1]
    ])

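A 2×2 matrix would be enough for rotation alone. The 3×3 matrix expresses the rotation in homogeneous form, so that it can be combined with translation through plain matrix multiplication.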
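The homogeneous form is also what makes rotation about an arbitrary point possible: move the point to the origin, rotate, and move it back. The sketch below shows this composition. The helper name rotation_about and the sample coordinates are assumptions made for illustration, not code quoted from the library.

```python
import numpy as np


def rotation(angle):
    """Homogeneous 2D rotation matrix (angle in radians)."""
    return np.array([
        [np.cos(angle), -np.sin(angle), 0],
        [np.sin(angle),  np.cos(angle), 0],
        [0, 0, 1],
    ])


def translation(offset):
    """Homogeneous 2D translation matrix for a 2D offset vector."""
    return np.array([
        [1, 0, offset[0]],
        [0, 1, offset[1]],
        [0, 0, 1],
    ])


def rotation_about(angle, center):
    """Hypothetical helper: rotate about `center` instead of the origin."""
    center = np.asarray(center, dtype=float)
    # Read right to left: shift center to origin, rotate, shift back.
    return translation(center) @ rotation(angle) @ translation(-center)


center = (50.0, 20.0)                       # illustrative rotation center
matrix = rotation_about(np.pi / 2, center)  # 90 degrees

fixed = matrix @ np.array([50.0, 20.0, 1.0])  # the center itself
moved = matrix @ np.array([60.0, 20.0, 1.0])  # a point 10 units to its right

print(fixed, moved)
```

The center maps to itself, while the neighbouring point swings around it. With 2×2 matrices alone the two shifts could not be folded into the same product.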

2. Translation

def translation(translation):
    """ Construct a homogeneous 2D translation matrix.
    Args
        translation: the translation 2D vector
    Returns
        the translation matrix as 3 by 3 numpy array
    """
    return np.array([
        [1, 0, translation[0]],
        [0, 1, translation[1]],
        [0, 0, 1]
    ])
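As a small illustration of how such a matrix is used, the sketch below applies one translation to several points in a single matrix product. Storing the points as columns with a trailing 1 is an assumed layout for this example, and the coordinates are made up.

```python
import numpy as np


def translation(offset):
    """Homogeneous 2D translation matrix for a 2D offset vector."""
    return np.array([
        [1, 0, offset[0]],
        [0, 1, offset[1]],
        [0, 0, 1],
    ])


# Three points stored as columns: (0, 0), (10, 0), (10, 5).
# The last row of ones is the homogeneous coordinate.
points = np.array([
    [0, 10, 10],
    [0,  0,  5],
    [1,  1,  1],
])

# Shift every point by +3 in x and -2 in y with one multiplication.
shifted = translation((3, -2)) @ points

print(shifted)
```

The third column of the matrix is added to each point only because the homogeneous coordinate is 1, which is why translation needs the 3×3 form.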

3. Shear

def shear(angle):
    """ Construct a homogeneous 2D shear matrix.
    Args
        angle: the shear angle in radians
    Returns
        the shear matrix as 3 by 3 numpy array
    """
    return np.array([
        [1, -np.sin(angle), 0],
        [0,  np.cos(angle), 0],
        [0, 0, 1]
    ])
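One way to read this matrix is to apply it to the corners of the unit square. The sketch below does that for an illustrative angle of 30 degrees; the corner layout (points as columns) is an assumption of the example.

```python
import numpy as np


def shear(angle):
    """Homogeneous 2D shear matrix (angle in radians)."""
    return np.array([
        [1, -np.sin(angle), 0],
        [0,  np.cos(angle), 0],
        [0, 0, 1],
    ])


angle = np.pi / 6  # 30 degrees, chosen for illustration

# Unit square corners as columns: (0, 0), (1, 0), (0, 1), (1, 1).
corners = np.array([
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
])

sheared = shear(angle) @ corners

# The determinant gives the area scale factor of the transform.
area_scale = np.linalg.det(shear(angle))

print(sheared)
print(area_scale)
```

The x axis is left untouched, while the unit vector along y is tilted to (-sin(angle), cos(angle)). The square becomes a parallelogram whose area is cos(angle) times the original, so this form of shear also shrinks the shape slightly.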

4. Scaling

def scaling(factor):
    """ Construct a homogeneous 2D scaling matrix.
    Args
        factor: a 2D vector for X and Y scaling
    Returns
        the zoom matrix as 3 by 3 numpy array
    """
    return np.array([
        [factor[0], 0, 0],
        [0, factor[1], 0],
        [0, 0, 1]
    ])

5. Flip

Flipping is also implemented with scaling: multiplying an axis by +1 leaves it unchanged, and multiplying it by -1 flips it.

DEFAULT_PRNG = np.random  # default generator, defined at module level in transform.py

def random_flip(flip_x_chance, flip_y_chance, prng=DEFAULT_PRNG):
    """ Construct a transformation randomly containing X/Y flips (or not).
    Args
        flip_x_chance: The chance that the result will contain a flip along the X axis.
        flip_y_chance: The chance that the result will contain a flip along the Y axis.
        prng:          The pseudo-random number generator to use.
    Returns
        a homogeneous 3 by 3 transformation matrix
    """
    flip_x = prng.uniform(0, 1) < flip_x_chance
    flip_y = prng.uniform(0, 1) < flip_y_chance
    # 1 - 2 * bool gives 1 for False and -1 for True.
    return scaling((1 - 2 * flip_x, 1 - 2 * flip_y))
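A scaling by -1 mirrors coordinates about the origin, so a flipped point lands at negative coordinates unless the transform is re-centered on the image. The sketch below shows one way to do that by wrapping the flip in two translations. The image size and the variable names are assumptions for illustration, and the chances 1.0 and 0.0 are chosen only to make the random call deterministic.

```python
import numpy as np


def translation(offset):
    """Homogeneous 2D translation matrix for a 2D offset vector."""
    return np.array([
        [1, 0, offset[0]],
        [0, 1, offset[1]],
        [0, 0, 1],
    ])


def scaling(factor):
    """Homogeneous 2D scaling matrix for per-axis factors."""
    return np.array([
        [factor[0], 0, 0],
        [0, factor[1], 0],
        [0, 0, 1],
    ])


def random_flip(flip_x_chance, flip_y_chance, prng=np.random):
    """Randomly flip along X and/or Y, expressed as a scaling by +1/-1."""
    flip_x = prng.uniform(0, 1) < flip_x_chance
    flip_y = prng.uniform(0, 1) < flip_y_chance
    # 1 - 2 * bool gives 1 for False and -1 for True.
    return scaling((1 - 2 * flip_x, 1 - 2 * flip_y))


# uniform(0, 1) is always below 1.0 and never below 0.0,
# so this call always flips X and never flips Y.
flip = random_flip(1.0, 0.0)

width, height = 200, 100                    # assumed image size
center = np.array([width / 2, height / 2])

# Shift the image center to the origin, flip, then shift back.
flip_in_image = translation(center) @ flip @ translation(-center)

corner = flip_in_image @ np.array([0.0, 0.0, 1.0])  # top-left corner

print(flip)
print(corner)
```

With the re-centering, the top-left corner moves to the top-right corner of the image instead of leaving the image area.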
