RANSAC: The Random Sample Consensus Algorithm

Introduction to RANSAC

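RANSAC (RAndom SAmple Consensus) is an iterative algorithm for estimating the parameters of a mathematical model from data that contains "outliers". Outliers are typically the noise in the data, such as false matches in feature matching or stray points in curve fitting, so RANSAC can also be viewed as an outlier-detection algorithm. RANSAC is non-deterministic: it produces a reasonable result only with a certain probability, and that probability increases with the number of iterations (why this is so is explained later). The algorithm was first proposed by Fischler and Bolles at SRI to solve the Location Determination Problem (LDP).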
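To make the probability claim concrete, here is a minimal sketch. It assumes that a fraction w of the data are inliers and that each minimal sample of n points is drawn independently; the names w, n, k and p are introduced here for illustration and do not come from the original post. Under those assumptions a single sample is all-inlier with probability w^n, so after k iterations at least one clean sample has been drawn with probability 1 - (1 - w^n)^k, which grows toward 1 as k increases.

```python
import math


def success_probability(w: float, n: int, k: int) -> float:
    """Probability that at least one of k samples of n points is all inliers."""
    return 1.0 - (1.0 - w ** n) ** k


def required_iterations(p: float, w: float, n: int) -> int:
    """Smallest k whose success probability reaches p (needs 0 < p < 1, 0 < w < 1)."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** n))


if __name__ == "__main__":
    # Line fitting (n = 2) on data where half of the points are inliers (w = 0.5)
    for k in (1, 5, 10, 20):
        print(k, round(success_probability(0.5, 2, k), 4))
    print(required_iterations(0.99, 0.5, 2))
```

In practice w is unknown, so a formula like this is mostly used with a conservative guess for the inlier ratio.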

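The basic assumption of RANSAC is that the data consists of "inliers" and "outliers". Inliers are the data points that the model, for some set of parameters, can explain; outliers are the points that do not fit the model. RANSAC further assumes that, given a (usually small) set of inliers, there is a procedure that can estimate a model fitting those inliers.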

Basic Idea and Procedure

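RANSAC repeatedly selects subsets of the data to estimate a model, iterating until it has found a model it considers good enough.
The implementation can be broken down into the following steps: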

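  1. Select the smallest subset of the data from which the model can be estimated (two points for fitting a line, four points for computing a homography matrix);
  2. Use this subset to compute the model;
  3. Substitute all of the data into this model and count the inliers, i.e. the points that fit the model from the current iteration within a given error tolerance;
  4. Compare the inlier count of the current model with that of the best model so far, and record the parameters and inlier count of the model with the most inliers;
  5. Repeat steps 1-4 until the iterations are exhausted or the current model is good enough (the number of inliers exceeds a given count).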
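The steps above can be sketched compactly with NumPy for the line-fitting case. This is only an illustration under stated assumptions: the function name ransac_line and its parameters (threshold, max_trials, enough_inliers, seed) are made up here, and the inlier test uses the vertical residual rather than the perpendicular distance to the line.

```python
import numpy as np


def ransac_line(points, threshold=0.1, max_trials=100, enough_inliers=None, seed=None):
    """Fit y = m*x + b by RANSAC; returns (m, b, inlier_mask)."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_model, best_mask, best_count = None, None, 0

    for _ in range(max_trials):
        # Step 1: minimal sample, two distinct points for a line
        i, j = rng.choice(len(pts), size=2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:  # a vertical line cannot be written as y = m*x + b
            continue

        # Step 2: model from the minimal sample
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1

        # Step 3: count points whose vertical residual is within the threshold
        mask = np.abs(pts[:, 1] - (m * pts[:, 0] + b)) <= threshold
        count = int(mask.sum())

        # Step 4: keep the model with the most inliers so far
        if count > best_count:
            best_model, best_mask, best_count = (m, b), mask, count

        # Step 5: stop early once the best model is considered good enough
        if enough_inliers is not None and best_count >= enough_inliers:
            break

    if best_model is None:
        raise ValueError("no valid model found")
    return best_model[0], best_model[1], best_mask


if __name__ == "__main__":
    # Synthetic data: a noisy line y = 2x + 1 plus uniformly scattered outliers
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 50)
    inliers = np.column_stack([x, 2.0 * x + 1.0 + rng.normal(0.0, 0.01, x.size)])
    outliers = rng.uniform(0.0, 5.0, size=(15, 2))
    m, b, mask = ransac_line(np.vstack([inliers, outliers]), seed=0)
    print(m, b, int(mask.sum()))
```

The complete script that follows takes the same approach with plain Python loops, runs a fixed number of trials without the early exit of step 5, and adds plotting.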
# -*- coding: utf-8 -*-
"""
Created on Mon Jul 30 20:07:19 2018

@author: Yuki
"""
import random
import numpy as np
from matplotlib import pyplot as plt

# Magic Numbers
# Controls the inlier range
THRESHOLD = 0.1

# Finds random potential fit lines
def RANSAC(data):
    n = len(data)

    # Another magic number
    NUM_TRIALS = n // 2

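    best_in_count = 0
    best_m = None
    best_b = None
    for i in range(0, NUM_TRIALS):
        r = random.sample(data, 2)
        r = np.array(r)

        # two points with the same x give a vertical line, which would make
        # lin_reg divide by zero, so skip that sample
        if r[0][0] == r[1][0]:
            continue

        # linear regression on two points will just give the line through both points
        m, b = lin_reg(r)

        # finds the line with the most inliers
        in_count = 0
        for j in data:
            # if the vertical distance between the line and point is less than or equal to THRESHOLD it is an inlier
            if abs(j[1] - ((m * j[0]) + b)) <= THRESHOLD:
                in_count = in_count + 1
        # Tracks the best fit line so far
        if in_count > best_in_count:
            best_in_count = in_count
            best_m = m
            best_b = b

    # no usable sample was drawn (too few points, or only vertical pairs)
    if best_m is None:
        raise ValueError("RANSAC could not fit a line to the data")
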
    # record both inliers and outliers to make end graph pretty
    in_line = []
    out_line = []
    for j in data:
        if abs(j[1] - ((best_m * j[0]) + best_b)) <= THRESHOLD:
            in_line.append(j)
        else:
            out_line.append(j)

    # returns two lists, inliers and outliers
    return in_line, out_line

# performs the linear regression as described on the assignment sheet
def lin_reg(data):
    n = float(len(data))
    x_sum = 0.0
    y_sum = 0.0

    # averages the x and y values
    for i in data:
        x_sum = x_sum + i[0]
        y_sum = y_sum + i[1]
    x_average = x_sum / n
    y_average = y_sum / n

    # initializes slope numerator and denominator
    # note denominator should not be zero with data
    m_numerator = 0.0
    m_denominator = 0.0

    # calculates the slope
    for i in data:
        m_numerator = m_numerator + ((i[0] - x_average)*(i[1] - y_average))
        m_denominator = m_denominator + ((i[0] - x_average)*(i[0] - x_average))
    m = m_numerator / m_denominator

    # finds the intercept
    b = y_average - (m * x_average)

    # returns slope and intercept
    return m, b

def plot_best_fit(data):

    # Get our inlier and outlier points
    in_line, out_line = RANSAC(data)

    # find the best fit line for inliers
    m, b = lin_reg(in_line)
    
    # This was the hardest part
    # Could not find a function that would make a non line segment so I just covered our domain
    x_min = min(i[0] for i in data)
    x_max = max(i[0] for i in data)
    domain = [x_min, x_max]
    line_points = [m * i + b for i in domain]
    # two extra lines with the same slope and the intercept scaled by 0.5 and 1.2
    line_points_top = [m * i + 0.5 * b for i in domain]
    line_points_bottom = [m * i + 1.2 * b for i in domain]
    
    # Plot the inliers as blue dots
    in_line = np.array(in_line)
    x, y = in_line.T
    plt.scatter(x, y)

    # plot the outliers as red x's
    # if statement for if outliers is empty, which it is for the easy case
    if out_line:
        out_line = np.array(out_line)
        x, y = out_line.T
        plt.scatter(x, y, s=30, c='r', marker='x')

    # plot our best fit line
    plt.plot(domain, line_points, '-')    
    plt.plot(domain, line_points_bottom, '-')    
    plt.plot(domain, line_points_top, '-')    
    plt.gca().invert_yaxis()
    # show the plot
    plt.title("Road-Line-Estimation")
    plt.xlabel('1/X')
    plt.ylabel('Laser')
    plt.show()
    
    # return slope and intercept for answers
    return m, b

# ----------------------------------------------------------------------------------------------------
# Test example
'''

data = []

with open('noisy_data_medium.txt') as file:
    # Creates 2D array to hold the data
    for l in file:
        data.append(l.split())

    # removes comma from first entry
    for i in data:
        i[0] = float(i[0][:-1])
        i[1] = float(i[1])

# function also returns slope and intercept should you want them
m, b = plot_best_fit(data)
print(m, b)
'''
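
The file noisy_data_medium.txt is not included here. To try the script, a file in the format the commented-out loader appears to expect (two values per line, with a comma directly after the first one) could be generated as follows. This is a sketch on synthetic data: the function name write_noisy_line and the slope, intercept, noise level and outlier range are illustrative assumptions, not the original dataset.

```python
import random


def write_noisy_line(path, n_inliers=80, n_outliers=20, m=0.5, b=1.0, noise=0.03, seed=0):
    """Write 'x, y' rows: points near y = m*x + b plus uniformly scattered outliers."""
    rng = random.Random(seed)
    rows = []
    # inliers: points on the line with small Gaussian noise in y
    for _ in range(n_inliers):
        x = rng.uniform(0.0, 10.0)
        rows.append((x, m * x + b + rng.gauss(0.0, noise)))
    # outliers: points scattered uniformly over a 10 x 10 square
    for _ in range(n_outliers):
        rows.append((rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)))
    rng.shuffle(rows)
    with open(path, "w") as f:
        for x, y in rows:
            f.write(f"{x}, {y}\n")
    return len(rows)
```

Calling write_noisy_line('noisy_data_medium.txt') and then uncommenting the test example should produce a plot. The noise level is deliberately kept below THRESHOLD so that most of the generated inliers fall inside the inlier band.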

 
