A Survey of Data Augmentation Methods with Python + OpenCV (with Code Walkthrough)

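Data augmentation is a technique for increasing the diversity of a dataset without collecting more real data, while still helping to improve model accuracy and prevent overfitting. In this article, you will learn how to implement the most popular and effective data augmentation procedures for object detection tasks using Python and OpenCV.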

The data augmentation methods covered are:

Random Crop
Cutout
ColorJitter
Adding Noise
Filtering

First, let's import a few libraries and prepare some necessary helper routines.

import os
import cv2
import numpy as np
import random

def file_lines_to_list(path):
    '''
    ### Convert Lines in TXT File to List ###
    path: path to file
    '''
    with open(path) as f:
        content = f.readlines()
    content = [(x.strip()).split() for x in content]
    return content

def get_file_name(path):
    '''
    ### Get Filename of Filepath ###
    path: path to file
    '''
    basename = os.path.basename(path)
    onlyname = os.path.splitext(basename)[0]
    return onlyname

def write_anno_to_txt(boxes, filepath):
    '''
    ### Write Annotation to TXT File ###
    boxes: format [[obj x1 y1 x2 y2],...]
    filepath: path/to/file.txt
    '''
    with open(filepath, "w") as txt_file:
        for box in boxes:
            print(box[0], int(box[1]), int(box[2]), int(box[3]), int(box[4]), file=txt_file)
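
As a quick check of the annotation format these helpers assume (one object per line, `obj x1 y1 x2 y2`), the round trip below writes two boxes to a temporary file and reads them back. It is only a minimal sketch: `save_boxes` and `load_boxes` are hypothetical stand-ins that mirror `write_anno_to_txt` and `file_lines_to_list`, and the labels and coordinates are made up for illustration.

```python
import os
import tempfile


def save_boxes(boxes, filepath):
    # Hypothetical stand-in for write_anno_to_txt: one "obj x1 y1 x2 y2" line per box
    with open(filepath, "w") as f:
        for obj, x1, y1, x2, y2 in boxes:
            print(obj, int(x1), int(y1), int(x2), int(y2), file=f)


def load_boxes(filepath):
    # Hypothetical stand-in for file_lines_to_list, with coordinates parsed as ints
    with open(filepath) as f:
        rows = [line.split() for line in f if line.strip()]
    return [[r[0]] + [int(v) for v in r[1:5]] for r in rows]


# Made-up annotations; float coordinates are truncated to ints on write
boxes = [["car", 10, 20.7, 110, 220], ["person", 5, 5, 50, 80]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    save_boxes(boxes, path)
    restored = load_boxes(path)

print(restored)
```

Note that `file_lines_to_list` itself returns every field as a string, which is why the augmentation functions in this article call `int(...)` on each coordinate before using it.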

The image below is used as the sample image throughout this article.

Random Crop

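Random crop: randomly select a region and crop it out to form a new data sample. The cropped region should have the same aspect ratio as the original image in order to preserve the shape of the objects.

In the figure above, the image on the left is the original image with ground-truth bounding boxes (in red), and the image on the right is the new sample created by cropping the region inside the orange box.

In the annotation of the new sample, all objects that do not overlap the orange box in the left image are removed, and the coordinates of objects lying on the boundary of the orange box are clipped so that they fit the new sample. The output of a random crop on the original image is the new cropped image and its annotation.

def randomcrop(img, gt_boxes, scale=0.5):
    '''
    ### Random Crop ###
    img: image
    gt_boxes: format [[obj x1 y1 x2 y2],...]
    scale: ratio of crop side length to image side length
    '''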
    
    # Crop image
    height, width = int(img.shape[0]*scale), int(img.shape[1]*scale)
    x = random.randint(0, img.shape[1] - int(width))
    y = random.randint(0, img.shape[0] - int(height))
    cropped = img[y:y+height, x:x+width]
    resized = cv2.resize(cropped, (img.shape[1], img.shape[0]))
    
    # Modify annotation
    new_boxes=[]
    for box in gt_boxes:
        obj_name = box[0]
        x1 = int(box[1])
        y1 = int(box[2])
        x2 = int(box[3])
        y2 = int(box[4])
        x1, x2 = x1-x, x2-x
        y1, y2 = y1-y, y2-y
        x1, y1, x2, y2 = x1/scale, y1/scale, x2/scale, y2/scale
        if (x1<img.shape[1] and y1<img.shape[0]) and (x2>0 and y2>0):
            if x1<0: x1=0
            if y1<0: y1=0
            if x2>img.shape[1]: x2=img.shape[1]
            if y2>img.shape[0]: y2=img.shape[0]
            new_boxes.append([obj_name, x1, y1, x2, y2])
    return resized, new_boxes
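
The annotation update can be checked by hand for a fixed crop window. The sketch below isolates the per-box arithmetic (shift by the crop origin, divide by `scale`, then drop or clip) in a hypothetical helper `remap_box`; the image size, crop origin, and boxes are assumed values chosen so the three cases (kept, dropped, clipped) are easy to verify.

```python
def remap_box(box, crop_x, crop_y, scale, img_w, img_h):
    """Map one [obj, x1, y1, x2, y2] box into the resized crop, or return None."""
    obj, x1, y1, x2, y2 = box
    # Shift into crop coordinates, then account for resizing the crop to full size
    x1, x2 = (x1 - crop_x) / scale, (x2 - crop_x) / scale
    y1, y2 = (y1 - crop_y) / scale, (y2 - crop_y) / scale
    # Drop boxes that lie entirely outside the crop
    if x1 >= img_w or y1 >= img_h or x2 <= 0 or y2 <= 0:
        return None
    # Clip boxes that straddle the crop boundary
    return [obj, max(x1, 0), max(y1, 0), min(x2, img_w), min(y2, img_h)]


# Assumed setup: 100x100 image, scale=0.5, crop window starting at (20, 20)
inside = remap_box(["a", 30, 30, 60, 60], 20, 20, 0.5, 100, 100)
outside = remap_box(["b", 0, 0, 10, 10], 20, 20, 0.5, 100, 100)
clipped = remap_box(["c", 60, 60, 90, 90], 20, 20, 0.5, 100, 100)
print(inside, outside, clipped)
```

With these numbers, box `a` lands fully inside the new sample, box `b` falls outside the crop and is dropped, and box `c` is cut off at the right and bottom edges.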

Cutout

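Cutout was introduced in 2017 by Terrance DeVries and Graham W. Taylor in their paper. It is a simple regularization technique that randomly masks out square regions of the input during training, and it can be used to improve the robustness and overall performance of convolutional neural networks. The method is not only very easy to implement; the authors also show that it can be combined with existing forms of data augmentation and other regularizers to further improve model performance.

Paper: https://arxiv.org/abs/1708.04552

In the paper, Cutout was used to improve accuracy on image recognition (classification). If we apply the same scheme directly to an object detection dataset, it can cause objects to be lost, especially small ones. In the figure below, a large number of small objects inside the cutout region (the black area) are erased, which is not in the spirit of data augmentation.

To make the method suitable for object detection, we can make a simple modification: instead of using a single mask placed at a random position in the image, we randomly select half of the objects and apply a cutout to each selected object's region, which works better. The augmented image is shown in the figure below.

The output of Cutout is a newly generated image. Since we neither remove objects nor change the image size, the annotation of the generated image is the same as the original annotation.

def cutout(img, gt_boxes, amount=0.5):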
    '''
    ### Cutout ###
    img: image
    gt_boxes: format [[obj x1 y1 x2 y2],...]
    amount: num of masks / num of objects 
    '''
    out = img.copy()
    ran_select = random.sample(gt_boxes, round(amount*len(gt_boxes)))

    for box in ran_select:
        x1 = int(box[1])
        y1 = int(box[2])
        x2 = int(box[3])
        y2 = int(box[4])
        mask_w = int((x2 - x1)*0.5)
        mask_h = int((y2 - y1)*0.5)
        mask_x1 = random.randint(x1, x2 - mask_w)
        mask_y1 = random.randint(y1, y2 - mask_h)
        mask_x2 = mask_x1 + mask_w
        mask_y2 = mask_y1 + mask_h
        cv2.rectangle(out, (mask_x1, mask_y1), (mask_x2, mask_y2), (0, 0, 0), thickness=-1)
    return out
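
To get a feel for how much of an object each mask hides, the sketch below applies the same half-width, half-height mask to a single box using NumPy slicing instead of `cv2.rectangle`. It is an illustration under assumptions: `mask_inside_box` is a hypothetical helper, the image is synthetic, and the box coordinates are made up.

```python
import random

import numpy as np


def mask_inside_box(img, box, rng):
    """Black out a half-width, half-height patch at a random spot inside the box."""
    x1, y1, x2, y2 = box
    mask_w, mask_h = (x2 - x1) // 2, (y2 - y1) // 2
    mx = rng.randint(x1, x2 - mask_w)
    my = rng.randint(y1, y2 - mask_h)
    out = img.copy()
    # NumPy slicing excludes the end index, so exactly mask_w x mask_h pixels are zeroed
    out[my:my + mask_h, mx:mx + mask_w] = 0
    return out


rng = random.Random(0)                               # fixed seed, for repeatability
img = np.full((100, 100, 3), 255, dtype=np.uint8)    # synthetic all-white image
box = (20, 20, 60, 60)                               # made-up object box
out = mask_inside_box(img, box, rng)

region = out[20:60, 20:60, 0]
covered = float((region == 0).mean())
print(covered)
```

Because the mask is half the box in each dimension, it hides a quarter of the box area, so most of each selected object stays visible and its annotation remains meaningful.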

ColorJitter

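ColorJitter is another simple type of image data augmentation, in which we randomly change the brightness, contrast, and saturation of the image. I believe this one is easy for most readers to understand.

def colorjitter(img, cj_type="b"):
    '''
    ### Different Color Jitter ###
    img: image
    cj_type: {b: brightness, s: saturation, c: contrast}
    '''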
    if cj_type == "b":
        # value = random.randint(-50, 50)
        value = np.random.choice(np.array([-50, -40, -30, 30, 40, 50]))
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        h, s, v = cv2.split(hsv)
        if value >= 0:
            lim = 255 - value
            v[v > lim] = 255
            v[v <= lim] += value
        else:
            lim = np.absolute(value)
            v[v < lim] = 0
            v[v >= lim] -= np.absolute(value)

        final_hsv = cv2.merge((h, s, v))
        img = cv2.cvtColor(final_hsv, cv2.COLOR_HSV2BGR)
        return img
    
    elif cj_type == "s":
        # value = random.randint(-50, 50)
        value = np.random.choice(np.array([-50, -40, -30, 30, 40, 50]))
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        h, s, v = cv2.split(hsv)
        if value >= 0:
            lim = 255 - value
            s[s > lim] = 255
            s[s <= lim] += value
        else:
            lim = np.absolute(value)
            s[s < lim] = 0
            s[s >= lim] -= np.absolute(value)

        final_hsv = cv2.merge((h, s, v))
        img = cv2.cvtColor(final_hsv, cv2.COLOR_HSV2BGR)
        return img
    
    elif cj_type == "c":
        brightness = 10
        contrast = random.randint(40, 100)
        dummy = np.int16(img)
        dummy = dummy * (contrast/127+1) - contrast + brightness
        dummy = np.clip(dummy, 0, 255)
        img = np.uint8(dummy)
        return img
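
The brightness and saturation branches both perform a saturating shift on one HSV channel: values that would overflow are pinned to 255, and values that would underflow are pinned to 0. An equivalent way to express this, shown here only as a sketch, is to widen the channel to a signed type, add the offset, and clip. `shift_channel` is a hypothetical helper and is not part of the original code.

```python
import numpy as np


def shift_channel(channel, value):
    """Add value to a uint8 channel, saturating at 0 and 255 instead of wrapping."""
    widened = channel.astype(np.int16) + int(value)
    return np.clip(widened, 0, 255).astype(np.uint8)


v = np.array([0, 100, 250], dtype=np.uint8)   # made-up channel values
brighter = shift_channel(v, 30)
darker = shift_channel(v, -50)
print(brighter, darker)
```

Widening first matters because adding directly to a `uint8` array can wrap around (for example 250 + 30 becoming 24), which is what the masked updates in `colorjitter` are written to avoid.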

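Adding Noise

Noise is usually regarded as an unwanted factor in an image. However, several types of noise (for example Gaussian noise and salt-and-pepper noise) can be used for data augmentation, and in deep learning, adding noise is a very simple and useful augmentation method. In the example below, Gaussian noise and salt-and-pepper noise are added to the original image to augment the data.

For those who cannot tell the difference between Gaussian noise and salt-and-pepper noise: the values of Gaussian noise range from 0 to 255 depending on the configuration, so in an RGB image a Gaussian-noise pixel can be any color. In contrast, a salt-and-pepper noise pixel can take only two values, 0 or 255, corresponding to black (pepper) or white (salt).

def noisy(img, noise_type="gauss"):
    '''
    ### Adding Noise ###
    img: image
    noise_type: {gauss: gaussian, sp: salt & pepper}
    '''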
    if noise_type == "gauss":
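        image = img.copy()
        mean = 0
        st = 0.7
        gauss = np.random.normal(mean, st, image.shape)
        # Caution: the uint8 cast truncates samples with |x| < 1 to 0, and the
        # result for negative samples is platform-dependent (they may wrap)
        gauss = gauss.astype('uint8')
        # cv2.add saturates at 255 instead of wrapping around
        image = cv2.add(image, gauss)
        return image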
    
    elif noise_type == "sp":
        image=img.copy() 
        prob = 0.05
        if len(image.shape) == 2:
            black = 0
            white = 255            
        else:
            colorspace = image.shape[2]
            if colorspace == 3:  # RGB
                black = np.array([0, 0, 0], dtype='uint8')
                white = np.array([255, 255, 255], dtype='uint8')
            else:  # RGBA
                black = np.array([0, 0, 0, 255], dtype='uint8')
                white = np.array([255, 255, 255, 255], dtype='uint8')
        probs = np.random.random(image.shape[:2])
        image[probs < (prob / 2)] = black
        image[probs > 1 - (prob / 2)] = white
        return image
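
The Gaussian branch above converts the noise to `uint8` before adding it, which discards the fractional part and leaves the handling of negative samples to the cast. One possible alternative, which is not the original author's method, is to add the noise in floating point and clip afterwards. The sketch below shows that variant next to a NumPy-only salt-and-pepper function; `gaussian_noise_float`, `salt_pepper`, and `sigma=15.0` are assumptions made for illustration.

```python
import numpy as np


def gaussian_noise_float(img, sigma=15.0, rng=None):
    """Add zero-mean Gaussian noise in float, then round and clip back to uint8."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, img.shape)
    noisy_img = img.astype(np.float64) + noise
    return np.clip(np.rint(noisy_img), 0, 255).astype(np.uint8)


def salt_pepper(img, prob=0.05, rng=None):
    """Set about prob/2 of the pixels to black and prob/2 to white."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    draws = rng.random(img.shape[:2])      # one draw per pixel, shared by all channels
    out[draws < prob / 2] = 0
    out[draws > 1 - prob / 2] = 255
    return out


rng = np.random.default_rng(0)                       # fixed seed, for repeatability
gray = np.full((64, 64, 3), 128, dtype=np.uint8)     # synthetic mid-gray image
g = gaussian_noise_float(gray, rng=rng)
sp = salt_pepper(gray, rng=rng)
changed = float((sp != 128).any(axis=2).mean())
print(g.mean(), changed)
```

On the flat gray test image, the Gaussian variant keeps the mean close to 128 while spreading values on both sides of it, and the salt-and-pepper variant alters roughly `prob` of the pixels, leaving the rest untouched.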

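Filtering

The last data augmentation procedure covered in this article is filtering. Like adding noise, filtering is simple and easy to implement. The three types of filtering used in the implementation are blur (mean), Gaussian, and median.

def filters(img, f_type = "blur"):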
    '''
    ### Filtering ###
    img: image
    f_type: {blur: blur, gaussian: gaussian, median: median}
    '''
    if f_type == "blur":
        image=img.copy()
        fsize = 9
        return cv2.blur(image,(fsize,fsize))
    
    elif f_type == "gaussian":
        image=img.copy()
        fsize = 9
        return cv2.GaussianBlur(image, (fsize, fsize), 0)
    
    elif f_type == "median":
        image=img.copy()
        fsize = 9
        return cv2.medianBlur(image, fsize)
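
To see how the mean and median filters differ, the sketch below applies a 3x3 window in plain NumPy to a tiny synthetic patch containing one bright pixel. The 3x3 size is an assumption chosen for readability (the code above uses 9x9), `filter3x3` is a hypothetical helper, and border pixels are simply left unchanged, which is a simplification rather than what OpenCV does.

```python
import numpy as np


def filter3x3(img, reducer):
    """Apply reducer (e.g. np.mean or np.median) over each 3x3 window of the interior."""
    out = img.astype(np.float64).copy()
    for y in range(1, img.shape[0] - 1):
        for x in range(1, img.shape[1] - 1):
            out[y, x] = reducer(img[y - 1:y + 2, x - 1:x + 2])
    return out


# Synthetic 5x5 grayscale patch with a single bright pixel in the middle
patch = np.zeros((5, 5), dtype=np.uint8)
patch[2, 2] = 255

mean_out = filter3x3(patch, np.mean)
median_out = filter3x3(patch, np.median)
print(mean_out[2, 2], median_out[2, 2])
```

The mean filter spreads the bright pixel over its neighbourhood (each affected pixel receives 255/9), while the median filter removes it entirely, so the two filters produce visibly different augmented samples from the same input.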

Summary

This article presented a tutorial on implementing data augmentation for object detection tasks. The full implementation can be found here.

https://github.com/tranleanh/data-augmentation

  
       
       
       
   
        
        
This article was shared from the WeChat public account AI算法與圖像處理 (AI_study).