Common data augmentation methods: a summary (dataset expansion)

 

Reference: https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247497597&idx=3&sn=33d62f4f12bb0985185409b1ec6b7aae&chksm=ec1c1a84db6b939251ca7949fc4d9ad8f2a6135fd52f934ead485c9b6566f4a0ab3e37ada363&mpshare=1&scene=24&srcid=&sharer_sharetime=1591629399131&sharer_shareid=cc5ffb1d306d67c81444a3aa7b0ae74c#rd

Datasets play a vital role in image processing and deep learning. Small-sample image datasets need to be expanded, and existing deep learning algorithms can also be used to augment an existing dataset. Data augmentation is one of the most common techniques in deep learning: it enlarges the training set and makes it as diverse as possible, so that the trained model generalizes better. Common augmentations include horizontal/vertical flipping, rotation, scaling, cropping, shearing, translation, contrast adjustment, color jitter, and noise. In traditional image processing, geometric transformations are typically used for augmentation; common methods include scaling, translation, rotation, and affine transformation.
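As a concrete illustration (my own sketch, not from the referenced article), the following minimal OpenCV program applies three of the augmentations listed above to a hypothetical input file input.jpg: a horizontal flip, a small rotation, and a simple contrast/brightness jitter:

#include <opencv2/opencv.hpp>

using namespace cv;

int main()
{
    Mat src = imread("input.jpg"); // hypothetical input file
    if (src.empty()) return -1;

    // horizontal flip (flipCode 1; 0 would flip vertically)
    Mat flipped;
    flip(src, flipped, 1);

    // rotate 15 degrees counter-clockwise about the image center, same output size
    Mat rotated;
    Mat r = getRotationMatrix2D(Point2f(src.cols / 2.0f, src.rows / 2.0f), 15, 1.0);
    warpAffine(src, rotated, r, src.size());

    // contrast/brightness jitter: dst = 1.2 * src + 10
    Mat jittered;
    src.convertTo(jittered, -1, 1.2, 10);

    imwrite("flip.jpg", flipped);
    imwrite("rot15.jpg", rotated);
    imwrite("jitter.jpg", jittered);
    return 0;
}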

Forward and backward mapping

1. Forward mapping

A geometric transformation of an image establishes a mapping between the pixels of the source image and the pixels of the transformed image. Through this mapping we can find the transformed coordinates of any pixel of the original image, or, conversely, the position in the original image that corresponds to a pixel of the transformed image.

In simple mathematical form this can be written as

  x = U(u, v), y = V(u, v), i.e. (x, y) = f(u, v)

where x and y are the coordinates of an output pixel, u and v are the coordinates of an input pixel, and U and V are the two mapping relations; f is the mapping that takes the point (u, v) to (x, y). Note that the mapping may be linear or polynomial.

From this mapping relation it can be seen that, given the coordinates of any pixel of the input image, the pixel coordinates after the geometric transformation can be obtained through the corresponding mapping.

This process of mapping the input to the output is called "forward mapping". In practice, however, forward mapping has the following problems:

  1. Floating-point coordinates: for example, (1, 1) may map to (0.5, 0.5), which is clearly not a valid pixel coordinate, so an interpolation algorithm is needed for further processing.

  2. Several input pixel coordinates may map to the same position in the output image, while some positions of the output image may receive no input pixel at all, i.e. they are never mapped to, producing regular holes (a black honeycomb pattern).

 

As the figure shows, after a rotation of thirty degrees, two red points of the input are mapped to the same output coordinate, while no point is mapped to the green question mark. This creates gaps and overlaps, which lead to the honeycomb holes; a small sketch demonstrating the effect follows below.
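To make the holes concrete, here is a small sketch of my own (assuming a hypothetical grayscale input.jpg): every source pixel is forward-mapped through a 30-degree rotation with rounded coordinates, and the program then reports how many output pixels were actually written. The unwritten ones are the honeycomb holes.

#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>

using namespace cv;

int main()
{
    Mat src = imread("input.jpg", IMREAD_GRAYSCALE); // hypothetical input file
    if (src.empty()) return -1;

    double a = 30.0 * CV_PI / 180.0; // 30-degree rotation
    Mat dst = Mat::zeros(src.size(), src.type());
    Mat hit = Mat::zeros(src.size(), CV_8U); // marks output pixels that get written

    for (int v = 0; v < src.rows; ++v)
        for (int u = 0; u < src.cols; ++u) {
            // forward mapping: (u, v) -> (x, y), rounded to integer coordinates
            int x = cvRound(u * std::cos(a) - v * std::sin(a));
            int y = cvRound(u * std::sin(a) + v * std::cos(a));
            if (x >= 0 && x < dst.cols && y >= 0 && y < dst.rows) {
                dst.at<uchar>(y, x) = src.at<uchar>(v, u); // overlaps simply overwrite
                hit.at<uchar>(y, x) = 1;
            }
        }

    std::cout << countNonZero(hit) << " of " << dst.total()
              << " output pixels were written" << std::endl;
    imwrite("forward_holes.jpg", dst);
    return 0;
}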

2. Backward mapping

To overcome these shortcomings of forward mapping, "backward mapping" is introduced. Its mathematical expression is

  u = U'(x, y), v = V'(x, y), i.e. (u, v) = f⁻¹(x, y)

As can be seen, backward mapping is the opposite of forward mapping: starting from the pixel coordinates of the output image, it works back to the corresponding position in the source image. In this way every pixel of the output image finds its position in the source through the mapping, so neither the incomplete mapping nor the overlapping mapping described above can occur.

In practice, image geometric transformations are almost always implemented with backward mapping. Backward mapping shares one problem with forward mapping, however: the mapped coordinates are generally fractional, so the value at each output position must be determined by an interpolation method; OpenCV uses bilinear interpolation by default.

In use, forward mapping is still quite effective for geometric transformations that do not change the image size, while backward mapping is mainly used for rotation and scaling, since these transformations change the size of the image.
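For comparison, here is a backward-mapping sketch under the same assumptions (again my own illustration): each output pixel is traced back through the inverse rotation and sampled from the source with bilinear interpolation, so no holes or overlaps occur. This per-pixel loop is essentially what cv::warpAffine does internally.

#include <opencv2/opencv.hpp>
#include <cmath>

using namespace cv;

int main()
{
    Mat src = imread("input.jpg", IMREAD_GRAYSCALE); // hypothetical input file
    if (src.empty()) return -1;

    double a = 30.0 * CV_PI / 180.0;
    Mat dst = Mat::zeros(src.size(), src.type());

    for (int y = 0; y < dst.rows; ++y)
        for (int x = 0; x < dst.cols; ++x) {
            // backward mapping: (x, y) -> (u, v) via the inverse rotation
            double u =  x * std::cos(a) + y * std::sin(a);
            double v = -x * std::sin(a) + y * std::cos(a);
            int u0 = (int)std::floor(u), v0 = (int)std::floor(v);
            if (u0 < 0 || v0 < 0 || u0 + 1 >= src.cols || v0 + 1 >= src.rows)
                continue; // falls outside the source image
            double du = u - u0, dv = v - v0;
            // bilinear interpolation over the four neighbouring source pixels
            double val = (1 - du) * (1 - dv) * src.at<uchar>(v0,     u0)
                       +      du  * (1 - dv) * src.at<uchar>(v0,     u0 + 1)
                       + (1 - du) *      dv  * src.at<uchar>(v0 + 1, u0)
                       +      du  *      dv  * src.at<uchar>(v0 + 1, u0 + 1);
            dst.at<uchar>(y, x) = saturate_cast<uchar>(val);
        }

    imwrite("backward_bilinear.jpg", dst);
    return 0;
}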

Geometric transformations

First consider the form of the transformation. In this article all geometric transformations of the image use a unified matrix representation, of the following form:

  [x]       [x0]         [a11 a12 a13]
  [y] = M · [y0],   M =  [a21 a22 a23]
  [1]       [ 1]         [ 0   0   1 ]

This is the matrix representation of the forward mapping, where x and y are the coordinates of an output pixel and x0 and y0 are the coordinates of an input pixel. Similarly, the matrix form of the backward mapping is

  [x0]         [x]
  [y0] = M⁻¹ · [y]
  [ 1]         [1]

It can be shown that the matrix of the backward mapping is exactly the inverse of the forward transformation.
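In OpenCV this relationship is available directly: given a 2×3 forward affine matrix, cv::invertAffineTransform computes the 2×3 matrix of the backward mapping. A minimal sketch with an arbitrary example matrix:

#include <opencv2/opencv.hpp>
#include <iostream>

using namespace cv;

int main()
{
    // an arbitrary forward affine matrix: scale by 2, shift by (10, 20)
    Mat fwd = (Mat_<double>(2, 3) << 2, 0, 10,
                                     0, 2, 20);
    Mat bwd;
    invertAffineTransform(fwd, bwd); // backward mapping = inverse of the forward one
    std::cout << "backward matrix:\n" << bwd << std::endl;
    return 0;
}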

Here are a few examples. The original image is shown below:

1. Translate up by one unit and right by one unit

2. Scale up by a factor of two

3. Rotate 45 degrees clockwise

4. Shear (offset) horizontally by 2 units
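These four example transforms can be written as 2×3 affine matrices and applied with cv::warpAffine. The sketch below is my own; the concrete pixel amounts (50-pixel shifts, 0.2 shear factor) are illustrative values, not taken from the original figures:

#include <opencv2/opencv.hpp>

using namespace cv;

int main()
{
    Mat src = imread("input.jpg"); // hypothetical input file
    if (src.empty()) return -1;
    Mat dst;

    // 1. translation: right 50 px, up 50 px (y grows downward in image coordinates)
    Mat t = (Mat_<double>(2, 3) << 1, 0, 50,
                                   0, 1, -50);
    warpAffine(src, dst, t, src.size());
    imwrite("translate.jpg", dst);

    // 2. scale up by a factor of two
    Mat s = (Mat_<double>(2, 3) << 2, 0, 0,
                                   0, 2, 0);
    warpAffine(src, dst, s, Size(src.cols * 2, src.rows * 2));
    imwrite("scale2x.jpg", dst);

    // 3. rotate 45 degrees clockwise about the centre (negative angle = clockwise)
    Mat r = getRotationMatrix2D(Point2f(src.cols / 2.0f, src.rows / 2.0f), -45, 1.0);
    warpAffine(src, dst, r, src.size());
    imwrite("rot45.jpg", dst);

    // 4. horizontal shear: x' = x + 0.2 * y
    Mat sh = (Mat_<double>(2, 3) << 1, 0.2, 0,
                                    0, 1,   0);
    warpAffine(src, dst, sh, Size(src.cols + (int)(0.2 * src.rows), src.rows));
    imwrite("shear.jpg", dst);
    return 0;
}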

Coordinate system conversion

Now consider the second problem: the center of the transformation. Scaling and translation can be performed about the origin of the image coordinate system (the top-left corner of the image); this needs no change of coordinate system and can be computed directly in the general form. Rotation and shear, however, are usually performed about the center of the image, and that requires a coordinate system conversion.

As we know, the origin of the image coordinate system is at the top-left corner of the image, with the X axis pointing right and the Y axis pointing down. The coordinate system familiar from mathematics textbooks has its origin at the image center, with the X axis pointing right and the Y axis pointing up; this is the Cartesian coordinate system. See the figure below:

Therefore, rotation and shear require 3 steps (3 transformations):

  • Convert the input image coordinates to the Cartesian coordinate system;

  • Perform the rotation calculation (the rotation matrix was given earlier);

  • Convert the Cartesian coordinates of the rotated image back to image coordinates.

So what is the conversion between the image coordinate system and the Cartesian coordinate system? First look at the figure below:

In the image, our coordinate system usually runs along AB and AC with origin A, whereas the Cartesian coordinate system runs along DE and DF with origin D.

Let the image be an M×N matrix. For point A, the coordinates in the two systems are (0, 0) and (-N/2, M/2) respectively, so a pixel (x', y') of the image converts to Cartesian coordinates (x, y), with x the column and y the row, as

  x = x' - N/2, y = -y' + M/2

Thus, following the 3 steps (3 transformations) described above, the (clockwise) rotation is composed of 3 matrices, one per transformation:

  [x]             [x']
  [y] = C2·R·C1 · [y']
  [1]             [ 1]

where C1 converts image coordinates to Cartesian coordinates, R is the clockwise rotation matrix, and C2 converts the rotated Cartesian coordinates back to image coordinates using the output image's dimensions; a sketch of this composition in code follows below.
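Here is that composition in code (my own illustration, using assumed dimensions 640×480): the three 3×3 matrices are built and multiplied, and the top two rows of the product form the 2×3 matrix that cv::warpAffine expects. With the output the same size as the input, the result agrees with getRotationMatrix2D(Point2f(N / 2.0f, M / 2.0f), -45, 1.0) up to floating-point rounding.

#include <opencv2/opencv.hpp>
#include <cmath>
#include <iostream>

using namespace cv;

int main()
{
    int N = 640, M = 480;            // assumed image width (cols) and height (rows)
    double a = 45.0 * CV_PI / 180.0; // clockwise rotation angle

    // step 1: image coordinates -> Cartesian coordinates (origin at centre, y up)
    Mat c1 = (Mat_<double>(3, 3) << 1,  0, -N / 2.0,
                                    0, -1,  M / 2.0,
                                    0,  0,  1);
    // step 2: clockwise rotation in Cartesian coordinates
    Mat rot = (Mat_<double>(3, 3) <<  std::cos(a), std::sin(a), 0,
                                     -std::sin(a), std::cos(a), 0,
                                      0,           0,           1);
    // step 3: Cartesian coordinates -> image coordinates of the output (same size here)
    Mat c2 = (Mat_<double>(3, 3) << 1,  0, N / 2.0,
                                    0, -1, M / 2.0,
                                    0,  0, 1);

    Mat full = c2 * rot * c1;                    // complete forward mapping
    Mat affine = full(Range(0, 2), Range(0, 3)); // 2x3 matrix for warpAffine
    std::cout << affine << std::endl;
    return 0;
}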

So much for the theory; next comes the implementation and testing of the code.

1. Rotation

As shown in the figure, ABCD is the rectangle before the transformation and EFGH the rectangle after it. For a clockwise rotation by θ the transformation is

  x = x0·cos θ + y0·sin θ, y = -x0·sin θ + y0·cos θ

 

Therefore, to obtain the size of the rotated image, we only need to compute the length and width of the bounding rectangle of the four transformed vertices of the original image.

Since the transformed image is symmetric about the origin, the length of the transformed rectangle is twice the absolute value of the transformed x-coordinate of point D, and its width is twice the absolute value of the transformed y-coordinate of point A.

Let the original image have length 2a and width 2b, and let the transformed length and width be c and d. Point A has coordinates (-a, b) and point D has coordinates (a, b), so for a clockwise rotation by θ:

  c = 2·|a·cos θ + b·sin θ|, d = 2·|a·sin θ + b·cos θ|

which is exactly what the Rotate() function below computes from src.cols and src.rows.

2. Translation

3. Affine transformation

Code:

#include <QCoreApplication>


#include "stdlib.h"
#include <opencv2/opencv.hpp>
#include <opencv2/core/core.hpp>

//#include "opencv2/core/cuda.hpp"
//#include "opencv2/cudaimgproc.hpp"
#include <time.h>

using namespace cv;
using namespace std;
//namespace GPU = cv::cuda; // requires the CUDA headers commented out above

void Rotate()
{
      Mat src = imread(".//right_0.jpg"); // read the source image
      Mat dst;
      // rotation angle
      double angle = 45.0;

      // compute the size of the output image after rotation
      int rotated_width = ceil(src.rows * fabs(sin(angle * CV_PI / 180)) + src.cols * fabs(cos(angle * CV_PI / 180)));
      int rotated_height = ceil(src.cols * fabs(sin(angle * CV_PI / 180)) + src.rows * fabs(cos(angle * CV_PI / 180)));

      // compute the affine transformation matrix
      Point2f center(src.cols / 2.0f, src.rows / 2.0f);
      Mat rotate_matrix = getRotationMatrix2D(center, angle, 1.0);

      // adjust the translation part of the matrix so the result is not cropped
      rotate_matrix.at<double>(0, 2) += (rotated_width - src.cols) / 2;
      rotate_matrix.at<double>(1, 2) += (rotated_height - src.rows) / 2;

      // apply the affine transformation, filling the border with white
      warpAffine(src, dst, rotate_matrix, Size(rotated_width, rotated_height), INTER_LINEAR, BORDER_CONSTANT, Scalar(255, 255, 255));
      imshow("result", dst);
      cv::imwrite("right.jpg", dst);
      waitKey();
}

void translation()
{
      Mat src = imread(".//right_0.jpg"); // read the source image
      cv::Mat dst;

      cv::Size dst_sz = src.size();

      // define the translation matrix
      cv::Mat t_mat = cv::Mat::zeros(2, 3, CV_32FC1);

      t_mat.at<float>(0, 0) = 1;
      t_mat.at<float>(0, 2) = 100; // horizontal translation
      t_mat.at<float>(1, 1) = 1;
      t_mat.at<float>(1, 2) = 100; // vertical translation

      // apply the affine transformation described by the translation matrix
      cv::warpAffine(src, dst, t_mat, dst_sz);

      // show the translation result
      cv::imshow("image", src);
      cv::imshow("result", dst);

      cv::waitKey(0);

}

void Affinechange()
{
        Mat src = imread(".//right_0.jpg"); // read the source image
        // define three points in the source image and in the destination image
        Point2f srcTri[3];
        Point2f dstTri[3];

        srcTri[0] = Point2f(0, 0);
        srcTri[1] = Point2f(src.cols - 1, 0);
        srcTri[2] = Point2f(0, src.rows - 1);

        dstTri[0] = Point2f(src.cols * 0.0, src.rows * 0.33);
        dstTri[1] = Point2f(src.cols * 0.85, src.rows * 0.25);
        dstTri[2] = Point2f(src.cols * 0.15, src.rows * 0.7);

        Mat dst; // destination image
        // same size and type as the source image, initialized to zero
        dst = Mat::zeros(src.rows, src.cols, src.type());
        // compute the affine transformation matrix from the point pairs
        Mat trans_mat = getAffineTransform(srcTri, dstTri);
        // apply the affine transformation to the source image
        warpAffine(src, dst, trans_mat, src.size());

        // show the results
        imshow("origin_image", src);
        imshow("dst_image", dst);

        // save the image
        imwrite("dst1.jpg", dst);
        waitKey(0);
}
int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    //--------------------------------------------------------------------------
    cout << "No CUDA GPU Found! Compute with CPU" << endl;
    clock_t start_all, end_all;
    start_all = clock();
    //testcpu();
    //Rotate();
    translation();
    //Affinechange();
    end_all = clock();
    printf("Total time is %.8f\n", (double)(end_all - start_all) / CLOCKS_PER_SEC);
    //--------------------------------------------------------------------------
    return a.exec();
}

I hope this helps. If you have any questions, please comment on this blog or send me a private message, and I will reply in my free time.
