Common Data Augmentation Methods: A Summary (Dataset Expansion)

 

Reference: https://mp.weixin.qq.com/s?__biz=MzI5MDUyMDIxNA==&mid=2247497597&idx=3&sn=33d62f4f12bb0985185409b1ec6b7aae&chksm=ec1c1a84db6b939251ca7949fc4d9ad8f2a6135fd52f934ead485c9b6566f4a0ab3e37ada363&mpshare=1&scene=24&srcid=&sharer_sharetime=1591629399131&sharer_shareid=cc5ffb1d306d67c81444a3aa7b0ae74c#rd

Datasets play a vital role in image processing and deep learning. Small-sample image datasets need to be expanded, and existing deep learning algorithms can also be used to augment an existing dataset. Data augmentation is one of the most commonly used techniques in deep learning: it enlarges the training set and makes it as diverse as possible, so that the trained model generalizes better. Common augmentations include horizontal/vertical flipping, rotation, scaling, cropping, shearing, translation, contrast adjustment, color jitter, and noise. In traditional image processing, geometric transformations are the usual way to augment data; common methods include scaling, translation, rotation, and affine transformation.

Forward mapping and backward mapping

1. Forward mapping

A geometric transformation of an image establishes a mapping between the pixels of the source image and the pixels of the transformed image. Through this mapping we can find the transformed coordinates of any pixel of the original image, or, conversely, the position in the original image that corresponds to a pixel of the transformed image.

Expressed as a simple formula:

x = U(u, v),  y = V(u, v)

where x and y are the coordinates of an output pixel, u and v are the coordinates of an input pixel, U and V are the two mapping functions, and f denotes the overall mapping from point (u, v) to (x, y). Note that the mapping can be linear or polynomial.

From this relation, given the coordinates of any pixel in the input image, we can obtain its coordinates after the geometric transformation.

This process of mapping the input to the output is called "forward mapping". In practice, however, forward mapping has the following problems:

  1. Fractional coordinates: for example, (1, 1) may map to (0.5, 0.5), which is not a valid pixel position, so an interpolation algorithm is needed for further processing.

  2. Several input pixels may map to the same position in the output image, while some output positions may receive no input pixel at all, producing regular holes (a black honeycomb pattern).

 

As the figure shows, after a 30-degree rotation two red points are mapped to the same output coordinate, while no point is mapped to the green question mark; the resulting gaps and overlaps produce honeycomb-shaped holes.

2. Backward mapping

To overcome these shortcomings of forward mapping, "backward mapping" is introduced. Its mathematical expression is:

u = U'(x, y),  v = V'(x, y)

Backward mapping is the exact opposite of forward mapping: starting from the coordinates of an output pixel, it computes the corresponding position in the source image. In this way every pixel of the output image finds a source value through the mapping, so the incomplete and overlapping mappings described above cannot occur.

In practice, geometric transformations of images almost always use backward mapping. It does share one problem with forward mapping: the mapped coordinates are generally fractional, so the output value at that position must be determined by interpolation; OpenCV uses bilinear interpolation by default.

Forward mapping is still effective for geometric transformations that do not change the image size; backward mapping is mainly used for rotation and scaling, because those transformations change the size of the image.

Geometric transformations

First, the form of the transformation. In this article all geometric transformations of an image use a unified matrix representation in homogeneous coordinates:

[x, y, 1]^T = M [x0, y0, 1]^T

This is the matrix form of forward mapping, where (x, y) are the coordinates of an output pixel and (x0, y0) are the coordinates of an input pixel. Similarly, the matrix form of backward mapping is:

[x0, y0, 1]^T = M^(-1) [x, y, 1]^T

It can be shown that the matrix of the backward mapping is exactly the inverse of the forward transformation.

Here are a few examples. The original image is shown below:

1. Translate up by one unit and right by one unit

2. Scale to twice the original size

3. Rotate 45 degrees clockwise

4. Shift 2 units horizontally

Coordinate system conversion

Now the second question: the center of the transformation. Scaling and translation can be performed about the origin of the image coordinate system (the top-left corner of the image); this requires no change of coordinate system, and the general form can be applied directly. Rotation and offset, however, usually take the center of the image as the origin, which does require a coordinate-system conversion.

In image coordinates the origin is at the top-left corner of the image, with the X axis pointing right and the Y axis pointing down. The coordinate system familiar from mathematics textbooks has its origin at the image center, the X axis pointing right and the Y axis pointing up; this is the Cartesian coordinate system. See the figure below:

Therefore, rotation and offset require 3 steps (3 transformations):

  • Convert the input image coordinates to the Cartesian coordinate system;

  • Perform the rotation (the rotation matrix was given above);

  • Convert the Cartesian coordinates of the rotated image back to image coordinates.

So what is the conversion between the image coordinate system and the Cartesian coordinate system? See the figure below:

In the image our coordinate axes run along AB and AC with origin A, whereas the Cartesian axes run along DE and DF with origin D.

Let the image be an M×N matrix. The coordinates of point A in the two systems are (0, 0) and (-N/2, M/2) respectively, so an image pixel (x', y') converts to Cartesian coordinates (x, y) (x is the column, y is the row) as:

x = x' - N/2,  y = -y' + M/2

Thus, following the 3 steps (3 transformations) described above, a clockwise rotation is expressed as the product of 3 matrices: the image-to-Cartesian conversion, the rotation matrix itself, and the Cartesian-to-image conversion for the output image.

That covers the theory; next comes the implementation and testing of the code.

1. Rotation

As shown in the figure, ABCD is the rectangle before the transformation and EFGH is the rectangle after it; the transformation is given in matrix form.

To compute the size of the rotated image, we only need the width and height of the bounding rectangle of the four transformed vertices of the original image.

Because the image after the coordinate transformation is symmetric about the origin, the length of the transformed rectangle is twice the absolute value of the transformed x-coordinate of point D, and its width is twice the absolute value of the transformed y-coordinate of point A.

Let the original image have length 2a and width 2b, and the transformed image have length c and width d. Then the coordinates of point A are (-a, b) and the coordinates of point D are (a, b).

2. Translation

3. Affine transformation

Code:

#include <QCoreApplication>


#include <cstdlib>
#include <opencv2/opencv.hpp>
#include <opencv2/core/core.hpp>

//#include "opencv2/core/cuda.hpp"
//#include "opencv2/cudaimgproc.hpp"
#include <ctime>

using namespace cv;
using namespace std;
//namespace GPU = cv::cuda; // requires the CUDA headers above

void Rotate()
{
      Mat src = imread(".//right_0.jpg"); // read the source image
      Mat dst;
      // rotation angle in degrees
      double angle = 45.0;

      // compute the size of the output image after rotation
      int rotated_width = ceil(src.rows * fabs(sin(angle * CV_PI / 180)) + src.cols * fabs(cos(angle * CV_PI / 180)));
      int rotated_height = ceil(src.cols * fabs(sin(angle * CV_PI / 180)) + src.rows * fabs(cos(angle * CV_PI / 180)));

      // compute the affine transformation matrix
      Point2f center(src.cols / 2.0f, src.rows / 2.0f);
      Mat rotate_matrix = getRotationMatrix2D(center, angle, 1.0);

      // adjust the translation part of the matrix so the result is not clipped
      rotate_matrix.at<double>(0, 2) += (rotated_width - src.cols) / 2;
      rotate_matrix.at<double>(1, 2) += (rotated_height - src.rows) / 2;

      // apply the affine transformation
      warpAffine(src, dst, rotate_matrix, Size(rotated_width, rotated_height), INTER_LINEAR, 0, Scalar(255, 255, 255));
      imshow("result", dst);
      cv::imwrite("right.jpg", dst);
      waitKey();
}

void translation()
{
      Mat src = imread(".//right_0.jpg"); // read the source image
      cv::Mat dst;

      cv::Size dst_sz = src.size();

      // define the translation matrix
      cv::Mat t_mat = cv::Mat::zeros(2, 3, CV_32FC1);

      t_mat.at<float>(0, 0) = 1;
      t_mat.at<float>(0, 2) = 100; // horizontal shift
      t_mat.at<float>(1, 1) = 1;
      t_mat.at<float>(1, 2) = 100; // vertical shift

      // apply the affine transformation defined by the translation matrix
      cv::warpAffine(src, dst, t_mat, dst_sz);

      // show the translation result
      cv::imshow("image", src);
      cv::imshow("result", dst);

      cv::waitKey(0);
}

void Affinechange()
{
        Mat src = imread(".//right_0.jpg"); // read the source image
        // define three points in the source image and their target positions
        Point2f srcTri[3];
        Point2f dstTri[3];

        srcTri[0] = Point2f(0, 0);
        srcTri[1] = Point2f(src.cols - 1, 0);
        srcTri[2] = Point2f(0, src.rows - 1);

        dstTri[0] = Point2f(src.cols * 0.0f, src.rows * 0.33f);
        dstTri[1] = Point2f(src.cols * 0.85f, src.rows * 0.25f);
        dstTri[2] = Point2f(src.cols * 0.15f, src.rows * 0.7f);

        Mat dst; // destination image
        // same size and type as the source image, all pixels initialized to 0
        dst = Mat::zeros(src.rows, src.cols, src.type());
        // compute the affine transformation matrix from the point pairs
        Mat trans_mat = getAffineTransform(srcTri, dstTri);
        // apply the affine transformation to the source image
        warpAffine(src, dst, trans_mat, src.size());

        // show the results
        imshow("origin_image", src);
        imshow("dst_image", dst);

        // save the result
        imwrite("dst1.jpg", dst);
        waitKey(0);
}
int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    //--------------------------------------------------------------------------
    cout << "No CUDA GPU Found! Compute with CPU" << endl;
    clock_t start_all, end_all;
    start_all = clock();
    //testcpu();
    //Rotate();
    translation();
    //Affinechange();
    end_all = clock();
    printf("Total time is %.8f\n", (double)(end_all - start_all) / CLOCKS_PER_SEC);
    //--------------------------------------------------------------------------
    return a.exec();
}

I hope this helps. If you have any questions, please comment on this blog or send me a private message, and I will reply in my free time.
