C++ OpenCV: Table Text Extraction

Today I looked into locating text inside tables. In my previous post I already located and extracted text according to my own requirements; see: https://blog.csdn.net/m0_37690102/article/details/106532157

Now I need to locate and extract the text inside a table. Applying the original text-extraction method directly also picks up a lot of text from outside the table, so I modified it to suit my needs. For locating and recognizing the table itself, see my earlier post: https://blog.csdn.net/m0_37690102/article/details/106447352

I changed part of the earlier code. The change mainly concerns the mask image: it makes the contours easier to detect, which in turn makes it easier to locate and extract the table.

My overall approach:

First locate and extract the table region, then locate and extract the text inside that region. Along the way I added some ideas of my own.

A drawback of this algorithm: detecting the vertical and horizontal lines depends on a threshold (scale below) that really should be computed dynamically, and how best to compute it is still worth investigating.

                           int scale = 64; // <-- threshold that should be computed dynamically

                           int horizontalsize = horizontal.cols / scale; // kernel length along the horizontal axis

A request from a beginner who is still learning: if you have a better idea or a way to compute this value, and you don't mind sharing, I would be glad to learn from it. Thank you very much for your guidance.
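This does not answer the question of how to choose the threshold, but one small guard seems worth sketching. With the plain integer division, an image narrower than scale pixels gives a kernel length of 0, which I would expect getStructuringElement to reject. The helper below is hypothetical (the name lineKernelLength and the minLen parameter are my own assumptions, not part of the original code); it simply keeps the length from dropping below a floor.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of the original post: derive the length of the
// line-detection kernel from an image dimension (cols or rows), but never let
// it fall below minLen. The plain `dimension / scale` becomes 0 whenever the
// dimension is smaller than scale.
int lineKernelLength(int dimension, int scale, int minLen = 3)
{
    assert(dimension > 0 && scale > 0 && minLen > 0);
    return std::max(minLen, dimension / scale);
}
```

It could be used as int horizontalsize = lineKernelLength(horizontal.cols, scale); the value of minLen is itself a guess that would need tuning on real images.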

vector<Mat> Tableextratcion(Mat src)
{
	// Check that the image was loaded correctly
	if (!src.data)
	{
		cerr << "Problem loading image!!!" << endl;
		exit(1);
	}
	//    // Show source image
	//    imshow("src", src);

	// Optionally resize the image for practical reasons
	//Mat rsz;
	//Size size(800, 900);
	//resize(src, rsz, size);
	//resize(src, src, Size(src.cols / 1.5, src.rows / 1.5), 0, 0, INTER_LINEAR);
	//imshow("rsz", rsz);

	// Convert the source image to grayscale
	Mat gray;

	if (src.channels() == 3)
	{
		cvtColor(src, gray, COLOR_BGR2GRAY); // COLOR_BGR2GRAY replaces the legacy CV_BGR2GRAY constant
	}
	else
	{
		gray = src;
	}
	//imshow("gray", gray);// Show gray image

	// Binarize with Otsu's method; THRESH_BINARY_INV turns dark lines and text into white foreground
	Mat bw;
	threshold(gray, bw, 0, 255, THRESH_BINARY_INV | THRESH_OTSU);
	//threshold(~gray, bw, 170, 255, CV_THRESH_BINARY);
	//imshow("binary", bw);
	// Create the images that will be used to extract the horizontal and vertical lines
	Mat horizontal = bw.clone();
	Mat vertical = bw.clone();

	int scale = 64; // <-- threshold that should be computed dynamically

	int horizontalsize = horizontal.cols / scale; // kernel length along the horizontal axis

	// Create the structuring element used to extract horizontal lines via morphological operations
	Mat horizontalStructure = getStructuringElement(MORPH_RECT /* rectangular kernel */, Size(horizontalsize, 1) /* kernel size */);

	// Apply a morphological opening: erode first, then dilate
	erode(horizontal, horizontal, horizontalStructure, Point(-1, -1));
	dilate(horizontal, horizontal, horizontalStructure, Point(-1, -1));

	// Show the extracted horizontal lines
	//imshow("horizontal", horizontal);
	// Kernel length along the vertical axis
	int verticalsize = vertical.rows / scale;

	// Create the structuring element used to extract vertical lines via morphological operations
	Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1, verticalsize));

	// Apply a morphological opening: erode first, then dilate
	erode(vertical, vertical, verticalStructure, Point(-1, -1));
	dilate(vertical, vertical, verticalStructure, Point(-1, -1));

	// Show the extracted vertical lines
	//imshow("vertical", vertical);
	// Create a mask that contains the table
	Mat mask = horizontal + vertical;
	//-------------------------------------------------------------------------------
	// Dilate to strengthen the table border lines
	//-------------------------------------------------------------------------------
	Mat kernel = getStructuringElement(MORPH_RECT, Size(3, 3), Point(-1, -1));
	dilate(mask, mask, kernel);
	//imshow("mask", mask);
	imwrite("./title_time/Mask.jpg",mask);
	//-------------------------------------------------------------------------------
	//-------------------------------------------------------------------------------
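	/* Find the joints between the table lines. We use this information to tell tables apart from pictures
	(a table contains more than 4 joints, whereas a picture contains only 4, i.e. at its corners) */
	Mat joints;
	bitwise_and(horizontal, vertical, joints);
	//imshow("joints", joints);
	// Find the external contours, which most likely belong to tables or pictures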
	vector<Vec4i> hierarchy;
	std::vector<std::vector<cv::Point> > contours;
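	cv::findContours(mask,      // input image
		contours,               // detected contours, each stored as a vector of points
		hierarchy,              // optional output vector with the image topology; one entry per detected contour
		RETR_EXTERNAL,          // retrieval mode: RETR_EXTERNAL returns only the outermost contours (replaces legacy CV_RETR_EXTERNAL)
		CHAIN_APPROX_SIMPLE,    // compress horizontal, vertical and diagonal segments, keeping only their end points
		Point(0, 0));
	// contours now holds all detected contours
	vector<vector<Point> > contours_poly(contours.size()); // polygon approximation of each contour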
	vector<Rect> boundRect(contours.size());
	vector<Mat> rois;

	for (size_t i = 0; i < contours.size(); i++)
	{
		// Compute the contour area; if it is below a threshold, treat it as a stray line rather than a table and skip it
		double area = contourArea(contours[i]);

		if (area < 40) // arbitrary value; tune it by trial and error
			continue;

		// approxPolyDP approximates the contour with a polygon; true means the polygon is closed
		// boundingRect returns the upright rectangle that encloses the given shape
		approxPolyDP(Mat(contours[i]), contours_poly[i], 10, true);
		boundRect[i] = boundingRect(Mat(contours_poly[i])); // upright bounding rectangle

		// Count the joints inside each candidate table
		Mat roi = joints(boundRect[i]);

		vector<vector<Point> > joints_contours;
		findContours(roi, joints_contours, RETR_CCOMP, CHAIN_APPROX_SIMPLE);

		// A table has more than 4 joints; a region with 4 or fewer is not a complete table, so skip it
		if (joints_contours.size() <= 4)
			continue;
		// Save this region
		//rois.push_back(src(boundRect[i]).clone());
		int x0 = 0, y0 = 0, w0 = 0, h0 = 0;
		x0 = boundRect[i].x;
		y0 = boundRect[i].y;
		w0 = boundRect[i].width;
		h0 = boundRect[i].height;
		//rois.push_back(src(boundRect[i]).clone());
		Mat rectangle_roi;
		// Prefer a padded rectangle (10 px left/top/right, 40 px bottom) when it fits inside the image
		if (x0 - 10 >= 0 && y0 - 10 >= 0 && w0 + 20 >= 0 && h0 + 50 >= 0
			&& x0 - 10 + w0 + 20 <= src.cols
			&& y0 - 10 + h0 + 50 <= src.rows)
		{
			Rect m_select(
				(x0 - 10),
				(y0 - 10), // top Y coordinate
				(w0 + 20),
				(h0 + 50)); // height: end Y minus start Y
			rectangle_roi = src(m_select); // crop the table region from the source image
		}
		// Otherwise fall back to the unpadded bounding rectangle
		else if (x0 >= 0 && y0 >= 0 && w0 >= 0 && h0 >= 0
			&& x0+w0 <= src.cols
			&& y0+h0 <= src.rows)
		{
			Rect m_select(
				(x0),
				(y0), // top Y coordinate
				(w0),
				(h0)); // height: end Y minus start Y
			rectangle_roi = src(m_select); // crop the table region from the source image
		}
		// Deep copy: src(m_select) shares pixels with src, so without clone() the rectangle drawn below would appear in the saved ROI
		rois.push_back(rectangle_roi.clone());
		rectangle(src, Point(x0 - 10, y0 - 10), Point(x0 + w0 + 10, y0 + h0 + 10), Scalar(0, 255, 0), 2, 8);
	}
	for (size_t i = 0; i < rois.size(); ++i)
	{
		// At this point you can run any post-processing you need on the data in the rectangle/table.
		std::stringstream ss;
		ss << "roi" << i << "";
		//imshow(ss.str(), rois[i]);
		string Img_Name = ".//All" + to_string(i) + ".jpg";
		imwrite(Img_Name, rois[i]);
		waitKey();
	}
	imshow("contours", src);
	waitKey(0);
	return rois;
}
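The function above uses the padded rectangle only when it fits entirely inside the image, and otherwise drops the padding altogether. A possible alternative, sketched here under my own assumptions, is to pad and then clip to the image bounds, so that tables near the border still get as much margin as is available. Box and padAndClamp are hypothetical names; Box is just a stand-in for cv::Rect so that the sketch compiles without OpenCV. With OpenCV at hand, intersecting the padded cv::Rect with Rect(0, 0, src.cols, src.rows) via the & operator should give the same result.

```cpp
#include <algorithm>
#include <cassert>

// Minimal stand-in for cv::Rect (hypothetical, for illustration only).
struct Box { int x, y, w, h; };

// Grow `b` by the given margins on each side, then clip the result to an
// image of size imgW x imgH. Returns a box with zero width or height if
// nothing of the padded box lies inside the image.
Box padAndClamp(Box b, int left, int top, int right, int bottom, int imgW, int imgH)
{
    assert(imgW >= 0 && imgH >= 0);
    const int x0 = std::max(0, b.x - left);
    const int y0 = std::max(0, b.y - top);
    const int x1 = std::min(imgW, b.x + b.w + right);
    const int y1 = std::min(imgH, b.y + b.h + bottom);
    return Box{x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}
```

Margins of 10, 10, 10 and 40 (left, top, right, bottom) reproduce the x0 - 10, y0 - 10, w0 + 20, h0 + 50 padding used above whenever the table lies well inside the image.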

The next step is to process the extracted table. The image produced by the Sobel operator still contains vertical lines. I first strengthen it with a 3*3 dilation, then build a structuring element that detects vertical lines and remove those lines, so that what remains of the Sobel output contains only the text regions. Why add the 3*3 dilation? Simply to make the white regions in the Sobel output more pronounced. I tested this on tables with closed borders taken from an open-source dataset, and the results are quite good. Here is my code:

Mat Imagepreprocess(Mat gray)
{
	if (!gray.data)
	{
		cerr << "Problem loading image!!!" << endl;
		exit(1);
	}
	// 1. Sobel operator: gradient in the x direction
	Mat sobel;
	Sobel(gray, sobel, CV_8U, 1, 0, 3);

	// 2. Otsu binarization
	Mat binary;
	threshold(sobel, binary, 0, 255, THRESH_OTSU + THRESH_BINARY);
	//--------------------------------------------------------------------------
	Mat kernel_gray = getStructuringElement(MORPH_RECT, Size(3, 3), Point(-1, -1));
	dilate(binary, binary, kernel_gray);
	//--------------------------------------------------------------------------
	Mat vertical_img = binary.clone();
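	int scale = 10; // <-- threshold that should be computed dynamically
	int verticalsize = vertical_img.rows / scale;
	// Create the structuring element used to extract vertical lines via morphological operations
	Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1, verticalsize));
	// Apply a morphological opening: erode first, then dilate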
	erode(vertical_img, vertical_img, verticalStructure, Point(-1, -1));
	dilate(vertical_img, vertical_img, verticalStructure, Point(-1, -1));
	Mat kernel = getStructuringElement(MORPH_RECT, Size(3, 3), Point(-1, -1));
	dilate(vertical_img, vertical_img, kernel);
	//----------------------------------------------------------------------
	//imshow("vertical_img", vertical_img);
	Mat OnlyTextimage = binary - vertical_img;
	//imshow("OnlyTextimage", OnlyTextimage);
	//--------------------------------------------------------------------------
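	// 3. Kernels for the dilation and erosion steps
	Mat element_1 = getStructuringElement(MORPH_RECT, Size(30, 9));
	// The kernel height controls how far text lines above and below merge: e.g. 3 separates adjacent lines better than 4, but can also cause missed detections
	Mat element_2 = getStructuringElement(MORPH_RECT, Size(24, 4));

	// 4. Dilate once to make the contours stand out
	Mat dilate_1;
	dilate(OnlyTextimage, dilate_1, element_2);

	// 5. Erode once to remove fine details, table lines, etc. Here it removes the vertical lines
	Mat erode_1;
	erode(dilate_1, erode_1, element_1);

	// 6. Dilate again to make the contours more pronounced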
	Mat dilate_2;
	dilate(erode_1, dilate_2, element_2);
	//--------------------------------------------------------------------------
	//Mat kernel_dilate = getStructuringElement(MORPH_RECT, Size(20, 10), Point(-1, -1));
	//dilate(dilate_2, dilate_2, kernel_dilate);
	//--------------------------------------------------------------------------
	// 7. Write out the intermediate results for inspection
	imwrite("./title_time/output/binary.jpg", binary);
	imwrite("./title_time/output/dilate1.jpg", dilate_1);
	imwrite("./title_time/output/erode1.jpg", erode_1);
	imwrite("./title_time/output/dilate2.jpg", dilate_2);
	imwrite("./title_time/output/vertical_img.jpg", vertical_img);
	imwrite("./title_time/output/OnlyTextimage.jpg", OnlyTextimage);

	return dilate_2;
}
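To illustrate why opening with a tall, one-pixel-wide kernel and then subtracting leaves only the text, here is a 1-D model of a single image column. It is a simplified sketch, not the OpenCV implementation: the names Column, erodeColumn, dilateColumn and removeLongRuns are made up for this example, the kernel length is assumed to be odd, and pixels outside the column are treated as background, which differs from OpenCV's default border handling. The real function above also dilates the detected lines with a 3*3 kernel before subtracting, which this model leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Column = std::vector<int>; // 1 = white (foreground), 0 = black

// Erode with a flat kernel of odd length k centred on each pixel: a pixel
// stays white only if every pixel under the kernel is white.
Column erodeColumn(const Column& in, int k)
{
    assert(k >= 1 && k % 2 == 1);
    const int n = static_cast<int>(in.size());
    const int half = k / 2;
    Column out(in.size(), 0);
    for (int i = 0; i < n; ++i) {
        bool allSet = true;
        for (int j = i - half; j <= i + half && allSet; ++j)
            allSet = (j >= 0 && j < n && in[j] == 1);
        out[i] = allSet ? 1 : 0;
    }
    return out;
}

// Dilate with the same kernel: a pixel becomes white if any pixel under
// the kernel is white.
Column dilateColumn(const Column& in, int k)
{
    assert(k >= 1 && k % 2 == 1);
    const int n = static_cast<int>(in.size());
    const int half = k / 2;
    Column out(in.size(), 0);
    for (int i = 0; i < n; ++i) {
        bool anySet = false;
        for (int j = i - half; j <= i + half && !anySet; ++j)
            anySet = (j >= 0 && j < n && in[j] == 1);
        out[i] = anySet ? 1 : 0;
    }
    return out;
}

// Opening (erode, then dilate) keeps only white runs at least k pixels long,
// i.e. the "lines". Subtracting the opening from the input leaves the
// shorter runs, i.e. the "text strokes".
Column removeLongRuns(const Column& in, int k)
{
    const Column opened = dilateColumn(erodeColumn(in, k), k);
    Column out(in.size(), 0);
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = (in[i] == 1 && opened[i] == 0) ? 1 : 0;
    return out;
}
```

For the column 0 1 1 0 1 1 1 1 1 0 1 0 and k = 5, the run of five white pixels is recognised as a line and removed, while the runs of length two and one survive. This is also why the choice of scale matters: a kernel that is too short starts deleting tall text strokes, and one that is too long misses short cell borders.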

Experimental results:

Table extraction

one table: 

 

 

I hope this helps. If you have any questions, please comment on this blog or send me a private message. I will reply when I have time.
