C++ OpenCV: table text extraction

Today I looked into locating the text inside a table. In my previous post I already implemented text localization and extraction for my own needs; you can refer to it here: https://blog.csdn.net/m0_37690102/article/details/106532157

Now I need to locate and extract the text inside a table. Applying the original text-extraction method directly picks up a lot of text from outside the table area, so I modified it to fit my needs. For locating and recognizing the table itself, see my post: https://blog.csdn.net/m0_37690102/article/details/106447352

I changed part of the earlier code. The change mainly concerns the mask image: it makes the contours easier to detect, which in turn makes the table easier to locate and extract.

My overall approach:

First locate and extract the table region, then locate and extract the text inside that region. Along the way I added some ideas of my own.

Drawback of this algorithm: detecting the vertical and horizontal lines depends on a threshold that should be computed dynamically, and how exactly to compute it deserves further study.

                           int scale = 64; //<------------------------>  threshold that should be computed dynamically

                           int horizontalsize = horizontal.cols / scale; // kernel size along the horizontal axis

A request from a beginner who is still learning: if you have a better idea or a better way to compute this value, and you don't mind sharing it, I would be glad to learn from it. Thank you very much for your guidance.
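To make the role of `scale` concrete, here is a minimal sketch in plain C++ (no OpenCV needed) of what the erode-then-dilate step does along a single image row. It is not the dynamic threshold asked for above. The helper names `lineKernelLength` and `openRow` and the lower bound `minLen` are my own assumptions for illustration, and border effects are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of the original code): derive the kernel
// length from the image extent and `scale`, never going below minLen.
int lineKernelLength(int extent, int scale, int minLen = 3)
{
    assert(scale > 0 && minLen >= 1);
    return std::max(minLen, extent / scale);
}

// 1-D morphological opening (erode, then dilate) with a flat kernel of
// `len` pixels. On a binary row this keeps exactly the runs of 1s that
// are at least `len` pixels long and clears every shorter run.
std::vector<int> openRow(const std::vector<int>& row, int len)
{
    assert(len >= 1);
    const int n = static_cast<int>(row.size());
    std::vector<int> out(row.size(), 0);
    int runStart = 0;
    for (int i = 0; i <= n; ++i) {
        const bool on = (i < n) && (row[i] != 0);
        if (on) continue;
        // [runStart, i) is a maximal run of 1s (possibly empty).
        if (i - runStart >= len)
            std::fill(out.begin() + runStart, out.begin() + i, 1);
        runStart = i + 1;
    }
    return out;
}
```

For a 640-pixel-wide row, `scale = 64` gives a kernel of 10 pixels, so a 300-pixel table line survives while a 6-pixel text stroke is cleared. With `scale = 200` the kernel shrinks to 3 pixels and the same stroke is kept as if it were a line. That sensitivity is why a fixed `scale` does not suit every image.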

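#include <opencv2/opencv.hpp>
#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using namespace cv;
using namespace std;

vector<Mat> Tableextratcion(Mat src)
{
	// Make sure the image was loaded
	if (src.empty())
	{
		cerr << "Problem loading image!!!" << endl;
		exit(1);
	}
	//    // Show source image
	//    imshow("src", src);

	// Optionally resize for practical reasons
	//Mat rsz;
	//Size size(800, 900);
	//resize(src, rsz, size);
	//resize(src, src, Size(src.cols / 1.5, src.rows / 1.5), 0, 0, INTER_LINEAR);
	//imshow("rsz", rsz);

	// Convert the source image to grayscale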
	Mat gray;

	if (src.channels() == 3)
	{
		cvtColor(src, gray, COLOR_BGR2GRAY);
	}
	else
	{
		gray = src;
	}
	//imshow("gray", gray);// Show gray image

	// Binarize with Otsu's threshold and invert, so that lines and text become white on black
	Mat bw;
	threshold(gray, bw, 0, 255, THRESH_BINARY_INV | THRESH_OTSU);
	//threshold(~gray, bw, 170, 255, CV_THRESH_BINARY);
	//imshow("binary", bw);
	// Create the images used to extract the horizontal and vertical lines
	Mat horizontal = bw.clone();
	Mat vertical = bw.clone();

	int scale = 64; //<------------------------>  threshold that should be computed dynamically

	// Kernel length along the horizontal axis (at least 1, so the kernel is never empty)
	int horizontalsize = std::max(1, horizontal.cols / scale);

	// Structuring element for extracting horizontal lines by morphology
	Mat horizontalStructure = getStructuringElement(MORPH_RECT /* rectangular kernel */, Size(horizontalsize, 1) /* kernel size */);

	// Morphological opening: erode, then dilate
	erode(horizontal, horizontal, horizontalStructure, Point(-1, -1));
	dilate(horizontal, horizontal, horizontalStructure, Point(-1, -1));

	// Show the extracted horizontal lines
	//imshow("horizontal", horizontal);
	// Kernel length along the vertical axis (at least 1)
	int verticalsize = std::max(1, vertical.rows / scale);

	// Structuring element for extracting vertical lines by morphology
	Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1, verticalsize));

	// Morphological opening: erode, then dilate
	erode(vertical, vertical, verticalStructure, Point(-1, -1));
	dilate(vertical, vertical, verticalStructure, Point(-1, -1));

	// Show the extracted vertical lines
	//imshow("vertical", vertical);
	// Build a mask that contains the table
	Mat mask = horizontal + vertical;
	//-------------------------------------------------------------------------------
	// Dilate to strengthen the table border lines
	//-------------------------------------------------------------------------------
	Mat kernel = getStructuringElement(MORPH_RECT, Size(3, 3), Point(-1, -1));
	dilate(mask, mask, kernel);
	//imshow("mask", mask);
	imwrite("./title_time/Mask.jpg",mask);
	//-------------------------------------------------------------------------------
	//-------------------------------------------------------------------------------
	/* Find the joints between the table lines; we use them to tell tables from pictures
	(a table has more than 4 joints, while a picture has only 4, at its corners) */
	Mat joints;
	bitwise_and(horizontal, vertical, joints);
	//imshow("joints", joints);
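	// Find the external contours, which most likely belong to tables or pictures
	vector<Vec4i> hierarchy;
	std::vector<std::vector<cv::Point> > contours;
	cv::findContours(mask,      // input image
		contours,               // detected contours, each stored as a vector of points
		hierarchy,              // optional output with the image topology; one entry per contour
		RETR_EXTERNAL,          // retrieve only the outermost contours
		CHAIN_APPROX_SIMPLE,    // compress horizontal, vertical and diagonal segments, keeping only their end points
		Point(0, 0));
	// contours holds the detected contours;
	// contours_poly holds the polygon approximated from each of them
	vector<vector<Point> > contours_poly(contours.size());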
	vector<Rect> boundRect(contours.size());
	vector<Mat> rois;

	for (size_t i = 0; i < contours.size(); i++)
	{
		// Skip regions whose area is below a threshold: they are stray lines, not tables
		double area = contourArea(contours[i]);

		if (area < 40) // value chosen by trial and error; tune it for your own images
			continue;

		// approxPolyDP approximates the contour with a polygon; true means the polygon is closed
		// boundingRect returns the upright rectangle that contains that polygon
		approxPolyDP(Mat(contours[i]), contours_poly[i], 10, true);
		boundRect[i] = boundingRect(Mat(contours_poly[i])); // upright bounding rectangle

		// Count the joints inside this candidate table
		Mat roi = joints(boundRect[i]);

		vector<vector<Point> > joints_contours;
		findContours(roi, joints_contours, RETR_CCOMP, CHAIN_APPROX_SIMPLE);

		// A region with 4 or fewer joints is not a complete table, so skip it
		if (joints_contours.size() <= 4)
			continue;
		// Save this region
		//rois.push_back(src(boundRect[i]).clone());
		int x0 = boundRect[i].x;
		int y0 = boundRect[i].y;
		int w0 = boundRect[i].width;
		int h0 = boundRect[i].height;
		//rois.push_back(src(boundRect[i]).clone());
		Mat rectangle_roi;
		if (x0 - 10 >= 0 && y0 - 10 >= 0
			&& x0 - 10 + w0 + 20 <= src.cols
			&& y0 - 10 + h0 + 50 <= src.rows)
		{
			// The padded box fits inside the image: crop with a margin
			Rect m_select(x0 - 10, y0 - 10, w0 + 20, h0 + 50);
			// Clone, so the rectangle drawn on src below does not show up in the crop
			rectangle_roi = src(m_select).clone();
		}
		else if (x0 >= 0 && y0 >= 0 && w0 >= 0 && h0 >= 0
			&& x0+w0 <= src.cols
			&& y0+h0 <= src.rows)
		{
			// Otherwise fall back to the unpadded bounding box
			Rect m_select(x0, y0, w0, h0);
			rectangle_roi = src(m_select).clone();
		}
		// Never store an empty image: imwrite would fail on it later
		if (!rectangle_roi.empty())
			rois.push_back(rectangle_roi);
		rectangle(src, Point(x0 - 10, y0 - 10), Point(x0 + w0 + 10, y0 + h0 + 10), Scalar(0, 255, 0), 2, 8);
	}
	for (size_t i = 0; i < rois.size(); ++i)
	{
		// Any further post-processing of the cropped table can be done here
		//imshow("roi" + to_string(i), rois[i]);
		string Img_Name = "./All" + to_string(i) + ".jpg";
		imwrite(Img_Name, rois[i]);
	}
	imshow("contours", src);
	waitKey(0);
	return rois;
}
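The function above only pads a table box when the whole padded box fits inside the image, and otherwise falls back to the unpadded box. One possible alternative, sketched here as an assumption rather than as the method used in this post, is to pad first and then clip to the image bounds, so a table near the edge still keeps whatever margin is available. `Box` is a minimal stand-in for `cv::Rect` so the sketch compiles without OpenCV, and `padAndClip` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>

// Minimal stand-in for cv::Rect so the sketch does not depend on OpenCV.
struct Box { int x, y, width, height; };

// Hypothetical helper: grow `r` by the given margins on each side, then clip
// the result to an image of imgW x imgH pixels. If nothing remains inside
// the image, the returned box has width 0 or height 0.
Box padAndClip(Box r, int left, int top, int right, int bottom,
               int imgW, int imgH)
{
    assert(imgW >= 0 && imgH >= 0);
    const int x0 = std::max(0, r.x - left);
    const int y0 = std::max(0, r.y - top);
    const int x1 = std::min(imgW, r.x + r.width + right);
    const int y1 = std::min(imgH, r.y + r.height + bottom);
    return Box{x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0)};
}
```

The margins used above (`x0 - 10`, `y0 - 10`, `w0 + 20`, `h0 + 50`) correspond to `left = 10`, `top = 10`, `right = 10`, `bottom = 40`. With OpenCV types, the same clipping can be written as the intersection of the padded `Rect` with `Rect(0, 0, src.cols, src.rows)` using `operator&`.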

The next step is to process the extracted table. The Sobel output still contains vertical lines, so I first strengthen it with a 3*3 dilation and then build a structuring element that detects vertical lines, so that once those lines are removed the image contains only the text regions. Why add the 3*3 dilation? Simply to make the white regions in the Sobel output more pronounced. I tested the localization on tables with closed borders from an open-source dataset, and the results are quite good. Here is my code:

Mat Imagepreprocess(Mat gray)
{
	if (!gray.data)
	{
		cerr << "Problem loading image!!!" << endl;
		exit(1);
	}
	// 1. Sobel operator: gradient in the x direction
	Mat sobel;
	Sobel(gray, sobel, CV_8U, 1, 0, 3);

	// 2. Otsu binarization
	Mat binary;
	threshold(sobel, binary, 0, 255, THRESH_OTSU + THRESH_BINARY);
	//--------------------------------------------------------------------------
	Mat kernel_gray = getStructuringElement(MORPH_RECT, Size(3, 3), Point(-1, -1));
	dilate(binary, binary, kernel_gray);
	//--------------------------------------------------------------------------
	Mat vertical_img = binary.clone();
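	int scale = 10; //<----------------------->   this threshold should be computed dynamically
	// Kernel length along the vertical axis (at least 1, so the kernel is never empty)
	int verticalsize = std::max(1, vertical_img.rows / scale);
	// Structuring element for extracting vertical lines by morphology
	Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1, verticalsize));
	// Morphological opening: erode, then dilate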
	erode(vertical_img, vertical_img, verticalStructure, Point(-1, -1));
	dilate(vertical_img, vertical_img, verticalStructure, Point(-1, -1));
	Mat kernel = getStructuringElement(MORPH_RECT, Size(3, 3), Point(-1, -1));
	dilate(vertical_img, vertical_img, kernel);
	//----------------------------------------------------------------------
	//imshow("vertical_img", vertical_img);
	Mat OnlyTextimage = binary - vertical_img;
	//imshow("OnlyTextimage", OnlyTextimage);
	//--------------------------------------------------------------------------
	// 3. Kernels for the dilation and erosion steps
	Mat element_1 = getStructuringElement(MORPH_RECT, Size(30, 9));
	// The kernel height controls how far the dilation spreads between adjacent text lines;
	// for example, 3 separates lines better than 4, but can also cause missed detections
	Mat element_2 = getStructuringElement(MORPH_RECT, Size(24, 4));

	// 4. Dilate once to make the contours stand out
	Mat dilate_1;
	dilate(OnlyTextimage, dilate_1, element_2);

	// 5. Erode once to remove fine details such as table lines; here it removes the vertical lines
	Mat erode_1;
	erode(dilate_1, erode_1, element_1);

	// 6. Dilate again to make the contours more pronounced
	Mat dilate_2;
	dilate(erode_1, dilate_2, element_2);
	//--------------------------------------------------------------------------
	//Mat kernel_dilate = getStructuringElement(MORPH_RECT, Size(20, 10), Point(-1, -1));
	//dilate(dilate_2, dilate_2, kernel_dilate);
	//--------------------------------------------------------------------------
	// 7. Write the intermediate results for inspection
	imwrite("./title_time/output/binary.jpg", binary);
	imwrite("./title_time/output/dilate1.jpg", dilate_1);
	imwrite("./title_time/output/erode1.jpg", erode_1);
	imwrite("./title_time/output/dilate2.jpg", dilate_2);
	imwrite("./title_time/output/vertical_img.jpg", vertical_img);
	imwrite("./title_time/output/OnlyTextimage.jpg", OnlyTextimage);

	return dilate_2;
}
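Steps 4 to 6 above (dilate, erode, dilate) are easier to follow on a single row of pixels. The sketch below is a simplified 1-D illustration in plain C++, not the OpenCV implementation: it uses only the kernel widths 24 and 30 from `element_2` and `element_1`, ignores the kernel heights, and treats pixels outside the row as absent. The names `morphRow` and `mergeTextRow` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// 1-D flat dilation (dilateOp = true) or erosion (dilateOp = false) of a
// binary row with a kernel of k pixels. The anchor sits at k / 2, which is
// where OpenCV places the default (centre) anchor. Pixels outside the row
// are skipped.
std::vector<int> morphRow(const std::vector<int>& row, int k, bool dilateOp)
{
    assert(k >= 1);
    const int n = static_cast<int>(row.size());
    const int anchor = k / 2;
    std::vector<int> out(row.size(), 0);
    for (int i = 0; i < n; ++i) {
        int value = dilateOp ? 0 : 1;
        for (int j = 0; j < k; ++j) {
            const int p = i + j - anchor;
            if (p < 0 || p >= n) continue;
            value = dilateOp ? std::max(value, row[p])
                             : std::min(value, row[p]);
        }
        out[i] = value;
    }
    return out;
}

// Dilate (24), erode (30), dilate (24): the horizontal part of steps 4-6.
std::vector<int> mergeTextRow(const std::vector<int>& row)
{
    const std::vector<int> dilate1 = morphRow(row, 24, true);
    const std::vector<int> erode1 = morphRow(dilate1, 30, false);
    return morphRow(erode1, 24, true);
}
```

On a row with three 8-pixel character strokes separated by 4-pixel gaps and, far away from them, one isolated 3-pixel-wide stroke, the first dilation fuses the characters into one run, the wider erosion removes the isolated stroke completely, and the second dilation restores the merged text run. This is how closely spaced characters end up as a single blob per text line while thin leftovers disappear.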

Results:

Table extraction

One table:

I hope this helps. If you have any questions, please comment on this post or send me a private message; I will reply when I have time.
