OpenCV Text Detection and Recognition Module
This module is part of the extra (contrib) modules and must be downloaded separately.
Download: https://github.com/opencv/opencv_contrib/tree/4.0.0
Documentation:
Text detection https://docs.opencv.org/4.0.0/da/d56/group__text__detect.html
Text recognition https://docs.opencv.org/4.0.0/d8/df2/group__text__recognize.html
Reference samples: https://github.com/opencv/opencv_contrib/blob/master/modules/text/samples/text_recognition_cnn.cpp
https://github.com/opencv/opencv_contrib/blob/master/modules/text/samples/webcam_demo.cpp
Text detection
1.TextDetectorCNN
OpenCV's TextDetectorCNN text detector uses TextBoxes, a fast text detector built on a single deep neural network. Link: https://github.com/MhLiao/TextBoxes
The pretrained files it needs:
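Parameter | Description | Location |
---|---|---|
modelArchFilename | Relative or absolute path to the prototxt file describing the classifier architecture. | textbox.prototxt, found in the downloaded contrib source at opencv_contrib/modules/text/samples/textbox.prototxt |
modelWeightsFilename | Relative or absolute path to the file containing the pretrained model weights in caffe-binary form. | TextBoxes_icdar13.caffemodel http://pan.baidu.com/s/1qY73XHq |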
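Both files in the table have to be readable at the paths passed to TextDetectorCNN::create. The following is a minimal sketch, not part of OpenCV, of checking that up front with the standard library only; the helper name findUnreadableFiles is hypothetical and introduced here for illustration.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not an OpenCV function):
// returns the subset of paths that cannot be opened for reading.
std::vector<std::string> findUnreadableFiles(const std::vector<std::string>& paths) {
    std::vector<std::string> unreadable;
    for (const std::string& path : paths) {
        std::ifstream in(path, std::ios::binary);
        if (!in.is_open()) {
            unreadable.push_back(path);
        }
    }
    return unreadable;
}
```

Calling findUnreadableFiles({"textbox.prototxt", "TextBoxes_icdar13.caffemodel"}) before creating the detector returns an empty vector when both files can be opened from the working directory.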
#include <opencv2/opencv.hpp>
#include <opencv2/text.hpp>
cv::Mat temp;
src.convertTo(temp, CV_8UC3, 1); // src: input image
cv::imshow("src", temp);
cv::Mat dst1 = temp.clone(); // copy to draw the results on
cv::Ptr<cv::text::TextDetectorCNN> detector = cv::text::TextDetectorCNN::create("textbox.prototxt", "TextBoxes_icdar13.caffemodel");
std::vector<cv::Rect> boxes; // detected text regions
std::vector<float> scores; // confidence score of each region
detector->detect(temp, boxes, scores);
float threshold = 0.5f;
for (size_t i = 0; i < boxes.size(); i++)
{
    if (scores[i] > threshold) // draw only the confident detections
    {
        cv::Rect rect = boxes[i];
        cv::rectangle(dst1, rect, cv::Scalar(255, 0, 0), 2);
    }
}
cv::imshow("Text detection result", dst1);
Original image
Result
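The detection snippet only draws the boxes that pass the threshold, while the recognition code further down iterates over a list named textBoxes. One way such a list could be built is sketched below: keep the boxes whose score exceeds the threshold and clip them to the image so that cropping stays inside the image area. This is a sketch under assumptions: Box is a hypothetical stand-in for cv::Rect so the code builds without OpenCV, and clipToImage and selectTextBoxes are illustrative names, not OpenCV functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Rect so the sketch builds without OpenCV.
struct Box {
    int x, y, width, height;
};

// Intersect a box with the image area [0, imageWidth) x [0, imageHeight).
// The result has zero width or height when there is no overlap.
Box clipToImage(const Box& b, int imageWidth, int imageHeight) {
    const int x0 = std::max(b.x, 0);
    const int y0 = std::max(b.y, 0);
    const int x1 = std::min(b.x + b.width, imageWidth);
    const int y1 = std::min(b.y + b.height, imageHeight);
    return Box{x0, y0, std::max(x1 - x0, 0), std::max(y1 - y0, 0)};
}

// Keep the boxes whose score exceeds the threshold, clipped to the image.
// Boxes that fall entirely outside the image are dropped.
std::vector<Box> selectTextBoxes(const std::vector<Box>& boxes,
                                 const std::vector<float>& scores,
                                 float threshold,
                                 int imageWidth, int imageHeight) {
    assert(boxes.size() == scores.size());  // one score per box
    std::vector<Box> kept;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        if (scores[i] <= threshold) {
            continue;
        }
        const Box clipped = clipToImage(boxes[i], imageWidth, imageHeight);
        if (clipped.width > 0 && clipped.height > 0) {
            kept.push_back(clipped);
        }
    }
    return kept;
}
```

With real cv::Rect values the same intersection can be written as boxes[i] & cv::Rect(0, 0, image.cols, image.rows).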
2.erGrouping
Text recognition
1.OCRHolisticWordRecognizer
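The OCRHolisticWordRecognizer class provides word spotting on segmented word images. Given a predefined vocabulary, it uses DictNet to select the most probable word for the input image.
DictNet is described in detail in: Max Jaderberg et al., Reading Text in the Wild with Convolutional Neural Networks, IJCV 2015 http://arxiv.org/abs/1412.1842
Model file downloads: http://nicolaou.homouniversalis.org/assets/vgg_text/dictnet_vgg.caffemodel
http://nicolaou.homouniversalis.org/assets/vgg_text/dictnet_vgg_deploy.prototxt
http://nicolaou.homouniversalis.org/assets/vgg_text/dictnet_vgg_labels.txt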
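The selection step itself, choosing the most probable word from a fixed vocabulary, can be sketched as an argmax over per-word probabilities. The code below illustrates only that idea; it is not OpenCV's implementation, the network that produces the probabilities is not shown, and the name mostProbableWord and its inputs are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Illustrative only: pick the vocabulary entry with the highest probability
// and return it together with that probability.
std::pair<std::string, float> mostProbableWord(
        const std::vector<std::string>& vocabulary,
        const std::vector<float>& probabilities) {
    assert(!vocabulary.empty());
    assert(vocabulary.size() == probabilities.size());  // one probability per word
    const auto best = std::max_element(probabilities.begin(), probabilities.end());
    const std::size_t index =
        static_cast<std::size_t>(std::distance(probabilities.begin(), best));
    return {vocabulary[index], *best};
}
```

Because the output is always one of the vocabulary entries, a word that is absent from the vocabulary cannot be returned by this kind of selection.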
cv::Ptr<cv::text::OCRHolisticWordRecognizer> wordSpotter = cv::text::OCRHolisticWordRecognizer::create("dictnet_vgg_deploy.prototxt", "dictnet_vgg.caffemodel", "dictnet_vgg_labels.txt");
dst1 = src.clone(); // dst1: cv::Mat to draw the results on
// textBoxes: std::vector<cv::Rect> of text regions, e.g. the detections that passed the threshold
for (size_t i = 0; i < textBoxes.size(); i++)
{
    cv::Mat wordImg;
    cv::cvtColor(src(textBoxes[i]), wordImg, cv::COLOR_BGR2GRAY); // crop the region and convert it to grayscale
    std::string word;
    std::vector<float> confs;
    wordSpotter->run(wordImg, word, NULL, NULL, &confs); // recognize the word
    cv::Rect currentBox = textBoxes[i];
    cv::rectangle(dst1, currentBox, cv::Scalar(0, 255, 255), 2, cv::LINE_AA);
    int baseLine = 0;
    cv::Size labelSize = cv::getTextSize(word, cv::FONT_HERSHEY_PLAIN, 1, 1, &baseLine);
    // keep the label inside the image when the box touches the top edge
    int yLeftBottom = currentBox.y > labelSize.height ? currentBox.y : labelSize.height;
    cv::rectangle(dst1, cv::Point(currentBox.x, yLeftBottom - labelSize.height),
        cv::Point(currentBox.x + labelSize.width, yLeftBottom + baseLine), cv::Scalar(255, 255, 255), cv::FILLED);
    cv::putText(dst1, word, cv::Point(currentBox.x, yLeftBottom), cv::FONT_HERSHEY_PLAIN, 1, cv::Scalar(0, 0, 0), 1, cv::LINE_AA);
}
cv::imshow("Text recognition", dst1);
Result