CAPTCHA Recognition

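1. Preface

Because of my job, I inevitably run into CAPTCHAs when doing automated testing. Pausing a test run to type them in by hand is too tedious, so I have summarized the material I found, combined with my own hands-on experience, to share with you.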

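2. The problem addressed

The problem I tackle here is recognizing fairly traditional image CAPTCHAs, like this one:

[Figure: sample CAPTCHA image]

Slider CAPTCHAs and the far nastier "click the images in order" CAPTCHAs are out of scope.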

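3. Approach

I have separate Java and Python implementations. The idea behind both is broadly the same:

① Binarize the image

② Remove noise

③ Recognize the characters

I will walk through the approach using code. The code has been uploaded to GitHub; see the link at the end of the post.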

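4. Java implementation

First, the project layout:

[Figure: project directory tree]

Entrance is the program entry point, DT holds configuration values, and PictureOcr contains the methods used for recognition.

① Noise removal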

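    // Requires: java.awt.Color, java.awt.image.BufferedImage,
    //           java.io.File, java.io.IOException, javax.imageio.ImageIO

    /**
     * Remove noise by binarizing the image in place: every pixel becomes
     * either pure white (background) or pure black (foreground).
     * @param picPath path of the image file; the file is overwritten
     * @throws IOException if the image cannot be read or written
     */
    public static void removeBackground(String picPath) throws IOException {
        BufferedImage bufferedImage = ImageIO.read(new File(picPath));
        int width = bufferedImage.getWidth();
        int height = bufferedImage.getHeight();
        for (int x = 0; x < width; ++x) {
            for (int y = 0; y < height; ++y) {
                if (isWrite(bufferedImage.getRGB(x, y)) == 1) {
                    bufferedImage.setRGB(x, y, Color.white.getRGB());
                } else {
                    bufferedImage.setRGB(x, y, Color.black.getRGB());
                }
            }
        }
        // picType (the output format name) is a field defined elsewhere in the project
        ImageIO.write(bufferedImage, picType, new File(picPath));
    }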

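    /**
     * If the sum of a pixel's red, green and blue components exceeds the
     * configured threshold, the pixel is treated as white, i.e. background.
     * (The name is presumably a misspelling of "isWhite"; it is kept as in the original code.)
     * @param colorInt packed RGB value as returned by BufferedImage.getRGB
     * @return 1 if the pixel should become white, 0 otherwise
     */
    public static int isWrite(int colorInt) {
        Color color = new Color(colorInt);
        if (color.getRed() + color.getGreen() + color.getBlue() > DT.DictOfOcr.threshold) {
            return 1;
        }
        return 0;
    }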

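First get the image's dimensions (width × height), then choose a threshold. The threshold is compared against the sum of a pixel's R, G and B values. You can use a screenshot or color-picker tool to work out a suitable threshold for the CAPTCHA you want to recognize. Taking WeChat as an example, the RGB sum measured in the region to be recognized can be used as the threshold. Every pixel whose sum is above the threshold is set to white, and every other pixel is set to black. This binarizes the image and removes most of the noise in a single pass.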
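The thresholding rule above can be sketched in a few lines of Python. This is only an illustration, not the author's code: the image is represented as a plain list of rows of (r, g, b) tuples so that it runs without an imaging library, and the function name, sample pixels and threshold of 500 are all hypothetical.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def binarize_by_rgb_sum(pixels, threshold):
    """Return a new grid: WHITE where r + g + b > threshold, BLACK elsewhere."""
    return [
        [WHITE if sum(pixel) > threshold else BLACK for pixel in row]
        for row in pixels
    ]


# Hypothetical 2x3 image: light background, dark strokes, light-grey noise.
sample = [
    [(250, 250, 250), (30, 30, 40), (200, 210, 220)],
    [(180, 190, 200), (240, 240, 240), (10, 10, 10)],
]

# 500 is an assumed threshold; in practice it is measured from real samples.
result = binarize_by_rgb_sum(sample, threshold=500)
for row in result:
    print(["W" if p == WHITE else "B" for p in row])
# prints ['W', 'B', 'W'] then ['W', 'W', 'B']
```

The light-grey noise pixels (sums 630 and 570) land on the white side of the threshold and disappear into the background, while the dark strokes stay black.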

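② Cropping the border

Cropping the border removes the uninformative edges, so that the characters' features make up as much of the image as possible. This improves the recognition rate.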

    /**
     * Trim the edges of the image in place.
     * cutWidth and cutHeight (total pixels to remove horizontally and
     * vertically) are fields defined elsewhere in the project.
     * @param picPath path of the image file; the file is overwritten
     * @throws IOException if the image cannot be read or written
     */
    public static void cutPic(String picPath) throws IOException {
        BufferedImage bufferedimage = ImageIO.read(new File(picPath));
        int width = bufferedimage.getWidth();
        int height = bufferedimage.getHeight();

        // First pass: trim half of cutWidth from the left and right edges
        bufferedimage = cropPic(bufferedimage, (cutWidth / 2), 0, (width - cutWidth / 2), height);
        // Second pass: trim half of cutHeight from the top and bottom edges
        bufferedimage = cropPic(bufferedimage, 0, (cutHeight / 2), (width - cutWidth), (height - cutHeight / 2));
        ImageIO.write(bufferedimage, picType, new File(picPath));
    }

    /**
     * Crop the image to the given rectangle.
     * Passing -1 for a coordinate selects the corresponding image edge.
     * @param bufferedImage source image
     * @param startX left edge (inclusive)
     * @param startY top edge (inclusive)
     * @param endX right edge (exclusive)
     * @param endY bottom edge (exclusive)
     * @return a new image containing only the selected region
     */
    public static BufferedImage cropPic(BufferedImage bufferedImage, int startX, int startY, int endX, int endY) {
        int width = bufferedImage.getWidth();
        int height = bufferedImage.getHeight();
        if (startX == -1) {
            startX = 0;
        }
        if (startY == -1) {
            startY = 0;
        }
        if (endX == -1) {
            endX = width - 1;
        }
        if (endY == -1) {
            endY = height - 1;
        }
        // TYPE_INT_BGR is the named constant for the literal 4 used originally
        BufferedImage result = new BufferedImage(endX - startX, endY - startY, BufferedImage.TYPE_INT_BGR);
        for (int x = startX; x < endX; ++x) {
            for (int y = startY; y < endY; ++y) {
                int rgb = bufferedImage.getRGB(x, y);
                result.setRGB(x - startX, y - startY, rgb);
            }
        }
        return result;
    }
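To make the effect of the border trim easier to see, here is a small Python sketch of the same idea. It is a simplified single-pass version under stated assumptions, not a port of the Java above: the image is a row-major list of lists, and the names crop, trim_border, cut_width and cut_height are hypothetical.

```python
def crop(pixels, start_x, start_y, end_x, end_y):
    """Copy the rectangle [start_x, end_x) x [start_y, end_y) from a row-major grid."""
    return [row[start_x:end_x] for row in pixels[start_y:end_y]]


def trim_border(pixels, cut_width, cut_height):
    """Remove cut_width // 2 columns from each side and cut_height // 2 rows
    from the top and bottom."""
    height = len(pixels)
    width = len(pixels[0])
    dx, dy = cut_width // 2, cut_height // 2
    return crop(pixels, dx, dy, width - dx, height - dy)


# A 6x4 grid whose cells encode their own position as y * 10 + x.
grid = [[y * 10 + x for x in range(6)] for y in range(4)]
trimmed = trim_border(grid, cut_width=2, cut_height=2)
print(trimmed)  # [[11, 12, 13, 14], [21, 22, 23, 24]]
```

Only the interior cells survive: one column is dropped on each side and one row at the top and bottom.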

③ Running OCR

    /**
     * Run OCR on the image.
     * @param picPath path of the preprocessed image
     * @return the recognized text
     * @throws TesseractException if recognition fails
     */
    public static String executeOcr(String picPath) throws TesseractException {
        ITesseract iTesseract = new Tesseract();
        iTesseract.setDatapath(tessdataPath);
        // iTesseract.setLanguage("eng");
        // Other trained data sets can be loaded here as needed
        String ocrResult = iTesseract.doOCR(new File(picPath));
        return ocrResult;
    }

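This step uses the tessdata data package.

④ Results

For well-formed CAPTCHAs the recognition rate is quite good, around 80%. I created an image folder under the project's resources path and put some sample images in it, so you can try it yourself.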

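5. Python implementation

The approach is as follows:

Build a dataset of reasonable size (labeled CAPTCHA images), then train a model:

1. Binarize the image

2. Split each image into its characters and save them

3. Extract feature values from the segmented characters

4. Generate the training set

5. Define the classification model

6. Test the classification results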
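Steps 3 to 6 can be illustrated with a very small sketch. Everything in it is an assumption chosen for illustration: the feature (foreground-pixel counts per row and per column), the 1-nearest-neighbour classifier, the function names, and the tiny 3x3 "glyphs". The referenced implementation may well use different features and a different model.

```python
def extract_features(char_img):
    """Feature vector: foreground-pixel count of each row, then of each column.
    char_img is a grid of 0/1 values where 1 marks a foreground pixel."""
    row_counts = [sum(row) for row in char_img]
    col_counts = [sum(col) for col in zip(*char_img)]
    return row_counts + col_counts


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(char_img, training_set):
    """Return the label of the closest training example (1-nearest-neighbour)."""
    features = extract_features(char_img)
    _, label = min(
        (squared_distance(features, train_features), label)
        for train_features, label in training_set
    )
    return label


# Step 4: a toy training set of (features, label) pairs from labeled glyphs.
vertical_bar = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
horizontal_bar = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
training_set = [
    (extract_features(vertical_bar), "I"),
    (extract_features(horizontal_bar), "-"),
]

# Step 6: classify a vertical bar with one stray noise pixel.
noisy_bar = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
predicted = classify(noisy_bar, training_set)
print(predicted)  # I
```

With a real dataset, the training set would hold many labeled examples per character, and the test step would compare predictions against held-out labels.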

def capt_process(capt):
    """
    Image preprocessing: convert the CAPTCHA image to a binary image
    and split it into individual characters.
    :param capt: image
    :return: a list of four elements, each a binary image containing a single character
    """
    # Convert to grayscale
    capt_gray = capt.convert("L")
    # Get the threshold for this image
    threshold = get_threshold(capt_gray)
    # Binarize the image
    table = get_bin_table(threshold)
    capt_bw = capt_gray.point(table, "1")
    capt_per_char_list = []
    for i in range(4):
        x = 5 + i * 15
        y = 2
        capt_per_char = capt_bw.crop((x, y, x + 13, y + 24))
        capt_per_char_list.append(capt_per_char)

    return capt_per_char_list
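The fixed-width split in capt_process is easier to adjust once the crop boxes are written out explicitly. The sketch below only reproduces the arithmetic from the loop (offset 5, step 15, character size 13 x 24); the function name char_boxes and its parameters are hypothetical.

```python
def char_boxes(count=4, left=5, top=2, step=15, char_width=13, char_height=24):
    """Return one (left, upper, right, lower) crop box per character,
    using the same arithmetic as the loop in capt_process."""
    return [
        (left + i * step, top, left + i * step + char_width, top + char_height)
        for i in range(count)
    ]


boxes = char_boxes()
for box in boxes:
    print(box)
# (5, 2, 18, 26)
# (20, 2, 33, 26)
# (35, 2, 48, 26)
# (50, 2, 63, 26)
```

The last box ends at x = 63 and y = 26, so these constants only suit CAPTCHA images at least that large. For a different CAPTCHA layout, the offsets and sizes have to be measured again.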


from collections import defaultdict


def get_threshold(capt):
    """
    Use the pixel value that occurs most often in the image as the threshold.
    :param capt: grayscale image (mode "L")
    :return: the gray level with the highest count
    """
    pixel_dict = defaultdict(int)
    # Image.size is (width, height)
    width, height = capt.size
    for i in range(width):
        for j in range(height):
            # Gray level at this point (a single value, since the image is grayscale)
            pixel = capt.getpixel((i, j))
            # Key: pixel value; value: number of occurrences
            pixel_dict[pixel] += 1
    # Pick the pixel value with the highest count
    threshold = max(pixel_dict, key=pixel_dict.get)
    return threshold


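def get_bin_table(threshold):
    """
    Build the lookup table used to binarize the image around the threshold.
    :param threshold: gray level returned by get_threshold
    :return: a 256-entry list of 0/1 values, indexed by gray level
    """
    table = []
    rate = 0.1
    for i in range(256):
        # Gray levels within +/-10% of the threshold map to 1, all others to 0
        if threshold * (1 - rate) <= i <= threshold * (1 + rate):
            table.append(1)
        else:
            table.append(0)
    return table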
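To see how the threshold and the lookup table work together, here is a self-contained sketch on a tiny grayscale grid. It uses plain lists instead of an image object so that it runs without any imaging library. The helper names and the sample values are hypothetical, and the sketch assumes the most frequent gray level is the background.

```python
from collections import Counter


def most_common_gray(gray):
    """Return the gray level that occurs most often in a grid of 0-255 values."""
    counts = Counter(value for row in gray for value in row)
    return counts.most_common(1)[0][0]


def build_table(threshold, rate=0.1):
    """256-entry lookup table: 1 for levels within +/- rate of threshold, else 0."""
    low, high = threshold * (1 - rate), threshold * (1 + rate)
    return [1 if low <= level <= high else 0 for level in range(256)]


# Hypothetical 2x4 grayscale image: background near 200, dark strokes near 40.
gray = [
    [200, 200, 40, 200],
    [200, 35, 200, 190],
]

threshold = most_common_gray(gray)  # 200, the dominant (background) level
table = build_table(threshold)
binary = [[table[value] for value in row] for row in gray]
print(binary)  # [[1, 1, 0, 1], [1, 0, 1, 1]]
```

Levels close to the dominant gray level (including the slightly darker 190) map to 1, and everything else, here the dark strokes, maps to 0.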


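The code is commented throughout, so I won't explain it in detail.

My Python implementation draws heavily on this blog post:

https://blog.csdn.net/weixin_38641983/article/details/80899354

It explains clearly what each step does and why, so I won't repeat it here. Thanks to its author.

One thing I do want to point out: everyone's training set may be different, and the crop dimensions for each character may differ too, so adjust these to your own situation.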

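6. Conclusion

The methods above can only recognize simple CAPTCHAs, but they give us a general pattern for solving this kind of problem, so we won't be caught off guard the next time something similar comes up.

The code also draws on the following blog posts:

https://segmentfault.com/a/1190000015240294?utm_source=tag-newest

https://blog.csdn.net/problc/article/details/5794460

Thanks again to these authors.

The code for this post has been uploaded to GitHub. Feel free to contact me if you have questions.

https://github.com/Thinker-Mars/Demo/tree/master/picture-ocr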
