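CAPTCHA Preprocessing

Preface

I came across something good today and want to share it, translating it along the way.
GitHub source: https://github.com/Vykstorm/CaptchaDL
Kaggle notebook: https://www.kaggle.com/vykstorm/extracting-words-from-images-with-opencv-part-2

It is about preprocessing CAPTCHA images, and what makes it worthwhile for me is the character segmentation part. A sample CAPTCHA:

Simple tricks cannot segment this kind of CAPTCHA, but the author managed it with OpenCV, and the segmentation results are quite good.

The Jupyter notebook and code can be downloaded from Kaggle. Create a local Python virtual environment (conda recommended), install the packages listed in requeriments.txt from the GitHub repo, and you can test it locally. (When you meet something new, don't rush to understand the theory. First use someone else's work to satisfy your curiosity. For example, run all of his code once and see the final result. That gives you more motivation for the step-by-step analysis later: you know you can reach the finish line too, so a tiring journey doesn't matter.)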

Translation of the Jupyter notebook

Importing the libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import cv2 as cv
import pickle
import warnings
from itertools import product, repeat, permutations, combinations_with_replacement, chain
from math import floor, ceil

warnings.filterwarnings('ignore')

%matplotlib inline

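Loading the dataset

If the Kaggle download is slow, you can get the samples here: https://download.csdn.net/download/Qwertyuiop2016/12575444 (no points required)
The notebook loads the data from the Kaggle site. I wanted to test locally, so I downloaded the CAPTCHA images and load them from disk.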

import numpy as np
import os
import random
from PIL import Image

img_dir = 'I:/samples/samples' # path to the CAPTCHA images
os.chdir(img_dir)
width, height, img_num = 200, 50, 1000
# Sampling like this lets you pick a count and shuffle the order; it doesn't matter here because no model is trained
imgs = random.sample(os.listdir(), img_num) 
X = np.zeros((len(imgs), height, width, 1), dtype = np.uint8)  # this shape just matches TensorFlow's image input format
for index, img_name in enumerate(imgs):
    img = Image.open(img_name)
    img_gray = img.convert('L')  # convert to grayscale
    pix = np.array(img_gray)
    pix = pix.reshape((height, width, 1))  # reshape (height, width) to (height, width, 1)
    X[index] = pix

Displaying the grayscale image

img = X[1][:,:,0]
plt.imshow(img, cmap='gray')

Inverting black and white

inverted = 255 - img
plt.imshow(inverted, cmap='gray');

Binarizing the image

ret, thresholded = cv.threshold(inverted, 140, 255, cv.THRESH_BINARY)
plt.imshow(thresholded, cmap='gray');

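inverted is the array holding the inverted grayscale image.
140 is the binarization threshold. It can be obtained with the iterative method or with Otsu's algorithm; see my earlier blog post "CAPTCHA Binarization".
255 is the maximum pixel value.
cv.THRESH_BINARY sets pixels greater than the threshold to 255 (the value of the third argument) and all other pixels to 0, which is what is usually meant by binarization.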
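
To make the rule concrete, here is a minimal NumPy sketch of what cv.THRESH_BINARY does to a uint8 image. The helper name binary_threshold and the sample array are made up for illustration; they are not part of the notebook.

```python
import numpy as np

def binary_threshold(gray, thresh, maxval=255):
    """Pixels strictly greater than thresh become maxval, everything else becomes 0."""
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)

# Hypothetical 2x3 grayscale patch around the threshold value 140
sample = np.array([[0, 139, 140],
                   [141, 200, 255]], dtype=np.uint8)
result = binary_threshold(sample, 140)
print(result)
```

Note that a pixel equal to the threshold (140 here) ends up as 0, because the comparison is strict.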

You can compute the threshold with your own implementation of the algorithm, but OpenCV also has Otsu's algorithm built in:

ret2,th2 = cv.threshold(inverted,0,255,cv.THRESH_BINARY+cv.THRESH_OTSU)
plt.imshow(th2, cmap='gray')
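
For readers curious about what Otsu's method computes, the following is a rough NumPy sketch, not OpenCV's implementation: it picks the threshold that maximizes the between-class variance of the histogram. The function name otsu_threshold and the synthetic bimodal data are my own assumptions, and on ties the value may differ from the ret2 returned by cv.threshold.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the level t that best separates pixels <= t from pixels > t."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                 # weight of the class "<= t"
    w1 = 1.0 - w0                        # weight of the class "> t"
    cum_mean = np.cumsum(prob * levels)  # cumulative first moment
    total_mean = cum_mean[-1]
    between = np.zeros(256)
    valid = (w0 > 1e-12) & (w1 > 1e-12)  # skip thresholds that leave one class empty
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(between))

# Synthetic data: half the pixels are dark (50), half are bright (200)
bimodal = np.array([50] * 100 + [200] * 100, dtype=np.uint8)
t = otsu_threshold(bimodal)
print(t)
```

Any threshold between the two peaks separates this synthetic data equally well, so the sketch simply returns the first such value.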

Simple removal of noise dots and interference lines with a median filter

blurred = cv.medianBlur(thresholded, 3)
plt.imshow(blurred, cmap='gray')

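The second argument is the size of the filter window. It must be an odd number greater than 1. For CAPTCHAs it is usually 3 or 5; a larger value tends to wipe out the features of the characters.
With a value of 3:

With a value of 5:

The author on Kaggle chose 3, but to my eye 5 looks better. After the next step, though, the two results are almost the same, so this step feels incidental rather than important.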
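
As an illustration of why a median filter removes isolated dots, here is a small NumPy sketch of a 3x3 median filter. It is a slow teaching version written for this post, with edge replication chosen for the border; the function name median_filter3 and the test image are hypothetical.

```python
import numpy as np

def median_filter3(img):
    """3x3 median filter; borders are handled by replicating edge pixels."""
    padded = np.pad(img, 1, mode='edge')
    h, w = img.shape
    # Stack the 9 shifted copies of the image, then take the median over the stack
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    return np.median(windows, axis=0).astype(img.dtype)

noisy = np.zeros((7, 9), dtype=np.uint8)
noisy[1:5, 1:5] = 255   # a solid 4x4 block, standing in for part of a character
noisy[5, 7] = 255       # one isolated noise pixel
cleaned = median_filter3(noisy)
print(cleaned)
```

The isolated pixel disappears because only 1 of the 9 values in its window is white, while the inside of the block survives. The corners of the block get rounded off, which hints at why a large window damages thin character strokes.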

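Removing noise dots and interference lines with morphological operations

Morphological operations: erosion, dilation, opening, closing, and so on.

First, apply an opening: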

kernel = np.array([
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]).astype(np.uint8)

ex = cv.morphologyEx(blurred, cv.MORPH_OPEN, kernel)
plt.imshow(ex, cmap='gray');

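cv.MORPH_OPEN means opening (erosion followed by dilation). I couldn't find any material on how to choose the kernel. I tried an all-zero kernel and the result was about the same, and an all-one kernel looked the same too. I then tried 3x3 kernels (all zeros, all ones, or only the center set to 1) and still couldn't see much difference. I hope someone who knows can explain.
Result image: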
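
On the general meaning of the kernel: an opening keeps only the regions where the kernel's shape fits entirely inside the foreground. With a vertical 5-pixel line, anything shorter than 5 pixels vertically (for example a thin horizontal line) is erased, while tall strokes survive. The NumPy sketch below shows this on a tiny synthetic image. It is my own simplified erosion and dilation for symmetric kernels, not OpenCV's code, and all names in it are made up.

```python
import numpy as np

def _shifted_views(img, kernel, pad_value):
    """One shifted copy of img for every 1 in the kernel (kernel anchored at its center)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), constant_values=pad_value)
    h, w = img.shape
    return [padded[dy:dy + h, dx:dx + w]
            for dy in range(kh) for dx in range(kw) if kernel[dy, dx]]

def erode(img, kernel):
    # A pixel stays 1 only if every kernel position lands on a 1
    return np.min(_shifted_views(img, kernel, 1), axis=0)

def dilate(img, kernel):
    # A pixel becomes 1 if any kernel position lands on a 1 (valid for symmetric kernels)
    return np.max(_shifted_views(img, kernel, 0), axis=0)

def opening(img, kernel):
    return dilate(erode(img, kernel), kernel)

vertical_kernel = np.zeros((5, 5), dtype=np.uint8)
vertical_kernel[:, 2] = 1      # same shape as the kernel in the article

demo = np.zeros((9, 9), dtype=np.uint8)
demo[1:8, 2] = 1               # a 7-pixel-tall vertical stroke
demo[4, :] = 1                 # a 1-pixel-thick horizontal interference line
opened = opening(demo, vertical_kernel)
print(opened)
```

After the opening only the vertical stroke remains; the horizontal line is gone because a 5-pixel vertical segment never fits inside it.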

Next, dilate the result of the previous step:

kernel2 = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]).astype(np.uint8)

ex2 = cv.morphologyEx(ex, cv.MORPH_DILATE, kernel2)
plt.imshow(ex2, cmap='gray');

AND the blurred image with the dilated image

AND operation: a bitwise AND is applied to each pixel value of the two images (grayscale or color): 1&1=1, 1&0=0, 0&1=0, 0&0=0

mask = ex2
processed = cv.bitwise_and(mask, blurred)
plt.imshow(processed, cmap='gray')
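
For binary images that contain only 0 and 255, the bitwise AND keeps a pixel white only where both the mask and the image are white. A tiny NumPy sketch with made-up arrays (np.bitwise_and is used here in place of cv.bitwise_and):

```python
import numpy as np

# Hypothetical 2x4 binary images
blurred_demo = np.array([[255, 255, 0, 255],
                         [0, 255, 255, 0]], dtype=np.uint8)
mask_demo = np.array([[255, 0, 0, 255],
                      [255, 255, 0, 0]], dtype=np.uint8)

# 255 & 255 = 255, anything & 0 = 0
combined = np.bitwise_and(mask_demo, blurred_demo)
print(combined)
```

So the dilated image acts as a mask: it can only remove pixels from blurred, never add pixels that blurred did not already have.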

Finding contours

contours, hierachy = cv.findContours(processed, cv.RETR_CCOMP, cv.CHAIN_APPROX_SIMPLE)
contours = [contours[k] for k in range(0, len(contours)) if hierachy[0, k, 3] == -1]
contours.sort(key=lambda cnt: cv.boundingRect(cnt)[0])
plt.imshow(cv.drawContours(cv.cvtColor(img, cv.COLOR_GRAY2RGB), contours, -1, (255, 0, 0), 1, cv.LINE_4));
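
The list comprehension above relies on the layout of the hierarchy array: each contour has a row [next, previous, first_child, parent], and a parent of -1 means the contour is not nested inside another one. The sketch below replays that filter and the left-to-right sort on a hand-written hierarchy. The hierarchy values and bounding boxes are invented for illustration; no OpenCV call is made.

```python
import numpy as np

# Hypothetical hierarchy for 4 contours, shaped (1, N, 4) like OpenCV's output.
# Columns: [next, previous, first_child, parent]
hierarchy = np.array([[
    [2, -1, 1, -1],    # contour 0: outer contour, has hole 1
    [-1, -1, -1, 0],   # contour 1: hole inside contour 0
    [-1, 0, 3, -1],    # contour 2: outer contour, has hole 3
    [-1, -1, -1, 2],   # contour 3: hole inside contour 2
]])

# Keep only contours without a parent (the outer boundaries)
outer_indices = [k for k in range(hierarchy.shape[1]) if hierarchy[0, k, 3] == -1]

# Made-up bounding boxes (x, y, w, h) for the outer contours
boxes = {0: (120, 5, 30, 40), 2: (10, 8, 90, 35)}

# Sort left to right by the x coordinate of the bounding box
ordered = sorted(outer_indices, key=lambda k: boxes[k][0])
print(outer_indices, ordered)
```

Dropping the holes matters for characters such as "o" or "8", whose inner outlines would otherwise be counted as separate contours.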

Drawing a bounding rectangle for each contour

contour_bboxes = [cv.boundingRect(contour) for contour in contours]
img_bboxes = cv.cvtColor(img, cv.COLOR_GRAY2RGB)
for bbox in contour_bboxes:
    # w, h are used so the dataset's width/height variables are not overwritten
    left, top, w, h = bbox
    img_bboxes = cv.rectangle(img_bboxes,
                              (left, top), (left + w, top + h),
                              (0, 255, 0), 1)
plt.imshow(img_bboxes, cmap='gray');

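Two boxes are drawn because the previous step found two contours. The contours list has one element per contour, so there are len(contours) boxes.

Training a classifier to predict how many characters each box contains

Features: bounding-box width, bounding-box height, contour area, extent (contour area / (box height * box width)), and contour perimeter.

We train a classifier that predicts the number of characters in a box from the five features above. The author used an SVC classifier. However, I couldn't find the training code, only an already trained classifier.
Extracting the features: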

contours_features = pd.DataFrame.from_dict({
	'bbox_width': [bbox[2] for bbox in contour_bboxes],
	'bbox_height': [bbox[3] for bbox in contour_bboxes],
	'area': [cv.contourArea(cnt) for cnt in contours],
	'extent': [cv.contourArea(cnt) / (bbox[2] * bbox[3]) for cnt, bbox in zip(contours, contour_bboxes)],
	'perimeter': [cv.arcLength(cnt, True) for cnt in contours]
})
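
To show what these features measure, here is a NumPy sketch that computes area, perimeter and extent for a simple polygon. The area uses the shoelace (Green's) formula and the perimeter sums the edge lengths of the closed outline. The triangle, the helper names and the simplified bounding box (max minus min, without the extra pixel that pixel-based boxes may include) are assumptions made for this example.

```python
import numpy as np

def polygon_area(points):
    """Area of a closed polygon via the shoelace formula."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def closed_perimeter(points):
    """Sum of edge lengths, including the edge from the last point back to the first."""
    diffs = np.roll(points, -1, axis=0) - points
    return float(np.sqrt((diffs ** 2).sum(axis=1)).sum())

# Hypothetical contour: a right triangle with legs of 30 and 40 pixels
triangle = np.array([[0, 0], [30, 0], [0, 40]], dtype=np.float64)

area = polygon_area(triangle)
perimeter = closed_perimeter(triangle)
bbox_w = triangle[:, 0].max() - triangle[:, 0].min()
bbox_h = triangle[:, 1].max() - triangle[:, 1].min()
extent = area / (bbox_w * bbox_h)   # fraction of the bounding box covered by the contour
print(area, perimeter, extent)
```

An extent near 1 means the contour almost fills its box; a low extent suggests a sparse or diagonal shape.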

Load the trained classifier: https://github.com/Vykstorm/CaptchaDL/blob/master/models/.contour-classifier

with open('I:/contour-classifier', 'rb') as file:
    contour_classifier = pickle.load(file)

Standardize the data (this weakens the influence that features with very large values have on the result):
https://github.com/Vykstorm/CaptchaDL/blob/master/models/.contour-classifier-preprocessor

with open('I:/contour-classifier-preprocessor', 'rb') as file:
	contour_features_scaler = pickle.load(file)
contour_features = contour_features_scaler.transform(contours_features[['bbox_width', 'bbox_height', 'area', 'extent', 'perimeter']])
# Resulting contour_features:
#array([[ 2.1661931 ,  1.40786863,  2.87483795,  0.11734141,  1.81692393],
#       [-0.62741894, -0.37829382, -0.6341117 ,  0.72145767, -0.6275891 ]])
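
I have not inspected the pickled preprocessor, so the following is only an assumption about what it does: a standard z-score scaling, where each feature has the training mean subtracted and is divided by the training standard deviation. The sketch uses made-up numbers and a hypothetical standardize helper.

```python
import numpy as np

def standardize(features, mean, std):
    """z-score scaling: (x - mean) / std, applied column by column."""
    return (features - mean) / std

# Hypothetical training data with two features: [bbox_width, bbox_height]
train = np.array([[20.0, 30.0],
                  [40.0, 34.0],
                  [60.0, 38.0]])
mean, std = train.mean(axis=0), train.std(axis=0)

scaled = standardize(train, mean, std)
print(scaled)
```

After scaling, every column has mean 0 and standard deviation 1, so a feature measured in hundreds of pixels (such as area) no longer dominates one measured in fractions (such as extent).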

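Prediction:

contour_num_chars = contour_classifier.predict(contour_features)
# array([4, 1], dtype=uint8)

This matches what we see with our own eyes: four characters in the first box and one in the second.

I won't explain the remaining steps. They split each box that contains several characters into equal-width pieces and then pad every resulting character to the same size.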
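
For readers who want a feel for those last steps, here is a minimal NumPy sketch of the idea: cut a box into equal-width slices according to the predicted character count, then center each slice on a fixed-size canvas. This is my own illustration, not the notebook's code; the function names, the 40x40 target size and the dummy box are all assumptions.

```python
import numpy as np

def split_equal(box_img, num_chars):
    """Cut a box into num_chars vertical slices of (almost) equal width."""
    return np.array_split(box_img, num_chars, axis=1)

def pad_to_size(char_img, out_h, out_w):
    """Center a character on a black canvas of out_h x out_w (assumes it fits)."""
    h, w = char_img.shape
    canvas = np.zeros((out_h, out_w), dtype=char_img.dtype)
    top, left = (out_h - h) // 2, (out_w - w) // 2
    canvas[top:top + h, left:left + w] = char_img
    return canvas

# Dummy 30x90 box that the classifier says contains 4 characters
box = np.full((30, 90), 255, dtype=np.uint8)
parts = split_equal(box, 4)
chars = [pad_to_size(p, 40, 40) for p in parts]
print([p.shape for p in parts], [c.shape for c in chars])
```

Equal-width splitting is crude when characters have very different widths, but it needs nothing beyond the character count predicted in the previous step.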
