Environment

Python 3.6
PyCharm
opencv-python
Pillow
pytesseract
Tesseract-OCR

For environment setup, see:

Installing and configuring Python and OpenCV
Installing and configuring PIL and Tesseract

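The captcha to be recognized:

Image preprocessing:

1. Convert the captcha image to grayscale (optionally filtering it first)

2. Binarize it (adaptive thresholding can be used; the code below applies a fixed threshold and keeps the adaptive call commented out as an alternative)

3. Apply morphological operations (dilation, erosion, opening, closing, ...)
This yields a fairly clean image with a white background.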
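To make the difference between the two binarization options in step 2 concrete, here is a minimal NumPy sketch. It is not OpenCV's implementation: `fixed_threshold` and `adaptive_mean_threshold` are hypothetical helpers that mimic what `cv.threshold(..., cv.THRESH_BINARY)` and `cv.adaptiveThreshold(..., cv.ADAPTIVE_THRESH_MEAN_C, ...)` compute, and the test image (a brightness gradient with five dark strokes) is synthetic, assumed only for illustration.

```python
import numpy as np


def fixed_threshold(gray, thresh=150):
    """Pixels brighter than `thresh` become 255, all others 0."""
    return np.where(gray > thresh, 255, 0).astype(np.uint8)


def adaptive_mean_threshold(gray, block_size=25, c=10):
    """Compare each pixel with the mean of its block_size x block_size
    neighborhood minus c. block_size must be odd; borders are handled by
    replicating the edge pixels."""
    pad = block_size // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral image with a leading row and column of zeros, so that any
    # box sum can be read off with four lookups.
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = gray.shape
    b = block_size
    box_sum = (integral[b:b + h, b:b + w] - integral[:h, b:b + w]
               - integral[b:b + h, :w] + integral[:h, :w])
    local_mean = box_sum / (b * b)
    return np.where(gray > local_mean - c, 255, 0).astype(np.uint8)


# Synthetic image: background brightens from left to right, strokes are
# 60 gray levels darker than the background behind them.
height, width = 40, 200
background = np.tile(np.linspace(80, 230, width), (height, 1))
stroke_mask = np.zeros((height, width), dtype=bool)
for x in (20, 60, 100, 140, 180):
    stroke_mask[8:32, x:x + 3] = True
gray = np.where(stroke_mask, background - 60, background).astype(np.uint8)

fixed = fixed_threshold(gray)
adaptive = adaptive_mean_threshold(gray)
print("fixed threshold leaves any white on the dark left side:",
      bool(fixed[:, :40].any()))
print("adaptive threshold isolates exactly the strokes:",
      bool(np.array_equal(adaptive == 0, stroke_mask)))
```

With a single global threshold, the dark side of the image collapses to black and the strokes there are lost, whereas the local comparison follows the background. The values 25 and 10 mirror the block size and constant in the commented-out `cv.adaptiveThreshold` call in the code section.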

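Character recognition result:

This approach can recognize images that use regular fonts. I tried some slightly harder captchas and the results were noticeably worse, especially with certain fonts. Handwritten text is also recognized poorly. To handle such images, Tesseract can be retrained on samples of the corresponding kind.
Training Tesseract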
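Before retraining, it may be worth trying Tesseract's own options and a small cleanup of the output. The sketch below is an assumption-laden illustration, not part of the original code: `build_tesseract_config`, `clean_ocr_text`, and `recognize_captcha` are hypothetical helpers. `--psm 7` tells Tesseract to treat the image as a single text line, and `tessedit_char_whitelist` restricts the characters it may output; whether the whitelist is honored depends on the Tesseract version and engine.

```python
import re


def build_tesseract_config(psm=7, whitelist=None):
    """Build the `config` string passed to pytesseract.image_to_string."""
    parts = ["--psm {}".format(psm)]
    if whitelist:
        parts.append("-c tessedit_char_whitelist={}".format(whitelist))
    return " ".join(parts)


def clean_ocr_text(raw, allowed="A-Za-z0-9"):
    """Drop whitespace, form feeds, and any character outside `allowed`."""
    return re.sub("[^{}]".format(allowed), "", raw)


def recognize_captcha(image, psm=7, whitelist=None):
    """Run Tesseract on a PIL image and return the cleaned text."""
    # Imported here so the two helpers above can be used and tested
    # on a machine without Tesseract installed.
    import pytesseract
    raw = pytesseract.image_to_string(
        image, config=build_tesseract_config(psm, whitelist))
    return clean_ocr_text(raw)


print(build_tesseract_config(whitelist="0123456789"))
print(clean_ocr_text("a B1 2\n\x0c"))
```

For a digits-only captcha, a call such as `recognize_captcha(textImage, whitelist="0123456789")` would be one way to use it, where `textImage` is the PIL image produced by the preprocessing code.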

Code:

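import cv2 as cv
from PIL import Image
import pytesseract as tess


def recognize_text(src):
    """Preprocess a BGR captcha image and print the text Tesseract reads from it."""
    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
    cv.imshow("gray", gray)
    # Fixed-threshold binarization
    ret, binary = cv.threshold(gray, 150, 255, cv.THRESH_BINARY)
    # Adaptive binarization (alternative)
    # binary = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY, 25, 10)
    cv.imshow("binary", binary)

    # Morphological processing to filter out noise:
    # dilating the white background removes thin dark specks,
    # eroding it afterwards thickens the remaining dark strokes again
    kernel = cv.getStructuringElement(cv.MORPH_RECT, (3, 1))
    dilate_image = cv.dilate(binary, kernel)
    kernel = cv.getStructuringElement(cv.MORPH_RECT, (3, 3))
    erode_image = cv.erode(dilate_image, kernel)
    cv.imshow("erode_image", erode_image)

    # Convert erode_image to a PIL Image
    text_image = Image.fromarray(erode_image)
    # Recognize the text
    txt = tess.image_to_string(text_image)
    print("Recognition result:", txt)


print("Captcha recognition:")
src = cv.imread("G:\\Python\\verification\\4.png")
# cv.imread returns None instead of raising when the file cannot be read
if src is None:
    raise FileNotFoundError("Could not read the captcha image; check the path")
cv.imshow("src", src)
recognize_text(src)

cv.waitKey(0)
cv.destroyAllWindows()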
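The effect of the two morphology calls in the code above can be checked on a tiny synthetic image. The following is a NumPy sketch of rectangular dilation and erosion, not OpenCV's implementation; `dilate`, `erode`, and `_neighborhood_stack` are hypothetical helpers, and the kernel sizes are given as width and height in the same order as the `ksize` argument of `cv.getStructuringElement`.

```python
import numpy as np


def _neighborhood_stack(img, kw, kh, pad_value):
    """Stack every shifted copy of img covered by a kw-wide, kh-tall rectangle."""
    py, px = kh // 2, kw // 2
    padded = np.pad(img, ((py, py), (px, px)),
                    mode="constant", constant_values=pad_value)
    h, w = img.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(kh) for dx in range(kw)])


def dilate(img, kw, kh):
    """Each pixel becomes the maximum of its neighborhood (white grows)."""
    return _neighborhood_stack(img, kw, kh, 0).max(axis=0)


def erode(img, kw, kh):
    """Each pixel becomes the minimum of its neighborhood (dark grows)."""
    return _neighborhood_stack(img, kw, kh, 255).min(axis=0)


# White background with one 5x5 dark stroke and two small dark specks.
img = np.full((11, 20), 255, dtype=np.uint8)
img[3:8, 4:9] = 0      # stroke: 5 rows tall, 5 columns wide
img[1, 14] = 0         # speck: a single pixel
img[8, 16:18] = 0      # speck: 2 pixels wide

dilated = dilate(img, 3, 1)      # same kernel size as the (3, 1) step
cleaned = erode(dilated, 3, 3)   # same kernel size as the (3, 3) step

print("dark pixels before:", int((img == 0).sum()))
print("dark pixels after: ", int((cleaned == 0).sum()))
```

Dark runs narrower than three pixels horizontally disappear in the dilation, which is why the specks vanish, while the stroke survives and is thickened again by the erosion. Thin strokes in a real captcha can be lost the same way, so the kernel sizes are worth tuning per image set.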
