PyTesser
PyTesser is an Optical Character Recognition module for Python. It takes as input an image or image file and outputs a string.
PyTesser uses the Tesseract OCR engine, converting images to an accepted format and calling the Tesseract executable as an external script. A Windows executable is provided along with the Python scripts. The scripts should work in other operating systems as well.
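That is the introduction from the official site. Usage is simple: download the archive and extract it, for example to E:\QQDownload\pytesser_v0.0.1
Open a command prompt, cd into that directory, and start python: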
>>> from pytesser import *
>>> image = Image.open('fnord.tif') # Open image object using PIL
>>> print image_to_string(image) # Run tesseract.exe on image
fnord
>>> print image_file_to_string('fnord.tif')
fnord
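I first tried the PNG image that ships with the package, and it was recognized correctly. Then I grabbed a CAPTCHA image from 12306, and it failed outright. Sigh. It really is simple to use, but the recognition rate is terrible...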
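As the introduction says, PyTesser works by calling the Tesseract executable as an external program. For the curious, here is a rough sketch of that idea in Python 3 (the session above is Python 2). It is not PyTesser's actual code: the helper names `build_tesseract_command` and `ocr_file` are made up for illustration, and it assumes a `tesseract` executable is on the PATH and can read the image format directly. The only thing it relies on is the basic command-line form `tesseract <image> <output_base>`, which writes the recognized text to `<output_base>.txt`.

```python
import os
import subprocess
import tempfile


def build_tesseract_command(image_path, output_base, exe="tesseract"):
    """Return the argument list for: tesseract <image> <output_base>.

    Hypothetical helper for illustration, not part of PyTesser.
    """
    return [exe, image_path, output_base]


def ocr_file(image_path, exe="tesseract"):
    """Run Tesseract on an image file and return the recognized text.

    Assumes `exe` is an installed Tesseract executable.
    """
    with tempfile.TemporaryDirectory() as tmp:
        output_base = os.path.join(tmp, "out")
        # Tesseract appends ".txt" to the output base name itself.
        subprocess.run(
            build_tesseract_command(image_path, output_base, exe),
            check=True,  # raise CalledProcessError if Tesseract fails
        )
        with open(output_base + ".txt", encoding="utf-8") as f:
            return f.read().strip()


if __name__ == "__main__":
    # Same sample file as in the session above; needs Tesseract installed.
    print(ocr_file("fnord.tif"))
```

Seen this way, the wrapper adds convenience but no recognition ability of its own, so a noisy CAPTCHA that Tesseract cannot read will fail through the wrapper just the same.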