Recognizing CAPTCHA digits with Python

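Problem: recognize the arithmetic expression shown in a CAPTCHA.
The final recognition accuracy is still poor, and I am still looking for better approaches.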

Environment:
1. Ubuntu 16.04
2. Python 3.5.2

I first tried Baidu's OCR (optical character recognition) API, but it could not recognize the digits in the CAPTCHA, so it was not usable here.

1. Install project dependencies

# 1. Install tesseract-ocr
sudo apt-get install tesseract-ocr

# 2. Install pytesseract
sudo apt-get install python3-pip
sudo pip3 install pytesseract

# 3. Install Pillow
sudo pip3 install pillow

# 4. Install numpy
sudo pip3 install numpy

# 5. Install OpenCV
sudo pip3 install opencv-python

# -*- coding: UTF-8 -*-
from PIL import Image
from PIL import ImageFilter
import PIL.ImageOps
import pytesseract



def initTable(threshold=140):
    table = []
    for i in range(256):
        if i < threshold:
            table.append(0)
        else:
            table.append(1)
    return table

im = Image.open('3.png')
# Image preprocessing: grayscale -> binarize -> invert
im = im.convert('L')
binaryImage = im.point(initTable(), '1')
im1 = binaryImage.convert('L')
im2 = PIL.ImageOps.invert(im1)
im2.save('test_1.png')
im3 = im2.convert('1')
im3.save('test_2.png')
im4 = im3.convert('L')
im4.save('test_3.png')
print(pytesseract.image_to_string(im2))
# Crop the region that contains the characters
box = (120, 0, 265, 60)
region = im4.crop(box)
region.save('test_4.png')

# Split the region into first operand (fc), operator (op) and second operand (lc)
fc = region.crop((0, 0, 48, 60))
fc.save('test_fc.png')
op = region.crop((48, 0, 68, 60))
op.save('test_op.png')
region.save('test_region.png')
lc = region.crop((68, 0, 120, 60))
lc.save('test_lc.png')
# Tesseract 3.x takes -psm; Tesseract 4 and later expect --psm
print(pytesseract.image_to_string(fc, config='-psm 6  -c tessedit_char_whitelist="0123456789"'))
print(pytesseract.image_to_string(op, config='-psm 7  -c tessedit_char_whitelist="+-*/%"'))
print(pytesseract.image_to_string(lc, config='-psm 6  -c tessedit_char_whitelist="0123456789"'))

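# Invert the cropped region and try recognizing the whole expression as one line
invImg = PIL.ImageOps.invert(region)
invImg.save('test_5.png')
print(pytesseract.image_to_string(invImg, config='-psm 7  -c tessedit_char_whitelist="0123456789+-*/"'))

# Sharpen the inverted image (saved for inspection only)
invIm2 = invImg.convert('RGB').filter(ImageFilter.SHARPEN)
invIm2.save('test_6.jpg')
# Resize the character region; note that (120, 38) is smaller than the
# 145x60 crop, so pass a larger size to actually enlarge the characters
out = region.resize((120, 38))
out.save('test_7.png')
# out.save('test_6.png')
asd = pytesseract.image_to_string(out)
print(asd)
# print(out.show())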
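
Once the three pieces (first operand, operator, second operand) have been recognized, the CAPTCHA answer still has to be computed. The following is a minimal sketch of that step, not part of the script above. The helper name evaluate_captcha is hypothetical, and treating "/" as integer division is an assumption about how the CAPTCHA defines division. It returns None whenever the OCR output is unusable, which is likely to happen often given the current recognition quality.

```python
import operator

# Hypothetical helper, not part of the original script.
OPERATORS = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.floordiv,  # assumption: the CAPTCHA uses integer division
    '%': operator.mod,
}


def evaluate_captcha(first, op, second):
    """Return the integer result, or None if any OCR piece is unusable."""
    # pytesseract output usually carries whitespace or trailing newlines.
    first, op, second = first.strip(), op.strip(), second.strip()
    if not (first.isdecimal() and second.isdecimal()):
        return None
    if op not in OPERATORS:
        return None
    if op in ('/', '%') and int(second) == 0:
        return None
    return OPERATORS[op](int(first), int(second))


print(evaluate_captcha('12\n', '+', ' 7'))
print(evaluate_captcha('', '+', '7'))
```

In the script above, this could be called with the three strings returned by image_to_string for fc, op and lc.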

Final result:

Miscellaneous

While using Python's Pillow module today:

from PIL import Image
from PIL import ImageFilter
im = Image.open("google-logo.bmp")
out = im.filter(ImageFilter.DETAIL)

This raised an error:
ValueError: cannot filter palette images

print(im.format, im.size, im.mode)

GIF (150, 55) P

The mode P means the image is palette-based.
A search on StackOverflow turned up this question:
http://stackoverflow.com/questions/10323692/cannot-filter-palette-images-error-when-doing-a-imageenhance-sharpness
It’s quite common for algorithms to be unable to work with a palette based image. The convert in the above changes it to have a full RGB value at each pixel location.
In other words, many algorithms cannot operate on palette-based images, and calling convert first gives every pixel a full RGB value.

For this example, the fix is:

out = im.convert('RGB').filter(ImageFilter.DETAIL)
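
The same fix can be wrapped in a small helper so that palette images are converted only when needed. This is a sketch under assumptions: the function name filter_any_mode is made up for illustration, and an in-memory palette image stands in for google-logo.bmp so the snippet runs without any file.

```python
from PIL import Image, ImageFilter


def filter_any_mode(im, image_filter):
    """Apply image_filter, converting palette ('P') images to RGB first."""
    if im.mode == 'P':
        im = im.convert('RGB')
    return im.filter(image_filter)


# An in-memory palette image used as a stand-in for the real file.
palette_im = Image.new('P', (8, 8))
out = filter_any_mode(palette_im, ImageFilter.DETAIL)
print(out.mode, out.size)
```

Images that are already in a mode such as L or RGB pass through unchanged, so the helper can be applied to every input without checking the mode by hand.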

Commonly used filters:
Filter name                     Meaning

ImageFilter.BLUR                Blur
ImageFilter.CONTOUR             Contour
ImageFilter.EDGE_ENHANCE        Edge enhancement
ImageFilter.EDGE_ENHANCE_MORE   Edge enhancement (stronger)
ImageFilter.EMBOSS              Emboss
ImageFilter.FIND_EDGES          Edge detection
ImageFilter.SMOOTH              Smoothing
ImageFilter.SMOOTH_MORE         Smoothing (stronger)
ImageFilter.SHARPEN             Sharpen


im1 = im.filter(ImageFilter.BLUR)
im2 = im.filter(ImageFilter.MinFilter(3))
im3 = im.filter(ImageFilter.MinFilter()) # same as MinFilter(3)
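
To make it clearer what MinFilter(3) does, here is a pure-Python sketch of a 3x3 minimum filter on a grayscale image stored as a list of rows. It only illustrates the idea and is not Pillow's implementation; in particular, the border handling (clamping coordinates to the image edge) is a simplifying assumption. On dark text over a light background, taking the minimum thickens the strokes.

```python
def min_filter(pixels, size=3):
    """Replace each pixel with the minimum of its size x size neighbourhood."""
    height, width = len(pixels), len(pixels[0])
    radius = size // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighbourhood, clamping coordinates at the borders.
            window = [
                pixels[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            row.append(min(window))
        result.append(row)
    return result


# A white 5x5 image (255) with a single black pixel (0) in the centre.
image = [[255] * 5 for _ in range(5)]
image[2][2] = 0
filtered = min_filter(image)
for row in filtered:
    print(row)
```

The single black pixel grows into a 3x3 black square, which is the thickening effect described above.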

Tesseract parameters

root@1fc22f000bc8:/var/www/tmp# tesseract
Usage:
  tesseract --help | --help-psm | --version
  tesseract --list-langs [--tessdata-dir PATH]
  tesseract --print-parameters [options...] [configfile...]
  tesseract imagename|stdin outputbase|stdout [options...] [configfile...]

OCR options:
  --tessdata-dir PATH   Specify the location of tessdata path.
  --user-words PATH     Specify the location of user words file.
  --user-patterns PATH  Specify the location of user patterns file.
  -l LANG[+LANG]        Specify language(s) used for OCR.
  -c VAR=VALUE          Set value for config variables.
                        Multiple -c arguments are allowed.
  -psm NUM              Specify page segmentation mode.
NOTE: These options must occur before any configfile.

Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.

Single options:
  -h, --help            Show this help message.
  --help-psm            Show page segmentation modes.
  -v, --version         Show version information.
  --list-langs          List available languages for tesseract engine.
  --print-parameters    Print tesseract parameters to stdout.
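
The options above are what the config strings passed to pytesseract.image_to_string are built from. As a sketch, a small helper can assemble them; the name build_config and its legacy_flag parameter are hypothetical. The help output above is from Tesseract 3.x, which takes -psm, while Tesseract 4 and later expect --psm, so the flag spelling is made switchable.

```python
def build_config(psm, whitelist=None, legacy_flag=True):
    """Build a config string for pytesseract's config argument.

    legacy_flag=True emits -psm (Tesseract 3.x, as in the help output above);
    legacy_flag=False emits --psm (Tesseract 4 and later).
    """
    flag = '-psm' if legacy_flag else '--psm'
    parts = ['{} {}'.format(flag, psm)]
    if whitelist:
        # Same quoting style as the config strings used earlier in this post.
        parts.append('-c tessedit_char_whitelist="{}"'.format(whitelist))
    return ' '.join(parts)


print(build_config(7, '0123456789+-*/'))
print(build_config(10, '0123456789', legacy_flag=False))
```

For a crop that holds a single character, such as the operator, mode 10 (single character) may be worth trying in place of mode 7.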