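Using AI to Extract Text from Images and Recognize Image Content

Extracting text from images

How do you extract the text in an image with Python?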


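1. Using the third-party Python library pytesseract

Install it with pip install pytesseract.

1.1 Install tesseract-ocr

First download the tesseract-ocr build for your operating system from the official site. I am on Windows, so I downloaded the exe installer. Run it and choose an installation directory. Remember this directory, because the code needs it later; mine is D:\ruanjian\Tesseract-OCR.

1.2 Download a trained language pack

I want to extract Chinese text from images here, so I downloaded chi_sim.traineddata and saved it in the tessdata folder inside the tesseract-ocr installation directory from the previous step.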
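
Before running the code, it can help to confirm that the files from steps 1.1 and 1.2 are where the code expects them. The following is a minimal sketch that uses only the standard library. The helper name missing_tesseract_files is hypothetical, and the layout (tesseract.exe and a tessdata folder directly under the installation directory) is assumed from the Windows setup described above.

```python
from pathlib import Path


def missing_tesseract_files(install_dir, lang="chi_sim"):
    """Return the expected files that are absent under install_dir.

    Hypothetical helper: assumes the Windows layout described above, with
    tesseract.exe and a tessdata folder directly inside the install directory.
    """
    root = Path(install_dir)
    expected = [
        root / "tesseract.exe",
        root / "tessdata" / f"{lang}.traineddata",
    ]
    # Keep only the paths that do not exist as regular files.
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # The directory below is the example path from this article; use your own.
    missing = missing_tesseract_files(r"D:\ruanjian\Tesseract-OCR")
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("tesseract.exe and chi_sim.traineddata were found.")
```

An empty list means both files are in place; otherwise the list names what still has to be installed or downloaded.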

1.3 Code

import pytesseract
from PIL import Image

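# Open an image
image = Image.open(r'images\82-望嶽.png')
# Point pytesseract at the tesseract executable installed in step 1.1
pytesseract.pytesseract.tesseract_cmd = r'D:\ruanjian\Tesseract-OCR\tesseract.exe'
# Tell tesseract where the language packs from step 1.2 are stored
tessdata_dir_config = r'--tessdata-dir "D:\ruanjian\Tesseract-OCR\tessdata"'
# Extract Chinese text. For English, download the eng language pack first, then pass lang='eng'.
code = pytesseract.image_to_string(image, lang='chi_sim', config=tessdata_dir_config)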

print(code)

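For example, I want to extract the text from the following image:

Result:

The advantage of this approach is that you can run it as many times as you like once the local environment is set up. The drawback is that, with a single language pack configured as above, it handles mixed-language images poorly: when an image mixes Chinese and English, the extraction quality is not very good.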
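
Tesseract accepts several languages at once when their codes are joined with +, so one thing worth trying for images that mix Chinese and English is lang='chi_sim+eng'. The sketch below assumes that both chi_sim.traineddata and eng.traineddata are present in the tessdata folder. The helper names join_langs and ocr_mixed are hypothetical, and the paths are the example paths from this article. How much this improves the result depends on the image.

```python
def join_langs(*codes):
    """Join Tesseract language codes into the 'a+b' form that lang= accepts."""
    return "+".join(code.strip() for code in codes if code.strip())


def ocr_mixed(image_path, langs=("chi_sim", "eng")):
    """Run OCR with several language packs loaded at the same time."""
    # Imported here so that join_langs can be used without pytesseract installed.
    import pytesseract
    from PIL import Image

    # Example paths from this article; adjust them to your installation.
    pytesseract.pytesseract.tesseract_cmd = r'D:\ruanjian\Tesseract-OCR\tesseract.exe'
    config = r'--tessdata-dir "D:\ruanjian\Tesseract-OCR\tessdata"'
    with Image.open(image_path) as image:
        return pytesseract.image_to_string(
            image, lang=join_langs(*langs), config=config)
```

Calling ocr_mixed(r'images\82-望嶽.png') would then run the same image through both language packs in a single pass.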

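2. Using the Baidu API

First create an application on Baidu AI Cloud (百度智能雲) to obtain an APP_ID, API_KEY, and SECRET_KEY.

Then download the Python SDK and install it with pip install aip-python-sdk-2.2.15.zip.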
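
Once the SDK is installed, its client can send the recognition request for you and manage the access token itself. The sketch below is one way to do that, not the approach the script in this section takes. It assumes the SDK's basicGeneral method, which takes the raw image bytes, and it assumes that a failed call returns a dict containing error_code and error_msg; check both against the SDK version you installed. The helper names extract_words and ocr_with_sdk are hypothetical.

```python
def extract_words(result):
    """Return the recognized lines from a general_basic result dict.

    Raises RuntimeError when the dict looks like an error response
    (assumed to carry 'error_code' and 'error_msg').
    """
    if 'error_code' in result:
        raise RuntimeError('Baidu OCR error %s: %s'
                           % (result['error_code'], result.get('error_msg', '')))
    return [item['words'] for item in result.get('words_result', [])]


def ocr_with_sdk(file_path, app_id, api_key, secret_key):
    """Recognize text in an image file through the SDK client."""
    # Imported here so that extract_words can be used without the SDK installed.
    from aip import AipOcr

    client = AipOcr(app_id, api_key, secret_key)
    with open(file_path, 'rb') as fp:
        # basicGeneral expects the raw bytes; the SDK does the base64 encoding.
        result = client.basicGeneral(fp.read())
    return extract_words(result)
```

The script that follows takes the other route: it requests an access token and calls the REST endpoint directly with requests.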

import base64
import json
import time

import requests
from aip import AipOcr

# Console: https://console.bce.baidu.com/ai/#/ai/ocr/overview/index
# Your APP_ID, API key (AK) and secret key (SK)
APP_ID = 'your APP_ID'
API_KEY = 'your API_KEY'
SECRET_KEY = 'your SECRET_KEY'
# Baidu API client from the SDK (the HTTP calls below do not use it)
CLIENT = AipOcr(APP_ID, API_KEY, SECRET_KEY)
# Request headers
HEADERS = {
    'Content-Type': 'application/x-www-form-urlencoded'
}
# URL for obtaining the access token
URL = 'https://aip.baidubce.com/oauth/2.0/token'
ACCESS_TOKEN = None
# Time at which the current access token was obtained
START_TIME = time.time()


def get_file_content(file_path):
    # Read the file content as raw bytes
    with open(file_path, 'rb') as fp:
        return fp.read()


def get_access_token():
    # Request a new access token and record when it was obtained
    global ACCESS_TOKEN, START_TIME
    response = requests.post(URL,
                             {'grant_type': 'client_credentials', 'client_id': API_KEY, 'client_secret': SECRET_KEY})
    # The response body is JSON, so parse it as JSON rather than as a Python literal
    ACCESS_TOKEN = response.json()['access_token']
    START_TIME = time.time()


def req_url(image):
    # Call the Baidu AI endpoint and return the recognition result as text.
    # The quota is 50,000 calls per day.
    # Fetch a token if there is none yet or the current one is older than 7000 seconds
    if not ACCESS_TOKEN or (time.time() - START_TIME > 7000):
        get_access_token()
    response = requests.post('https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic?access_token=%s' % ACCESS_TOKEN,
                             {'image': image}, headers=HEADERS)
    return response.content.decode('utf-8')


if __name__ == '__main__':
    # Image content
    image = get_file_content(r'image\望嶽.png')
    # Get the recognition result; the image is sent base64-encoded
    ret = req_url(base64.b64encode(image).decode())
    # Convert the JSON string to a dict
    ret = json.loads(ret)
    if 'words_result' in ret:
        for words in ret['words_result']:
            print(words['words'])

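Output:

Use cases

Image recognition like this can be used to sort business cards, pull key information out of images, recognize license plates, and so on.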

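Postscript

To help everyone learn programming more easily, I created a WeChat public account, 【輕鬆學編程】. It has articles that help you pick up programming quickly, some practical material to raise your skill level, and some programming projects suitable for coursework.

You can also add me on WeChat at 1257309054 and I will invite you to the group, where we can all exchange ideas and learn together.
If this article helped you, buy me a coffee!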

