API Description

Last updated: 2020-04-06

General text recognition

Recognizes all of the text in an image submitted to the service.

from aip import AipOcr

# Fill in the credentials of your application
APP_ID = ''
API_KEY = ''
SECRET_KEY = ''
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call general text recognition with a local image
client.basicGeneral(image)

# Optional parameters, if any
options = {}
options["language_type"] = "CHN_ENG"
options["detect_direction"] = "true"
options["detect_language"] = "true"
options["probability"] = "true"

# Call general text recognition with options, using a local image
client.basicGeneral(image, options)

url = "https://www.x.com/sample.jpg"

# Call general text recognition with a remote image URL
client.basicGeneralUrl(url)

# Optional parameters, if any
options = {}
options["language_type"] = "CHN_ENG"
options["detect_direction"] = "true"
options["detect_language"] = "true"
options["probability"] = "true"

# Call general text recognition with options, using a remote image URL
client.basicGeneralUrl(url, options)
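
The option values in the calls above are the strings "true" and "false" rather than Python booleans. The following is a minimal sketch of a helper that builds such a dictionary; `to_api_options` is a hypothetical name used for illustration and is not part of the SDK.

```python
def to_api_options(**kwargs):
    """Build an options dict whose values are all strings.

    Hypothetical helper, not part of the SDK: booleans become
    "true"/"false", every other value is passed through str().
    """
    options = {}
    for name, value in kwargs.items():
        if isinstance(value, bool):
            options[name] = "true" if value else "false"
        else:
            options[name] = str(value)
    return options


options = to_api_options(
    language_type="CHN_ENG",
    detect_direction=True,
    probability=False,
)
print(options)
```

The resulting dictionary can be passed as the second argument of calls such as `client.basicGeneral(image, options)`.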

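General text recognition request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
url	Yes	string			Full URL of the image, at most 1024 bytes long. The image at the URL must not exceed 4M after base64 encoding; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp. The url field is ignored when the image field is present
language_type	No	string	CHN_ENG ENG POR FRE GER ITA SPA RUS JAP KOR	CHN_ENG	Language to recognize; defaults to CHN_ENG. Allowed values: - CHN_ENG: mixed Chinese and English; - ENG: English; - POR: Portuguese; - FRE: French; - GER: German; - ITA: Italian; - SPA: Spanish; - RUS: Russian; - JAP: Japanese; - KOR: Korean
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
detect_language	No	string	true false	false	Whether to detect the language; defaults to no detection. Currently supported: Chinese, English, Japanese, Korean
probability	No	string	true false		Whether to return the confidence of each line in the recognition result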
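
The size limit in the table applies to the base64-encoded image, not to the raw file. A minimal sketch of a pre-flight check follows. It assumes that "4M" means 4 * 1024 * 1024 bytes, which the table does not spell out, and `base64_size_ok` is a hypothetical helper name.

```python
import base64

# Assumption: "4M" is read as 4 MiB; the table does not define the unit precisely.
MAX_BASE64_BYTES = 4 * 1024 * 1024


def base64_size_ok(raw, limit=MAX_BASE64_BYTES):
    """Return True if the base64-encoded form of the raw bytes fits within limit."""
    return len(base64.b64encode(raw)) <= limit


print(base64_size_ok(b"tiny image bytes"))
```

Base64 expands data by roughly one third, so under this assumption a raw file of about 3 MiB already reaches the limit.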

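General text recognition response fields

Field	Required	Type	Description
direction	No	number	Image orientation; present when detect_direction=true. - -1: undefined, - 0: upright, - 1: rotated 90 degrees counterclockwise, - 2: rotated 180 degrees counterclockwise, - 3: rotated 270 degrees counterclockwise
log_id	Yes	number	Unique log id, used for troubleshooting
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
words_result	Yes	array	Array of location and recognition results
+words	No	string	Recognized text string
probability	No	object	Line confidence information; returned when the input parameter probability = true
+average	No	number	Mean line confidence
+variance	No	number	Variance of line confidence
+min	No	number	Minimum line confidence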
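
The direction field is a small integer code rather than an angle. The sketch below converts it to degrees of counterclockwise rotation using the mapping given in the table; `rotation_degrees` is a hypothetical helper name, and the response is assumed to be an already-parsed Python dict.

```python
# Mapping taken from the direction field description above.
DIRECTION_TO_DEGREES = {0: 0, 1: 90, 2: 180, 3: 270}


def rotation_degrees(response):
    """Return the counterclockwise rotation in degrees.

    Returns None when direction is absent (detect_direction was not
    requested) or is -1 (undefined).
    """
    return DIRECTION_TO_DEGREES.get(response.get("direction"))


print(rotation_degrees({"log_id": 1, "direction": 1}))
```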

General text recognition response example

{
"log_id": 2471272194,
"words_result_num": 2,
"words_result":
    [
        {"words": " TSINGTAO"},
        {"words": "青島睥酒"}
    ]
}
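
A minimal sketch of pulling the recognized lines out of a response shaped like the example above. It assumes the response has already been parsed into a Python dict; `extract_lines` is a hypothetical helper name.

```python
def extract_lines(response, strip=True):
    """Return the recognized text of each entry in words_result, in order."""
    lines = []
    for item in response.get("words_result", []):
        text = item.get("words", "")
        lines.append(text.strip() if strip else text)
    return lines


# Sample data copied from the response example above.
sample = {
    "log_id": 2471272194,
    "words_result_num": 2,
    "words_result": [
        {"words": " TSINGTAO"},
        {"words": "青島睥酒"},
    ],
}
print(extract_lines(sample))
```

Stripping is optional because the sample shows that a line can carry leading whitespace.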

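General text recognition (high-accuracy version)

Recognizes all of the text in an image submitted to the service. Compared with general text recognition, this product is more accurate but takes slightly longer.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call general text recognition (high-accuracy version)
client.basicAccurate(image)

# Optional parameters, if any
options = {}
options["detect_direction"] = "true"
options["probability"] = "true"

# Call general text recognition (high-accuracy version) with options
client.basicAccurate(image, options)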

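General text recognition (high-accuracy version) request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
probability	No	string	true false		Whether to return the confidence of each line in the recognition result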

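General text recognition (high-accuracy version) response fields

Field	Required	Type	Description
direction	No	number	Image orientation; present when detect_direction=true. - -1: undefined, - 0: upright, - 1: rotated 90 degrees counterclockwise, - 2: rotated 180 degrees counterclockwise, - 3: rotated 270 degrees counterclockwise
log_id	Yes	number	Unique log id, used for troubleshooting
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
words_result	Yes	array	Array of location and recognition results
+words	No	string	Recognized text string
probability	No	object	Line confidence information; returned when the input parameter probability = true
+average	No	number	Mean line confidence
+variance	No	number	Variance of line confidence
+min	No	number	Minimum line confidence

General text recognition (high-accuracy version) response example

See the general text recognition response example.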

General text recognition (with location information)

Recognizes all of the text in an image submitted to the service and returns the location of the text in the image.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call general text recognition (with location information) with a local image
client.general(image)

# Optional parameters, if any
options = {}
options["recognize_granularity"] = "big"
options["language_type"] = "CHN_ENG"
options["detect_direction"] = "true"
options["detect_language"] = "true"
options["vertexes_location"] = "true"
options["probability"] = "true"

# Call general text recognition (with location information) with options, using a local image
client.general(image, options)

url = "https://www.x.com/sample.jpg"

# Call general text recognition (with location information) with a remote image URL
client.generalUrl(url)

# Optional parameters, if any
options = {}
options["recognize_granularity"] = "big"
options["language_type"] = "CHN_ENG"
options["detect_direction"] = "true"
options["detect_language"] = "true"
options["vertexes_location"] = "true"
options["probability"] = "true"

# Call general text recognition (with location information) with options, using a remote image URL
client.generalUrl(url, options)

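General text recognition (with location information) request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
url	Yes	string			Full URL of the image, at most 1024 bytes long. The image at the URL must not exceed 4M after base64 encoding; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp. The url field is ignored when the image field is present
recognize_granularity	No	string	big - do not locate individual characters small - locate individual characters	small	Whether to locate individual characters. big: do not locate individual characters (default); small: locate individual characters
language_type	No	string	CHN_ENG ENG POR FRE GER ITA SPA RUS JAP KOR	CHN_ENG	Language to recognize; defaults to CHN_ENG. Allowed values: - CHN_ENG: mixed Chinese and English; - ENG: English; - POR: Portuguese; - FRE: French; - GER: German; - ITA: Italian; - SPA: Spanish; - RUS: Russian; - JAP: Japanese; - KOR: Korean
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
detect_language	No	string	true false	false	Whether to detect the language; defaults to no detection. Currently supported: Chinese, English, Japanese, Korean
vertexes_location	No	string	true false	false	Whether to return the vertices of the polygon enclosing the text; not supported for individual characters. Defaults to false
probability	No	string	true false		Whether to return the confidence of each line in the recognition result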

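General text recognition (with location information) response fields

Field	Required	Type	Description
direction	No	number	Image orientation; present when detect_direction=true. - -1: undefined, - 0: upright, - 1: rotated 90 degrees counterclockwise, - 2: rotated 180 degrees counterclockwise, - 3: rotated 270 degrees counterclockwise
log_id	Yes	number	Unique log id, used for troubleshooting
words_result	Yes	array	Array of location and recognition results
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
+vertexes_location	No	array	Currently four vertices: top-left, top-right, bottom-right, bottom-left. Present when vertexes_location=true
++x	Yes	number	Horizontal coordinate (origin at the top-left corner)
++y	Yes	number	Vertical coordinate (origin at the top-left corner)
+location	Yes	array	Location array (origin at the top-left corner)
++left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
++top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
++width	Yes	number	Width of the bounding rectangle
++height	Yes	number	Height of the bounding rectangle
+words	No	string	Recognized text string
+chars	No	array	Per-character results; present when recognize_granularity=small
++location	Yes	array	Location array (origin at the top-left corner)
+++left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
+++top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
+++width	Yes	number	Width of the bounding rectangle
+++height	Yes	number	Height of the bounding rectangle
++char	Yes	string	Recognized character
probability	No	object	Line confidence information; returned when the input parameter probability = true
+ average	No	number	Mean line confidence
+ variance	No	number	Variance of line confidence
+ min	No	number	Minimum line confidence

General text recognition (with location information) response example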

{
"log_id": 3523983603,
"direction": 0, // present when detect_direction=true
"words_result_num": 2,
"words_result": [
    {
        "location": {
            "left": 35,
            "top": 53,
            "width": 193,
            "height": 109
        },
        "words": "感動",
        "chars": [    // present when recognize_granularity=small
            {
                "location": {
                    "left": 56,
                    "top": 65,
                    "width": 69,
                    "height": 88
                },
                "char": "感"
            },
            {
                "location": {
                    "left": 140,
                    "top": 65,
                    "width": 70,
                    "height": 88
                },
                "char": "動"
            }
        ]
    }
    ...
]
}
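
Each location is given as left/top/width/height with the origin at the top-left corner. The sketch below converts these into (x1, y1, x2, y2) corner boxes, which many imaging libraries expect. The helper names are hypothetical, and the sample data is copied from the example above without its comments and ellipsis.

```python
def location_to_box(location):
    """Convert a left/top/width/height dict into an (x1, y1, x2, y2) tuple."""
    left, top = location["left"], location["top"]
    return (left, top, left + location["width"], top + location["height"])


def char_boxes(response):
    """Return (char, box) pairs for every character in every recognized line."""
    pairs = []
    for line in response.get("words_result", []):
        # chars is only present when recognize_granularity=small
        for ch in line.get("chars", []):
            pairs.append((ch["char"], location_to_box(ch["location"])))
    return pairs


sample = {
    "words_result": [
        {
            "location": {"left": 35, "top": 53, "width": 193, "height": 109},
            "words": "感動",
            "chars": [
                {"location": {"left": 56, "top": 65, "width": 69, "height": 88}, "char": "感"},
                {"location": {"left": 140, "top": 65, "width": 70, "height": 88}, "char": "動"},
            ],
        }
    ]
}
print(location_to_box(sample["words_result"][0]["location"]))
print(char_boxes(sample))
```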

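General text recognition (with location, high-accuracy version)

Recognizes all of the text in an image submitted to the service and returns the coordinates of the text in the image. Compared with general text recognition (with location information), this product is more accurate but takes slightly longer.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call general text recognition (with location, high-accuracy version)
client.accurate(image)

# Optional parameters, if any
options = {}
options["recognize_granularity"] = "big"
options["detect_direction"] = "true"
options["vertexes_location"] = "true"
options["probability"] = "true"

# Call general text recognition (with location, high-accuracy version) with options
client.accurate(image, options)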

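General text recognition (with location, high-accuracy version) request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
recognize_granularity	No	string	big - do not locate individual characters small - locate individual characters	small	Whether to locate individual characters. big: do not locate individual characters (default); small: locate individual characters
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
vertexes_location	No	string	true false	false	Whether to return the vertices of the polygon enclosing the text; not supported for individual characters. Defaults to false
probability	No	string	true false		Whether to return the confidence of each line in the recognition result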

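General text recognition (with location, high-accuracy version) response fields

Field	Required	Type	Description
direction	No	number	Image orientation; present when detect_direction=true. - -1: undefined, - 0: upright, - 1: rotated 90 degrees counterclockwise, - 2: rotated 180 degrees counterclockwise, - 3: rotated 270 degrees counterclockwise
log_id	Yes	number	Unique log id, used for troubleshooting
words_result	Yes	array	Array of location and recognition results
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
+vertexes_location	No	array	Currently four vertices: top-left, top-right, bottom-right, bottom-left. Present when vertexes_location=true
++x	Yes	number	Horizontal coordinate (origin at the top-left corner)
++y	Yes	number	Vertical coordinate (origin at the top-left corner)
+location	Yes	array	Location array (origin at the top-left corner)
++left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
++top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
++width	Yes	number	Width of the bounding rectangle
++height	Yes	number	Height of the bounding rectangle
+words	No	string	Recognized text string
+chars	No	array	Per-character results; present when recognize_granularity=small
++location	Yes	array	Location array (origin at the top-left corner)
+++left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
+++top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
+++width	Yes	number	Width of the bounding rectangle
+++height	Yes	number	Height of the bounding rectangle
++char	Yes	string	Recognized character
probability	No	object	Line confidence information; returned when the input parameter probability = true
+ average	No	number	Mean line confidence
+ variance	No	number	Variance of line confidence
+ min	No	number	Minimum line confidence

General text recognition (with location, high-accuracy version) response example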

{
"log_id": 3523983603,
"direction": 0, // present when detect_direction=true
"words_result_num": 2,
"words_result": [
    {
        "location": {
            "left": 35,
            "top": 53,
            "width": 193,
            "height": 109
        },
        "words": "感動",
        "chars": [    // present when recognize_granularity=small
            {
                "location": {
                    "left": 56,
                    "top": 65,
                    "width": 69,
                    "height": 88
                },
                "char": "感"
            },
            {
                "location": {
                    "left": 140,
                    "top": 65,
                    "width": 70,
                    "height": 88
                },
                "char": "動"
            }
        ]
    }
    ...
]
}

General text recognition (with rare characters)

In some scenarios the Chinese text in an image contains rare characters as well as common ones. To recognize text in such images, use general text recognition (with rare characters).

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call general text recognition (with rare characters) with a local image
client.enhancedGeneral(image)

# Optional parameters, if any
options = {}
options["language_type"] = "CHN_ENG"
options["detect_direction"] = "true"
options["detect_language"] = "true"
options["probability"] = "true"

# Call general text recognition (with rare characters) with options, using a local image
client.enhancedGeneral(image, options)

url = "https://www.x.com/sample.jpg"

# Call general text recognition (with rare characters) with a remote image URL
client.enhancedGeneralUrl(url)

# Optional parameters, if any
options = {}
options["language_type"] = "CHN_ENG"
options["detect_direction"] = "true"
options["detect_language"] = "true"
options["probability"] = "true"

# Call general text recognition (with rare characters) with options, using a remote image URL
client.enhancedGeneralUrl(url, options)

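General text recognition (with rare characters) request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
url	Yes	string			Full URL of the image, at most 1024 bytes long. The image at the URL must not exceed 4M after base64 encoding; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp. The url field is ignored when the image field is present
language_type	No	string	CHN_ENG ENG POR FRE GER ITA SPA RUS JAP KOR	CHN_ENG	Language to recognize; defaults to CHN_ENG. Allowed values: - CHN_ENG: mixed Chinese and English; - ENG: English; - POR: Portuguese; - FRE: French; - GER: German; - ITA: Italian; - SPA: Spanish; - RUS: Russian; - JAP: Japanese; - KOR: Korean
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
detect_language	No	string	true false	false	Whether to detect the language; defaults to no detection. Currently supported: Chinese, English, Japanese, Korean
probability	No	string	true false		Whether to return the confidence of each line in the recognition result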

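General text recognition (with rare characters) response fields

Field	Required	Type	Description
direction	No	int32	Image orientation; present when detect_direction=true. - -1: undefined, - 0: upright, - 1: rotated 90 degrees counterclockwise, - 2: rotated 180 degrees counterclockwise, - 3: rotated 270 degrees counterclockwise
log_id	Yes	uint64	Unique log id, used for troubleshooting
words_result	Yes	array()	Array of recognition results
words_result_num	Yes	uint32	Number of recognition results, i.e. the number of elements in words_result
+words	No	string	Recognized text string
probability	No	object	Confidence of each line in the recognition result, including average: mean line confidence, variance: variance of line confidence, min: minimum line confidence
+ average	No	number	Mean line confidence
+ variance	No	number	Variance of line confidence
+ min	No	number	Minimum line confidence

General text recognition (with rare characters) response example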

{
"log_id": 2471272194,
"words_result_num": 2,
"words_result":
    [
        {"words": " TSINGTAO"},
        {"words": "青島睥酒"}
    ]
}

Web image text recognition

Recognizes text in web images that have complex backgrounds or special fonts.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call web image text recognition with a local image
client.webImage(image)

# Optional parameters, if any
options = {}
options["detect_direction"] = "true"
options["detect_language"] = "true"

# Call web image text recognition with options, using a local image
client.webImage(image, options)

url = "https://www.x.com/sample.jpg"

# Call web image text recognition with a remote image URL
client.webImageUrl(url)

# Optional parameters, if any
options = {}
options["detect_direction"] = "true"
options["detect_language"] = "true"

# Call web image text recognition with options, using a remote image URL
client.webImageUrl(url, options)

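Web image text recognition request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
url	Yes	string			Full URL of the image, at most 1024 bytes long. The image at the URL must not exceed 4M after base64 encoding; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp. The url field is ignored when the image field is present
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
detect_language	No	string	true false	false	Whether to detect the language; defaults to no detection. Currently supported: Chinese, English, Japanese, Korean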

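Web image text recognition response fields

Field	Required	Type	Description
direction	No	number	Image orientation; present when detect_direction=true. - -1: undefined, - 0: upright, - 1: rotated 90 degrees counterclockwise, - 2: rotated 180 degrees counterclockwise, - 3: rotated 270 degrees counterclockwise
log_id	Yes	number	Unique log id, used for troubleshooting
words_result	Yes	array()	Array of recognition results
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
+words	No	string	Recognized text string
probability	No	object	Confidence of each line in the recognition result, including average: mean line confidence, variance: variance of line confidence, min: minimum line confidence
+ average	No	number	Mean line confidence
+ variance	No	number	Variance of line confidence
+ min	No	number	Minimum line confidence

Web image text recognition response example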

{
"log_id": 2471272194,
"words_result_num": 2,
"words_result":
    [
        {"words": " TSINGTAO"},
        {"words": "青島睥酒"}
    ]
}

ID card recognition

Recognizes an ID card submitted to the service; both the front and the back are supported.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')
idCardSide = "back"

# Call ID card recognition
client.idcard(image, idCardSide)

# Optional parameters, if any
options = {}
options["detect_direction"] = "true"
options["detect_risk"] = "false"

# Call ID card recognition with options
client.idcard(image, idCardSide, options)

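ID card recognition request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
id_card_side	Yes	string	front - the side with the photo back - the side with the national emblem		front: the side of the ID card with the photo; back: the side of the ID card with the national emblem
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
detect_risk	No	string	true - enabled false - disabled		Whether to enable ID card risk type detection (photocopy, temporary ID card, re-photographed ID card, altered ID card); disabled by default, i.e. false. Allowed values: true - enabled; false - disabled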

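ID card recognition response fields

Field	Required	Type	Description
direction	No	number	Image orientation; present when detect_direction=true. - -1: undefined, - 0: upright, - 1: rotated 90 degrees counterclockwise, - 2: rotated 180 degrees counterclockwise, - 3: rotated 270 degrees counterclockwise
image_status	Yes	string	normal - recognized normally reversed_side - ID card not positioned correctly non_idcard - the uploaded image contains no ID card blurred - the ID card is blurred over_exposure - key fields on the ID card are affected by glare or overexposure unknown - unknown status
risk_type	No	string	Returned when the input parameter detect_risk = true; identifies the ID card type: normal - normal ID card; copy - photocopy; temporary - temporary ID card; screen - re-photographed; unknow - other unknown cases
edit_tool	No	string	Returned when the parameter detect_risk = true. If the ID card image is detected as edited, this field gives the name of the editing software, e.g. Adobe Photoshop CC 2014 (Macintosh); if the image was not edited, the field is absent from the response
log_id	Yes	number	Unique log id, used for troubleshooting
words_result	Yes	array(object)	Array of location and recognition results
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
+location	Yes	array(object)	Location array (origin at the top-left corner)
++left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
++top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
++width	Yes	number	Width of the bounding rectangle
++height	Yes	number	Height of the bounding rectangle
+words	No	string	Recognized text string

ID card recognition response example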

{
    "log_id": 2648325511,
    "direction": 0,
    "image_status": "normal",
    "idcard_type": "normal",
    "edit_tool": "Adobe Photoshop CS3 Windows",
    "words_result": {
        "住址": {
            "location": {
                "left": 267,
                "top": 453,
                "width": 459,
                "height": 99
            },
            "words": "南京市江寧區弘景大道3889號"
        },
        "公民身份號碼": {
            "location": {
                "left": 443,
                "top": 681,
                "width": 589,
                "height": 45
            },
            "words": "330881199904173914"
        },
        "出生": {
            "location": {
                "left": 270,
                "top": 355,
                "width": 357,
                "height": 45
            },
            "words": "19990417"
        },
        "姓名": {
            "location": {
                "left": 267,
                "top": 176,
                "width": 152,
                "height": 50
            },
            "words": "伍雲龍"
        },
        "性別": {
            "location": {
                "left": 269,
                "top": 262,
                "width": 33,
                "height": 52
            },
            "words": "男"
        },
        "民族": {
            "location": {
                "left": 492,
                "top": 279,
                "width": 30,
                "height": 37
            },
            "words": "漢"
        }
    },
    "words_result_num": 6
}
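
In the example above, words_result is keyed by field name rather than being a list. A minimal sketch of flattening it into a simple name-to-text mapping follows; `idcard_fields` is a hypothetical helper name, and the sample is an abbreviated copy of the example.

```python
def idcard_fields(response):
    """Map each recognized field name to its text, dropping location data."""
    return {
        name: field.get("words", "")
        for name, field in response.get("words_result", {}).items()
    }


# Abbreviated copy of the response example above.
sample = {
    "log_id": 2648325511,
    "image_status": "normal",
    "words_result": {
        "姓名": {
            "location": {"left": 267, "top": 176, "width": 152, "height": 50},
            "words": "伍雲龍",
        },
        "出生": {
            "location": {"left": 270, "top": 355, "width": 357, "height": 45},
            "words": "19990417",
        },
    },
    "words_result_num": 2,
}
print(idcard_fields(sample))
```

It is usually worth checking image_status before trusting the extracted fields, since the table lists several non-normal states.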

Bank card recognition

Recognizes a bank card and returns the card number and the issuing bank.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call bank card recognition
client.bankcard(image)

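Bank card recognition request parameters

Parameter	Required	Type	Description
image	Yes	string	Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp

Bank card recognition response fields

Parameter	Type	Required	Description
log_id	number	Yes	Request identifier; random and unique.
result	object	Yes	Returned result
+bank_card_number	string	Yes	Bank card number
+bank_name	string	Yes	Bank name; empty if it cannot be recognized
+bank_card_type	number	Yes	Bank card type. 0: cannot be recognized; 1: debit card; 2: credit card

Bank card recognition response example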

{
    "log_id": 1447188951,
    "result": {
        "bank_card_number": "622500000000000",
        "bank_name": "招商銀行",
        "bank_card_type": 1
    }
}
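
The bank_card_type field is a numeric code. A minimal sketch of turning a response like the one above into readable values follows; the English labels and the helper name `describe_bank_card` are illustrative choices, not part of the API.

```python
# Labels follow the bank_card_type description in the response fields table.
BANK_CARD_TYPES = {0: "unrecognized", 1: "debit card", 2: "credit card"}


def describe_bank_card(response):
    """Return (card number, bank name, card type label) from a response dict."""
    result = response.get("result", {})
    card_type = BANK_CARD_TYPES.get(result.get("bank_card_type"), "unrecognized")
    return (result.get("bank_card_number", ""), result.get("bank_name", ""), card_type)


# Sample data copied from the response example above.
sample = {
    "log_id": 1447188951,
    "result": {
        "bank_card_number": "622500000000000",
        "bank_name": "招商銀行",
        "bank_card_type": 1,
    },
}
print(describe_bank_card(sample))
```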

Driver's license recognition

Recognizes all key fields on a motor vehicle driver's license.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call driver's license recognition
client.drivingLicense(image)

# Optional parameters, if any
options = {}
options["detect_direction"] = "true"

# Call driver's license recognition with options
client.drivingLicense(image, options)

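Driver's license recognition request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.

Driver's license recognition response fields

Field	Required	Type	Description
log_id	Yes	number	Unique log id, used for troubleshooting
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
words_result	Yes	array(object)	Array of recognition results
+words	No	string	Recognized text string

Driver's license recognition response example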

{
  "errno": 0,
  "msg": "success",
  "data": {
    "words_result_num": 10,
    "words_result": {
      "證號": {
        "words": "3208231999053090"
      },
      "有效期限": {
        "words": "6年"
      },
      "準駕車型": {
        "words": "B2"
      },
      "有效起始日期": {
        "words": "20101125"
      },
      "住址": {
        "words": "江蘇省南通市海門鎮秀山新城"
      },
      "姓名": {
        "words": "小歐歐"
      },
      "國籍": {
        "words": "中國"
      },
      "出生日期": {
        "words": "19990530"
      },
      "性別": {
        "words": "男"
      },
      "初次領證日期": {
        "words": "20100125"
      }
    }
  }
}
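
The example above nests the results under a data key, whereas the response fields table lists words_result at the top level. The sketch below accepts either shape. Which one the service actually returns is not settled by this section, so this is a defensive assumption, and `license_fields` is a hypothetical helper name.

```python
def license_fields(response):
    """Map field names to recognized text for either response shape.

    Assumption: results sit either at the top level or under "data".
    """
    payload = response.get("data", response)
    return {
        name: field.get("words", "")
        for name, field in payload.get("words_result", {}).items()
    }


# Abbreviated copies of the two shapes described above.
nested = {
    "errno": 0,
    "msg": "success",
    "data": {"words_result_num": 1, "words_result": {"準駕車型": {"words": "B2"}}},
}
flat = {"log_id": 1, "words_result_num": 1, "words_result": {"準駕車型": {"words": "B2"}}}
print(license_fields(nested))
print(license_fields(flat))
```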

Vehicle license recognition

Recognizes all key fields on the main page of a motor vehicle license.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call vehicle license recognition
client.vehicleLicense(image)

# Optional parameters, if any
options = {}
options["detect_direction"] = "true"
options["accuracy"] = "normal"

# Call vehicle license recognition with options
client.vehicleLicense(image, options)

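Vehicle license recognition request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.
vehicle_license_side	No	string	front/back	front	- front: recognize the main page of the vehicle license - back: recognize the supplementary page of the vehicle license
unified	No	string	true/false	false	- false: no normalization - true: normalize the output fields, so that the "註冊登記日期/註冊日期" (registration date) field of new/old vehicle licenses is always output as "註冊日期"

Vehicle license recognition response fields

Field	Required	Type	Description
log_id	Yes	number	Unique log id, used for troubleshooting
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
words_result	Yes	array(object)	Array of recognition results
+words	No	string	Recognized text string

Vehicle license recognition response example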

{
  "errno": 0,
  "msg": "success",
  "data": {
    "words_result_num": 10,
    "words_result": {
      "品牌型號": {
        "words": "保時捷GT37182RUCRE"
      },
      "發證日期": {
        "words": "20160104"
      },
      "使用性質": {
        "words": "非營運"
      },
      "發動機號碼": {
        "words": "20832"
      },
      "號牌號碼": {
        "words": "蘇A001"
      },
      "所有人": {
        "words": "圓圓"
      },
      "住址": {
        "words": "南京市江寧區弘景大道"
      },
      "註冊日期": {
        "words": "20160104"
      },
      "車輛識別代號": {
        "words": "HCE58"
      },
      "車輛類型": {
        "words": "小型轎車"
      }
    }
  }
}

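License plate recognition

Recognizes a motor vehicle license plate and returns the place of issue and the plate number.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call license plate recognition
client.licensePlate(image)

# Optional parameters, if any
options = {}
options["multi_detect"] = "true"

# Call license plate recognition with options
client.licensePlate(image, options)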

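License plate recognition request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
multi_detect	No	string	true false	false	Whether to detect multiple license plates; defaults to false. When set to true, multiple plates in a single image can be recognized

License plate recognition response fields

Parameter	Type	Required	Description
log_id	uint64	Yes	Request identifier; random and unique.
color	string	Yes	License plate color
number	string	Yes	License plate number

License plate recognition response example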

{
    "log_id": 3583925545,
    "words_result": {
        "color": "blue",
        "number": "蘇HS7766"
    }
}

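Business license recognition

Recognizes a business license and returns the values of key fields, including company name, legal representative, address, validity period, certificate number, and social credit code.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call business license recognition
client.businessLicense(image)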

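Business license recognition request parameters

Parameter	Required	Type	Description
image	Yes	string	Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp

Business license recognition response fields

Parameter	Required	Type	Description
log_id	Yes	number	Request identifier; random and unique.
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
words_result	Yes	array(object)	Array of recognition results
left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
width	Yes	number	Width of the bounding rectangle
height	Yes	number	Height of the bounding rectangle
words	No	string	Recognized text string

Business license recognition response example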

{
    "log_id": 490058765,
    "words_result": {
        "單位名稱": {
            "location": {
                "left": 500,
                "top": 479,
                "width": 618,
                "height": 54
            },
            "words": "袁氏財團有限公司"
        },
        "法人": {
            "location": {
                "left": 938,
                "top": 557,
                "width": 94,
                "height": 46
            },
            "words": "袁運籌"
        },
        "地址": {
            "location": {
                "left": 503,
                "top": 644,
                "width": 574,
                "height": 57
            },
            "words": "江蘇省南京市中山東路19號"
        },
        "有效期": {
            "location": {
                "left": 779,
                "top": 1108,
                "width": 271,
                "height": 49
            },
            "words": "2015年02月12日"
        },
        "證件編號": {
            "location": {
                "left": 1219,
                "top": 357,
                "width": 466,
                "height": 39
            },
            "words": "蘇餐證字(2019)第666602666661號"
        },
        "社會信用代碼": {
            "location": {
                "left": 0,
                "top": 0,
                "width": 0,
                "height": 0
            },
            "words": "無"
        }
    },
    "words_result_num": 6
}
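
In the example above, the 社會信用代碼 field has an all-zero location and the placeholder text 無, which suggests the field was not found on the image. The sketch below drops such entries. Treating an all-zero box as "not found" is an inference from this single example rather than documented behavior, and `found_fields` is a hypothetical helper name.

```python
def found_fields(response):
    """Return name -> text for fields whose bounding box is not all zeros.

    Assumption: a zero-sized box at the origin marks a field that was not found.
    """
    fields = {}
    for name, field in response.get("words_result", {}).items():
        loc = field.get("location", {})
        empty_box = all(loc.get(key, 0) == 0 for key in ("left", "top", "width", "height"))
        if not empty_box:
            fields[name] = field.get("words", "")
    return fields


# Abbreviated copy of the response example above.
sample = {
    "words_result": {
        "法人": {
            "location": {"left": 938, "top": 557, "width": 94, "height": 46},
            "words": "袁運籌",
        },
        "社會信用代碼": {
            "location": {"left": 0, "top": 0, "width": 0, "height": 0},
            "words": "無",
        },
    },
    "words_result_num": 2,
}
print(found_fields(sample))
```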

General receipt recognition

Recognizes all of the text in receipt-type images such as medical bills, invoices, taxi receipts, and insurance policies, and returns the location of the text in the image.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call general receipt recognition
client.receipt(image)

# Optional parameters, if any
options = {}
options["recognize_granularity"] = "big"
options["probability"] = "true"
options["accuracy"] = "normal"
options["detect_direction"] = "true"

# Call general receipt recognition with options
client.receipt(image, options)

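General receipt recognition request parameters

Parameter	Required	Type	Allowed values	Default	Description
image	Yes	string			Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
recognize_granularity	No	string	big - do not locate individual characters small - locate individual characters	small	Whether to locate individual characters. big: do not locate individual characters (default); small: locate individual characters
probability	No	string	true false		Whether to return the confidence of each line in the recognition result
accuracy	No	string	normal - use the fast service		normal: use the fast service, with a latency of about 1200ms; if omitted or set to any other value, the high-accuracy service is used, with a latency of about 1600ms
detect_direction	No	string	true false	false	Whether to detect the image orientation; defaults to false (no detection). Orientation means whether the input image is upright or rotated counterclockwise by 90/180/270 degrees. Allowed values: - true: detect orientation; - false: do not detect orientation.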

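General receipt recognition response fields

Field	Required	Type	Description
log_id	Yes	number	Unique log id, used for troubleshooting
words_result_num	Yes	number	Number of recognition results, i.e. the number of elements in words_result
words_result	Yes	array()	Array of location and recognition results
+location	Yes	object	Location array (origin at the top-left corner)
++left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
++top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
++width	Yes	number	Width of the bounding rectangle
++height	Yes	number	Height of the bounding rectangle
+words	Yes	string	Recognized text string
+chars	No	array()	Per-character results; present when recognize_granularity=small
++location	Yes	array()	Location array (origin at the top-left corner)
+++left	Yes	number	Horizontal coordinate of the top-left vertex of the bounding rectangle
+++top	Yes	number	Vertical coordinate of the top-left vertex of the bounding rectangle
+++width	Yes	number	Width of the bounding rectangle
+++height	Yes	number	Height of the bounding rectangle
++char	Yes	string	Recognized character
probability	No	object	Confidence of each line in the recognition result, including average: mean line confidence, variance: variance of line confidence, min: minimum line confidence

General receipt recognition response example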

{
    "log_id": 2661573626,
    "words_result": [
        {
            "location": {
                "left": 10,
                "top": 3,
                "width": 121,
                "height": 24
            },
            "words": "姓名:小明明",
            "chars": [
                {
                    "location": {
                        "left": 16,
                        "top": 6,
                        "width": 17,
                        "height": 20
                    },
                    "char": "姓"
                }
                ...
            ]
        },
        {
            "location": {
                "left": 212,
                "top": 3,
                "width": 738,
                "height": 24
            },
            "words": "卡號/病案號:105353990標本編號:150139071送檢科室:血液透析門診病房",
            "chars": [
                {
                    "location": {
                        "left": 218,
                        "top": 6,
                        "width": 18,
                        "height": 21
                    },
                    "char": "卡"
                }
                ...
            ]
        }
    ],
    "words_result_num": 2
}
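
Because every line carries a location, the results can be arranged in reading order on the client side. The sketch below sorts by top and then by left. Whether the service already returns lines in this order is not stated in this section, and `lines_in_reading_order` is a hypothetical helper name.

```python
def lines_in_reading_order(response):
    """Return recognized lines sorted top-to-bottom, then left-to-right."""
    items = response.get("words_result", [])
    ordered = sorted(items, key=lambda it: (it["location"]["top"], it["location"]["left"]))
    return [it["words"] for it in ordered]


# Locations copied from the response example above; text shortened, order reversed on purpose.
sample = {
    "words_result": [
        {"location": {"left": 212, "top": 3, "width": 738, "height": 24}, "words": "卡號/病案號:105353990"},
        {"location": {"left": 10, "top": 3, "width": 121, "height": 24}, "words": "姓名:小明明"},
    ],
    "words_result_num": 2,
}
print(lines_in_reading_order(sample))
```

A strict sort on top can split lines that sit on the same visual row but differ by a pixel or two; grouping rows with a small tolerance would be a natural refinement.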

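Custom template text recognition

Custom template text recognition covers cases where Baidu has not released an official template but you need to extract structured content from a particular kind of card or document (such as a property ownership certificate, military officer ID, or train ticket). You can use this product to build a template quickly and run recognition with it.

from aip import AipOcr

APP_ID = ''
API_KEY = ''
SECRET_KEY = ''
client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()
image = get_file_content('aa.jpg')

# Extra parameters
options = {}
# The key is always templateSign; set it to the template ID (templateSign) shown on the page
options["templateSign"] = ""
# Call custom template text recognition
result = client.custom(image, options)
print(result)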

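Custom template text recognition request parameters

Parameter	Required	Type	Description
image	Yes	string	Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp
options	Yes	object	Carries extra parameters such as templateSign and classifierId
+ templateSign	No	string	ID of the template you created on the custom text recognition platform
+ classifierId	No	string	Classifier ID. At least one of this parameter and templateSign must be present, and templateSign takes precedence. When templateSign is present, the specified template is used; when templateSign is absent and classifierId is present, the classifier decides which template to use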
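
The rule that at least one of templateSign and classifierId must be present can be enforced before the request is sent. A minimal sketch follows; `build_custom_options` is a hypothetical helper, and the IDs in the usage line are placeholders rather than real values.

```python
def build_custom_options(template_sign=None, classifier_id=None):
    """Build the options dict for custom template recognition.

    Raises ValueError when neither identifier is given. When both are
    given, both are included; per the table, templateSign takes precedence.
    """
    options = {}
    if template_sign:
        options["templateSign"] = template_sign
    if classifier_id:
        options["classifierId"] = classifier_id
    if not options:
        raise ValueError("either templateSign or classifierId is required")
    return options


# Placeholder ID for illustration only.
print(build_custom_options(template_sign="my-template-id"))
```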

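Custom template text recognition response fields

Field	Required	Type	Description
error_code	Yes	number	0 indicates success; if an error code is returned, refer to the error code list below to troubleshoot
error_msg	Yes	string	Specific failure message; refer to the error code list below to troubleshoot
data		jsonObject	Recognition result

Custom template text recognition response example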

{
    "isStructured": true,
    "ret": [
        {
            "charset": [
                {
                    "rect": {
                        "top": 183,
                        "left": 72,
                        "width": 14,
                        "height": 28
                    },
                    "word": "5"
                },
                {
                    "rect": {
                        "top": 183,
                        "left": 90,
                        "width": 14,
                        "height": 28
                    },
                    "word": "4"
                },
                {
                    "rect": {
                        "top": 183,
                        "left": 103,
                        "width": 15,
                        "height": 28
                    },
                    "word": "."
                },
                {
                    "rect": {
                        "top": 183,
                        "left": 116,
                        "width": 14,
                        "height": 28
                    },
                    "word": "5"
                },
                {
                    "rect": {
                        "top": 183,
                        "left": 133,
                        "width": 19,
                        "height": 28
                    },
                    "word": "元"
                }
            ],
            "word_name": "票價",
            "word": "54.5元"
        },
        {
            "charset": [
                {
                    "rect": {
                        "top": 144,
                        "left": 35,
                        "width": 14,
                        "height": 28
                    },
                    "word": "2"
                },
                {
                    "rect": {
                        "top": 144,
                        "left": 53,
                        "width": 14,
                        "height": 28
                    },
                    "word": "0"
                },
                {
                    "rect": {
                        "top": 144,
                        "left": 79,
                        "width": 14,
                        "height": 28
                    },
                    "word": "1"
                },
                {
                    "rect": {
                        "top": 144,
                        "left": 97,
                        "width": 14,
                        "height": 28
                    },
                    "word": "7"
                }
            ]
    ]
}

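Table text recognition (synchronous API)

Automatically detects table lines and table content, and outputs the header, the footer, and the text of each cell in structured form.

# Read the image file
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Call the synchronous table text recognition API
client.form(image)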

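Table text recognition (synchronous API) request parameters

Parameter	Required	Type	Description
image	Yes	string	Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp

Table text recognition (synchronous API) response fields

Field	Required	Type	Description
log_id	Yes	long	Unique log id, used for troubleshooting
forms_result_num	Yes	number	Number of recognition results, i.e. the number of elements in forms_result
forms_result	Yes	array(object)	Recognition results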
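
Each cell in a form's body carries row, column, and words values, as the response example in this section shows. The sketch below arranges such cells into a two-dimensional grid. It assumes row and column are zero-based integers as in that example, and `body_to_grid` is a hypothetical helper name.

```python
def body_to_grid(form):
    """Arrange the body cells of one forms_result entry into a list of rows.

    Cells that were not reported are left as empty strings.
    """
    cells = form.get("body", [])
    if not cells:
        return []
    n_rows = max(cell["row"] for cell in cells) + 1
    n_cols = max(cell["column"] for cell in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        grid[cell["row"]][cell["column"]] = cell["words"]
    return grid


# Small illustrative input using the same field names as the response example.
form = {
    "body": [
        {"row": 0, "column": 0, "words": "目"},
        {"row": 1, "column": 1, "words": "66"},
    ]
}
print(body_to_grid(form))
```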

Table text recognition (synchronous API) response example

{
    "log_id": 3445697108,
    "forms_result_num": 1,
    "forms_result": [
        {
            "body": [
                {
                    "column": 0,
                    "probability": 0.99855202436447,
                    "row": 0,
                    "vertexes_location": [
                        {
                            "x": -2,
                            "y": 260
                        },
                        {
                            "x": 21,
                            "y": 244
                        },
                        {
                            "x": 35,
                            "y": 266
                        },
                        {
                            "x": 12,
                            "y": 282
                        }
                    ],
                    "words": "目"
                },
                {
                    "column": 3,
                    "probability": 0.99960500001907,
                    "row": 5,
                    "vertexes_location": [
                        {
                            "x": 603,
                            "y": 52
                        },
                        {
                            "x": 634,
                            "y": 32
                        },
                        {
                            "x": 646,
                            "y": 50
                        },
                        {
                            "x": 615,
                            "y": 71
                        }
                    ],
                    "words": "66"
                },
                {
                    "column": 3,
                    "probability": 0.99756097793579,
                    "row": 6,
                    "vertexes_location": [
                        {
                            "x": 634,
                            "y": 73
                        },
                        {
                            "x": 648,
                            "y": 63
                        },
                        {
                            "x": 657,
                            "y": 77
                        },
                        {
                            "x": 643,
                            "y": 86
                        }
                    ],
                    "words": "4"
                },
                {
                    "column": 3,
                    "probability": 0.96489900350571,
                    "row": 10,
                    "vertexes_location": [
                        {
                            "x": 699,
                            "y": 178
                        },
                        {
                            "x": 717,
                            "y": 167
                        },
                        {
                            "x": 727,
                            "y": 183
                        },
                        {
                            "x": 710,
                            "y": 194
                        }
                    ],
                    "words": "3,"
                },
                {
                    "column": 3,
                    "probability": 0.99809801578522,
                    "row": 14,
                    "vertexes_location": [
                        {
                            "x": 751,
                            "y": 296
                        },
                        {
                            "x": 786,
                            "y": 273
                        },
                        {
                            "x": 797,
                            "y": 289
                        },
                        {
                            "x": 761,
                            "y": 312
                        }
                    ],
                    "words": "206"
                }
            ],
            "footer": [
                {
                    "column": 0,
                    "probability": 0.99853301048279,
                    "row": 0,
                    "vertexes_location": [
                        {
                            "x": 605,
                            "y": 698
                        },
                        {
                            "x": 632,
                            "y": 680
                        },
                        {
                            "x": 643,
                            "y": 696
                        },
                        {
                            "x": 616,
                            "y": 714
                        }
                    ],
                    "words": "22"
                }
            ],
            "header": [
                {
                    "column": 0,
                    "probability": 0.94802802801132,
                    "row": 0,
                    "vertexes_location": [
                        {
                            "x": 183,
                            "y": 96
                        },
                        {
                            "x": 286,
                            "y": 29
                        },
                        {
                            "x": 301,
                            "y": 52
                        },
                        {
                            "x": 199,
                            "y": 120
                        }
                    ],
                    "words": "29月"
                }
            ],
            "vertexes_location": [
                {
                    "x": -154,
                    "y": 286
                },
                {
                    "x": 512,
                    "y": -153
                },
                {
                    "x": 953,
                    "y": 513
                },
                {
                    "x": 286,
                    "y": 953
                }
            ]
        }
    ]
}
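
Each entry under `body`, `header`, and `footer` carries its own `row` and `column` index, so the body cells can be arranged back into a grid. The following is a minimal sketch, assuming the response has the shape shown in the example above and that `row` and `column` are zero-based (the example contains row 0 and column 0); `body_to_grid` is a hypothetical helper, not an SDK function.

```python
def body_to_grid(form):
    """Arrange the `body` cells of one forms_result entry into a 2-D list.

    Positions with no recognized cell are left as empty strings.
    """
    cells = form.get("body", [])
    if not cells:
        return []
    n_rows = max(cell["row"] for cell in cells) + 1
    n_cols = max(cell["column"] for cell in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        grid[cell["row"]][cell["column"]] = cell["words"]
    return grid


# Trimmed-down response in the same shape as the example above.
response = {
    "forms_result_num": 1,
    "forms_result": [
        {"body": [
            {"row": 0, "column": 0, "words": "目"},
            {"row": 1, "column": 1, "words": "66"},
        ]}
    ],
}
grid = body_to_grid(response["forms_result"][0])
```

The `probability` and `vertexes_location` fields are ignored here; a stricter version could drop cells whose probability falls below a chosen threshold.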

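Table Text Recognition

Automatically detects table lines and table content, and outputs the header, footer, and the text of each cell in structured form. Table text recognition is an asynchronous interface made up of two APIs: one to submit the request and one to retrieve the result.

# Read the image file as raw bytes
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

image = get_file_content('example.jpg')

# Submit a table text recognition request
client.tableRecognitionAsync(image)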

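Table Text Recognition: Request Parameters

Parameter	Required	Type	Description
image	Yes	string	Image data, base64-encoded. The base64-encoded size must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px. Supported formats: jpg/png/bmp.

Table Text Recognition: Response Fields

Field	Required	Type	Description
log_id	Yes	long	Unique log id, used for troubleshooting
result	Yes	list	List of returned results
+request_id	Yes	string	The request_id generated for this request; use it later to retrieve the recognition result

Table Text Recognition: Response Example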

{
    "result" : [
        {
            "request_id" : "1234_6789"
        }
    ],
    "log_id":149689853984104
}

Failure response example (see the bottom of this document for detailed error code descriptions):

{
    "log_id": 149319909347709,
    "error_code": 282000,
    "error_msg":"internal error"
}
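
The two response shapes above can be told apart by the presence of `error_code`. The following is a minimal sketch of pulling the `request_id` out of a submit response; `extract_request_id` is a hypothetical helper, and raising `RuntimeError` on failure is a design choice made for this example rather than SDK behavior.

```python
def extract_request_id(response):
    """Return the request_id from a submit response, or raise on failure."""
    if "error_code" in response:
        raise RuntimeError(
            "table recognition request failed: %s (%s)"
            % (response.get("error_msg"), response["error_code"])
        )
    # `result` is a list; the example response holds a single entry.
    return response["result"][0]["request_id"]


# Same shape as the success example above.
submitted = {"result": [{"request_id": "1234_6789"}], "log_id": 149689853984104}
request_id = extract_request_id(submitted)
```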

Table Recognition Result

Retrieves the result of a table text recognition request.

requestId = "23454320-23255"

# Call the table recognition result interface
client.getTableRecognitionResult(requestId)

# Optional parameters, if needed
options = {}
options["result_type"] = "json"

# Call the table recognition result interface with parameters
client.getTableRecognitionResult(requestId, options)

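Table Recognition Result: Request Parameters

Parameter	Required	Type	Allowed Values	Default	Description
request_id	Yes	string			The request id returned when the table text recognition request was submitted
result_type	No	string	json excel	excel	Desired result type. "excel" returns the address of an xls file; "json" returns a JSON-formatted string. Defaults to "excel".

Table Recognition Result: Response Fields

Field	Required	Type	Description
log_id	Yes	long	Unique log id, used for troubleshooting
result	Yes	object	The returned result
+result_data	Yes	string	Recognition result string. If result_type is excel, this is the download address of the excel file; if result_type is json, this is a JSON-formatted string
+percent	Yes	int	Table recognition progress (percentage)
+request_id	Yes	string	The request_id of the request for this image
+ret_code	Yes	int	Recognition status. 1: task not started; 2: in progress; 3: completed
+ret_msg	Yes	string	Recognition status message: task not started, in progress, or completed

Table Recognition Result: Response Example

Success response example: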

{
    "result" : {
        "result_data" : "",
        "percent":100,
        "request_id": "149691317905102",
        "ret_code": 3,
        "ret_msg": "已完成"
    },
    "log_id":149689853984104
}
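
Because recognition is asynchronous, the result interface is usually called repeatedly until `ret_code` reaches 3 (completed). The following is a minimal polling sketch under stated assumptions: `wait_for_table_result` is a hypothetical helper, the `interval` and `timeout` defaults are arbitrary, and `fetch` stands in for any callable that returns a response dict, for example `lambda rid: client.getTableRecognitionResult(rid, {"result_type": "json"})`.

```python
import time


def wait_for_table_result(fetch, request_id, interval=1.0, timeout=10.0):
    """Poll fetch(request_id) until ret_code is 3, then return `result`.

    Raises RuntimeError if the response carries an error_code and
    TimeoutError if the task does not finish within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        response = fetch(request_id)
        if "error_code" in response:
            raise RuntimeError(
                "result query failed: %s" % response.get("error_msg")
            )
        if response["result"]["ret_code"] == 3:  # 3: completed
            return response["result"]
        if time.monotonic() >= deadline:
            raise TimeoutError("table recognition did not finish in time")
        time.sleep(interval)
```

Passing the query as a callable keeps the loop independent of the client object, so it can be exercised with a stub before being pointed at the real service.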

When result_type is excel, result_data has the following format:

{
    "file_url":"https://ai.baidu.com/file/xxxfffddd"
}

When result_type is json, result_data has the following format:

{
    "form_num": 1,
    "forms": [
        {
            "header": [
                {
                    "row": [
                        1
                    ],
                    "column": [
                        1,
                        2
                    ],
                    "word": "header text 1"
                }
            ],
            "footer": [
                {
                    "row": [
                        1
                    ],
                    "column": [
                        1,
                        2
                    ],
                    "word": "footer text 1"
                }
            ],
            "body": [
                {
                    "row": [
                        1
                    ],
                    "column": [
                        1,
                        2
                    ],
                    "word": "cell text"
                }
            ]
        }
    ]
}

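Field descriptions (when the result is returned as JSON):

Field	Required	Type	Description
form_num	Yes	int	Number of tables (one image may contain several tables)
forms	Yes	list	List of table content information
+header	Yes	list	Header data of each table
+footer	Yes	list	Footer information
+body	Yes	list	Data of the table body
++row	Yes	list	Row numbers occupied by the cell
++column	Yes	list	Column numbers occupied by the cell
++word	Yes	string	Text in the cell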
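
Since `row` and `column` are lists, a single cell can span several rows or columns. The following is a minimal sketch of decoding a JSON `result_data` string and indexing cell text by every position it occupies; `cells_by_position` is a hypothetical helper, and the sample string is made up to match the field table above.

```python
import json


def cells_by_position(result_data, part="body"):
    """Return one dict per table mapping (row, column) to the cell text.

    `part` selects "header", "footer", or "body". A cell spanning
    several rows or columns appears under each position it occupies.
    """
    forms = json.loads(result_data)["forms"]
    tables = []
    for form in forms:
        positions = {}
        for cell in form.get(part, []):
            for r in cell["row"]:
                for c in cell["column"]:
                    positions[(r, c)] = cell["word"]
        tables.append(positions)
    return tables


# Made-up sample in the documented shape: one cell spanning two columns.
sample = json.dumps({
    "form_num": 1,
    "forms": [{
        "header": [],
        "footer": [],
        "body": [{"row": [1], "column": [1, 2], "word": "cell text"}],
    }],
})
tables = cells_by_position(sample)
```

The row and column numbers are used as returned, so the sketch makes no assumption about whether indexing starts at 0 or 1.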

Failure response example (see the bottom of this document for detailed error code descriptions):

{
    "log_id": 149319909347709,
    "error_code": 282000,
    "error_msg":"internal error"
}

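Table Recognition Interface

Code Example

Submits a table recognition request and, once the request id is obtained, polls the table recognition result interface until the result is available.

# Run table recognition synchronously and return the recognition result
client.tableRecognition(
    get_file_content('table.jpg'),
    {
        'result_type': 'json',
    },
)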

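Request Parameters

tableRecognition(image, option, timeout)

Parameter	Required	Type	Allowed Values	Default	Description
image	Yes	string			Base64-encoded image data
+result_type	Yes	string	json excel	excel	Desired result type. "excel" returns the address of an xls file; "json" returns a JSON-formatted string. Defaults to "excel".
timeout	Yes	number		10000	Timeout for polling the tableGetresult interface for data, in milliseconds

Response Parameters

Same as the response of the table recognition result interface.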

Baidu OCR (Python) Interface Description