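While running automated tests with Airtest recently, I found that the values shown on some controls cannot be read through Poco's get_text() or attr(*args, **kwargs). For example, the control showing 100% in the figure below is of type android.view.View and is a drawn image rather than a text widget.

To read text out of an image, text recognition (OCR) is the obvious approach.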

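I. Taking a screenshot with snapshot

Note that snapshot returns a 2-tuple: the first element is the base64-encoded screenshot data and the second is the image format.

The width passed to snapshot can be taken from Airtest's get_current_resolution(), which returns the current screen resolution of the device.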

snapshot(width=720)

Take the screenshot from the target device. The supported output format (png, jpg, etc.) depends on the agent implementation.

Parameters:	width (int) – an expected width of the screenshot. The real size depends on the agent implementation, and it might not be possible to obtain the expected width of the screenshot.
Returns:	screen_shot (str/bytes): base64 encoded screenshot data; format (str): output format 'png', 'jpg', etc.
Return type:	2-tuple

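import base64
from PIL import Image

# device and poco_new are assumed to be the Airtest device and Poco instance created earlier in the script
(width, height) = device.get_current_resolution()   # screen width and height in pixels
(image_data, fmt) = poco_new.snapshot(width)        # take the screenshot; fmt avoids shadowing the built-in type
image = base64.b64decode(image_data)                # decode the base64 screenshot data

with open('1.jpg', 'wb') as f:                      # save the screenshot to a file
    f.write(image)
img = Image.open("1.jpg")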

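Reference: https://blog.csdn.net/wang785994599/article/details/96425280

The temporary file can be avoided, but not with Image.frombytes. According to https://stackoverflow.com/questions/8328198/pil-valueerror-not-enough-image-data, the decoded data here is a complete JPG file, that is, the JPG-compressed image data together with the JPG file header, whereas frombytes only accepts raw pixel data. The fix is to wrap the bytes in BytesIO and open them with Image.open: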

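import base64
from io import BytesIO
from PIL import Image

(image_data, fmt) = poco_new.snapshot(width)   # take the screenshot
image = base64.b64decode(image_data)           # decode to the raw bytes of the image file
img = Image.open(BytesIO(image))               # open from memory, no temporary file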
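The point about the file header can be checked without any imaging library. The following is a minimal sketch, using only the standard library, that decodes base64 data and guesses the container format from its leading bytes. The function name sniff_image_format and the sample payload fake_jpg are made up for illustration; a real payload would be the first element of the tuple returned by snapshot.

```python
import base64

# Well-known file signatures (magic bytes) for the two formats mentioned above.
SIGNATURES = {
    b"\xff\xd8\xff": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def sniff_image_format(b64_data):
    """Decode base64 data and guess the image container format from its header."""
    raw = base64.b64decode(b64_data)
    for magic, name in SIGNATURES.items():
        if raw.startswith(magic):
            return name
    return "unknown"


# Hypothetical stand-in for the screenshot data: a JPG header followed by padding.
fake_jpg = base64.b64encode(b"\xff\xd8\xff\xe0" + b"\x00" * 16)
print(sniff_image_format(fake_jpg))
```

If the data starts with such a signature, it is an encoded image file and belongs in Image.open, not in Image.frombytes.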

II. Cropping the image

Image.crop(box=None)

Returns a rectangular region from this image. The box is a 4-tuple defining the left, upper, right, and lower pixel coordinate.

This is a lazy operation. Changes to the source image may or may not be reflected in the cropped image. To break the connection, call the load() method on the cropped copy.

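Parameters:	box – The crop rectangle, as a (left, upper, right, lower)-tuple.
Return type:	Image
Returns:	An Image object.

Image.crop cuts a rectangle out of the image; its argument gives the coordinates of the rectangle's upper-left and lower-right corners.

As shown in the figure below, the width and height of the whole screen come from (width, height) = device.get_current_resolution().

The x and y position of the rectangle to crop comes from attr('pos'). Because this value is a position relative to the screen, it has to be multiplied by the screen width and height respectively to get pixels. The rectangle's width and height come from attr('size') and likewise have to be multiplied by the screen width and height.

From these, compute the upper-left and lower-right corners of the rectangle in pixels.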

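img = Image.open("1.jpg")

(width, height) = device.get_current_resolution()
node = poco.offspring("xxxxxx")          # look up the control once and reuse it
(pos_x, pos_y) = node.attr('pos')        # relative position, treated here as the center of the control
(size_x, size_y) = node.attr('size')     # relative width and height
(size_x, size_y) = (size_x * width, size_y * height)   # convert the size to pixels

# (left, upper, right, lower) in pixels
image_crop = img.crop((pos_x * width - size_x / 2,
                       pos_y * height - size_y / 2,
                       pos_x * width + size_x / 2,
                       pos_y * height + size_y / 2))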
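The coordinate arithmetic can be separated from the device calls, which makes it easy to check on its own. This is a minimal sketch, assuming that pos is the center of the control and that pos and size are both normalized to the screen, as described above. The helper name crop_box is hypothetical, and the rounding and clamping are additions not present in the original code.

```python
def crop_box(pos, size, screen):
    """Convert a normalized center position and size into a pixel box.

    pos, size: (x, y) pairs in the 0..1 range, relative to the screen.
    screen: (width, height) of the screen in pixels.
    Returns (left, upper, right, lower), the order Image.crop expects.
    """
    width, height = screen
    cx, cy = pos[0] * width, pos[1] * height      # center in pixels
    w, h = size[0] * width, size[1] * height      # size in pixels
    left, upper = round(cx - w / 2), round(cy - h / 2)
    right, lower = round(cx + w / 2), round(cy + h / 2)
    # Clamp to the screen so a control at the edge does not yield negative coordinates.
    return (max(0, left), max(0, upper), min(width, right), min(height, lower))


print(crop_box((0.5, 0.5), (0.25, 0.125), (720, 1280)))
```

The returned tuple can be passed straight to img.crop, provided the screenshot has the same pixel dimensions as the screen resolution used here.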

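III. Recognizing the image

Above is the cropped number. The control is evidently drawn on a canvas of fixed width and height: a colored block is drawn starting from the left edge, and both its color and its width vary with the value. After the block is drawn, the percentage is written in the center of the image.

The digits are very close in color to the canvas background, and the block's color and width change with the value, which makes the number hard to recognize.

It would be ideal to read the number directly, but Poco cannot access that attribute of the element. To try recognizing the digits, I first saved the 57% image locally and then tried several methods on it.

1. Recognizing directly with tesserocr

Either tesserocr or pytesseract can be used to call Tesseract for OCR (optical character recognition).

The result was "7a:". Not only was the percent sign wrong, the 5 was not recognized at all.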

https://www.cnblogs.com/zhangxinqi/p/9297292.html

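import tesserocr
import pytesseract
from PIL import Image

text = tesserocr.file_to_text('tmp.jpg')                    # OCR through tesserocr
print(text)
text = pytesseract.image_to_string(Image.open('tmp.jpg'))   # OCR through pytesseract
print(text)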

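2. Changing colors with Python

The color-changing approach follows the articles below, which also cover grayscale conversion and binarization.

Here the binarization is done by hand: a pixel whose R, G and B values sum to more than 550 is set to black, and any other pixel to white.

Grayscale conversion: make every pixel in the pixel matrix satisfy R = G = B (the red, green and blue values are all equal); that common value is called the gray value.

Binarization: set the gray value of every pixel to either 0 (black) or 255 (white), so the whole image is pure black and white. In a grayscale image the gray value ranges over 0~255; after binarization it is either 0 or 255.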

https://www.jb51.net/article/165410.htm

https://blog.csdn.net/weixin_42170439/article/details/92648390
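The two definitions above can be illustrated on a tiny grid of RGB tuples without any imaging library. This is only a sketch: it assumes the simple mean of R, G and B as the gray value (weighted formulas are also common), and the names to_gray and binarize, the default threshold of 128 and the invert flag are all choices made for this example.

```python
def to_gray(pixel):
    """Gray value of an (R, G, B) pixel, taken here as the integer mean."""
    r, g, b = pixel
    return (r + g + b) // 3


def binarize(pixels, threshold=128, invert=False):
    """Map every pixel of a 2-D grid to 0 (black) or 255 (white).

    With invert=True, bright pixels become black instead, which is the
    direction used when turning white text into black text.
    """
    bright, dark = (0, 255) if invert else (255, 0)
    return [[bright if to_gray(p) >= threshold else dark for p in row]
            for row in pixels]


grid = [[(255, 255, 255), (0, 128, 0)],
        [(10, 10, 10), (200, 200, 200)]]
print(binarize(grid))
print(binarize(grid, threshold=184, invert=True))
```

A sum-of-channels rule such as "greater than 550" corresponds roughly to a mean-based threshold near 183, with the output inverted.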

import numpy as np
import pytesseract
from PIL import Image

img = Image.open('tmp.jpg')
array = np.array(img)
for row in range(len(array)):
    for col in range(len(array[0])):
        # Sum of R, G and B; int() avoids uint8 overflow
        total = int(array[row, col][0]) + int(array[row, col][1]) + int(array[row, col][2])
        if total > 550:
            array[row, col] = [0, 0, 0]          # near-white pixels become black
        else:
            array[row, col] = [255, 255, 255]    # everything else becomes white

image = Image.fromarray(array)
image.save('black.jpg')

text = pytesseract.image_to_string(Image.open('black.jpg'), 'eng')
print(text)

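At first I assumed the contrast was too low, since white text on a green background is hard to make out. To improve recognition, the white parts of the image were turned black and everything else white, giving a pure black-and-white image. If that could be recognized, the next step would be to turn the canvas background white as well.

The image produced by the RGB split is shown below: the 57 is now black on white. Had it been recognized, special handling of the canvas area would have made the whole number black on white. Unfortunately the processed image was still recognized as '7a', so I did not go on to turn the canvas white.

3. Identifying colors with HSV

The articles below convert the image to HSV, pick out colors, and then extract contours: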

https://blog.csdn.net/qq_41895190/article/details/82791426

https://blog.csdn.net/a19990412/article/details/81172426

https://blog.csdn.net/hjxu2016/article/details/77833336

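import cv2

# hsv, color_dict and d are assumed to be defined as in the referenced articles.
# Binarize: pixels outside the range color_dict[d][0]~color_dict[d][1] become black, pixels inside it become white.
mask = cv2.inRange(hsv, color_dict[d][0], color_dict[d][1])

cv2.imwrite(d + '.jpg', mask)   # save the image

# Turn a gray image into pure black and white: pixels above thresh (127) get the maximum value (255, white),
# pixels below it become black. cv2.inRange has already produced a black-and-white image, so this changes nothing here.
binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]

binary = cv2.dilate(binary, None, iterations=2)

# Find contours. cv2.findContours() expects a binary (black-and-white) image, not a grayscale one.
# OpenCV 3 returns three values and OpenCV 4 returns two; taking the last two works on both.
cnts, hiera = cv2.findContours(binary.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]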

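Since contours can be extracted this way, it seemed worth trying to extract the digits too. The white HSV range given in the articles above is [0,0,221]~[180,30,255], but the converted image came out entirely black, so I widened the white range to [0,0,180]~[180,90,255]. The result was essentially the same as the image obtained by changing colors directly, and the recognized text was still '7a'. So the problem seems to lie in the text recognition step itself; an image this simple should not be unrecognizable.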


import cv2
import numpy as np
import pytesseract

frame = cv2.imread('tmp.jpg')                  # cv2.imread returns pixels in BGR order
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)   # so convert with BGR2HSV, not RGB2HSV

# Keep only the pixels inside the widened white range
mask = cv2.inRange(hsv, np.array([0, 0, 180]), np.array([180, 90, 255]))
cv2.imwrite('black.jpg', mask)

text = pytesseract.image_to_string(mask)
print(text)

# mask is already pure black and white, so thresholding leaves it unchanged
binary = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]
cv2.imwrite('black.jpg', binary)

text = pytesseract.image_to_string(binary)
print(text)
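To see why widening the white range changes what gets selected, a single pixel can be tested against both ranges by hand. The sketch below uses the standard-library colorsys module and rescales its output to the H 0~180, S 0~255, V 0~255 scale used by the ranges in the text. It only approximates OpenCV's 8-bit conversion, and the helper names and the sample pixel (200, 230, 200) are invented for illustration.

```python
import colorsys

STRICT_WHITE = ((0, 0, 221), (180, 30, 255))    # range quoted from the articles
RELAXED_WHITE = ((0, 0, 180), (180, 90, 255))   # widened range used in the text


def rgb_to_scaled_hsv(r, g, b):
    """Convert 8-bit RGB to HSV on the H 0..180, S 0..255, V 0..255 scale."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return (round(h * 180), round(s * 255), round(v * 255))


def in_range(hsv, bounds):
    """True if every HSV component lies within the inclusive bounds."""
    lower, upper = bounds
    return all(lo <= x <= hi for x, lo, hi in zip(hsv, lower, upper))


# A slightly green-tinted light pixel, such as an anti-aliased edge of white text.
pixel = rgb_to_scaled_hsv(200, 230, 200)
print(pixel, in_range(pixel, STRICT_WHITE), in_range(pixel, RELAXED_WHITE))
```

A tinted pixel like this one has a saturation just above 30, so the strict range rejects it while the relaxed range accepts it.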

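4. Estimating the number from the displayed canvas width

OCR on the digits had failed. Then it occurred to me that, since the size of the colored block reflects the value, the block's size can be inferred from the percentage of the canvas that remains uncovered.

Photoshop shows the canvas background color to be (241,241,241). Because the canvas color fades gradually where it meets the colored block, the threshold was relaxed to (220,220,220). Pixel rows 5~15 are analyzed (range(5, 15) in the code), and a pixel counts as canvas if its R, G and B values are all above the threshold. The resulting colored-block ratio was 58.6, reasonably close to the actual value of 57, so this method gives a quantitative estimate of the number.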

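import numpy as np
from PIL import Image

total = 0   # number of pixels examined
white = 0   # number of pixels classified as canvas
img = Image.open('tmp.jpg')
array = np.array(img)
for row in range(5, 15):
    for col in range(len(array[0])):
        total = total + 1
        # A pixel is canvas if R, G and B are all above the threshold
        if (array[row, col][0] > 220) and (array[row, col][1] > 220) and (array[row, col][2] > 220):
            white = white + 1
rate = 100 - white * 100 / total   # percentage covered by the colored block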
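The same calculation can be tried on synthetic data to confirm that it behaves as expected. In this sketch the pixel rows are plain lists of RGB tuples, not a real screenshot; the function name colored_ratio and the block color (60, 180, 75) are assumptions made for the example, while the canvas color and threshold are the ones given above.

```python
def colored_ratio(rows, threshold=220):
    """Percentage of pixels that are NOT canvas, i.e. covered by the colored block.

    rows: iterable of rows, each a sequence of (R, G, B) tuples.
    A pixel counts as canvas when all three channels exceed the threshold.
    """
    total = 0
    canvas = 0
    for row in rows:
        for r, g, b in row:
            total += 1
            if r > threshold and g > threshold and b > threshold:
                canvas += 1
    if total == 0:
        raise ValueError("no pixels to analyze")
    return 100 - canvas * 100 / total


# Synthetic 100-pixel-wide strip: 57 block pixels followed by 43 canvas pixels.
row = [(60, 180, 75)] * 57 + [(241, 241, 241)] * 43
print(colored_ratio([row] * 10))
```

On a real crop the result is only approximate, because the transition between block and canvas is gradual and the overlaid digits also affect the count.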

IV. Improving execution speed

1. freeze

According to the blog post below, freeze() returns a static copy of the current Poco instance, and using it speeds up lookups.

https://blog.csdn.net/saint_228/article/details/89638300

freeze()

Snapshot current hierarchy and cache it into a new poco instance. This new poco instance is a copy from current poco instance (self). The hierarchy of the new poco instance is fixed and immutable. It will be super fast when calling dump function from frozen poco. See the example below.

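One caveat: freeze only stores a static copy of the current hierarchy. A screenshot taken through the frozen instance's snapshot shows the screen at the moment snapshot is called, not at the moment freeze was called.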

frozen_poco = poco.freeze()
frozen_poco.snapshot(width)

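2. Fetch the data first, then process it together

Originally each value was processed as soon as it was fetched. Storing the fetched data in a list first and processing it all afterwards made things faster.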
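The two-phase pattern might look like the sketch below. It is not the original script: the fetchers are hypothetical zero-argument callables standing in for device reads, and collect_then_process is a name chosen for this example.

```python
def collect_then_process(fetchers, process):
    """Run every fetch first, then process the collected results in one pass."""
    raw = [fetch() for fetch in fetchers]     # phase 1: device access only
    return [process(item) for item in raw]    # phase 2: pure computation


# Hypothetical stand-ins for values read from the device.
fetchers = [lambda: "57%", lambda: "100%"]
values = collect_then_process(fetchers, lambda s: int(s.rstrip("%")))
print(values)
```

Keeping device access in its own phase means the screen is read in as short a time window as possible, and the processing step can be tested without a device.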

3. Take as few screenshots as possible

One poco.snapshot call takes roughly 0.2 seconds, so taking many screenshots adds up.
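One way to cut down on screenshots is to take one and reuse it for every control cropped from the same screen. The following is a minimal sketch of that idea. SnapshotCache is a hypothetical helper, and take_snapshot stands for any zero-argument callable, for example a lambda wrapping poco.snapshot(width).

```python
class SnapshotCache:
    """Call the snapshot function once and reuse the result until invalidated."""

    def __init__(self, take_snapshot):
        self._take = take_snapshot
        self._cached = None
        self._valid = False

    def get(self):
        """Return the cached screenshot, taking a new one only if needed."""
        if not self._valid:
            self._cached = self._take()
            self._valid = True
        return self._cached

    def invalidate(self):
        """Forget the cached screenshot, e.g. after the screen has changed."""
        self._valid = False


calls = []
cache = SnapshotCache(lambda: calls.append(1) or "image")   # fake snapshot function
cache.get()
cache.get()
print(len(calls))
```

The cache has to be invalidated whenever the screen content changes, otherwise crops are taken from a stale image.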
