UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 9: ordinal not in range(128)

Original

2020-06-16 16:51

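# coding=utf-8
import time
from selenium import webdriver

class GetSubString(object):
    def get_search_result(self):
        # Start Internet Explorer and wait up to 8 seconds for elements to appear.
        driver = webdriver.Ie()
        driver.maximize_window()
        driver.implicitly_wait(8)

        # Open Baidu and type the query into the search box.
        driver.get('http://www.baidu.com')
        driver.find_element_by_id('kw').send_keys('selenium')
        time.sleep(1)
        # .text returns a unicode string holding the result-count summary.
        search_result_string = driver.find_element_by_xpath("//*/div[@class='nums']").text
        print(search_result_string)

        # Keep the part after '約'; this is the line that raises UnicodeDecodeError.
        new_string = search_result_string.split('約')[1]
        print(new_string)

        # Keep the part before '個'.
        last_string = new_string.split('個')[0]
        print(last_string)

getstring = GetSubString()

getstring.get_search_result()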

Running the program above fails with: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 9: ordinal not in range(128)

Change:

new_string = search_result_string.split('約')[1]

last_string = new_string.split('個')[0]

to:

new_string = search_result_string.split(u'約')[1]

last_string = new_string.split(u'個')[0]

After that, the program runs without the error.
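The corrected split logic can be tried without a browser. The following is only a minimal sketch: the function name extract_result_count and the sample text are hypothetical stand-ins for the value that the element's .text would return, not something taken from the real page.

```python
# coding=utf-8

def extract_result_count(summary):
    """Return the text between u'約' and u'個' in a result summary."""
    after_about = summary.split(u'約')[1]   # everything after u'約'
    return after_about.split(u'個')[0]      # everything before u'個'


# Hypothetical sample text, standing in for the element's .text value.
sample = u'百度為您找到相關結果約100,000,000個'
print(extract_result_count(sample))
```

Because both the input and the separators are unicode strings, no implicit conversion is needed, and the same code runs on Python 2 and Python 3.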

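Chinese characters and special symbols are not part of ASCII. In Python 2, a plain quoted literal such as '約' is a byte string, while the text returned by Selenium is a unicode string. When the two are mixed, as in search_result_string.split('約'), Python 2 first tries to convert the byte string to unicode with its default codec, ASCII, and that conversion fails on any byte above 127.

The # coding=utf-8 line at the top of the file only tells the interpreter how to read the source file. The literal is still stored as UTF-8 bytes, and the implicit conversion still uses ASCII. Prefixing the literal with u makes it a unicode string from the start, so no implicit decoding takes place. In Python 3 every string literal is already Unicode, so the original code would not raise this error there.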
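To see the failure in isolation, the sketch below performs explicitly the ASCII decode that Python 2 attempts implicitly when a byte string is mixed with a unicode string. The helper try_decode is a hypothetical name used only for this illustration. Doing the decode explicitly lets the example run on Python 3 as well.

```python
# coding=utf-8

# The UTF-8 bytes that a plain '約' literal holds in a Python 2 source file.
raw = u'約'.encode('utf-8')


def try_decode(data, codec):
    """Decode data with codec; return the text, or None on UnicodeDecodeError."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError as err:
        print('decode with %s failed: %s' % (codec, err))
        return None


ascii_result = try_decode(raw, 'ascii')   # fails: the bytes are outside 0-127
utf8_result = try_decode(raw, 'utf-8')    # succeeds and gives back u'約'
```

The ASCII attempt reports the same kind of "ordinal not in range(128)" message as the original traceback, while the UTF-8 attempt recovers the character.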

The above is only my own understanding. The resources I found through Baidu while solving the problem are linked below for reference.

Unicode encoding (ranges for Chinese and special characters): http://blog.csdn.net/leopadus/article/details/68961344

Solution to a similar problem: https://segmentfault.com/q/1010000005941382?_ea=958769

Python character encoding: http://python.jobbole.com/85482/

Character encoding in human-computer interaction: http://selfboot.cn/2014/08/28/character_encoding/
