AttributeError: 'NoneType' object has no attribute 'children' when scraping with Python

Original

2020-02-26 05:41

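I recently became interested in web crawlers, so I decided to try writing one. Today I came across a university ranking site and wanted to scrape the ranking information from the page. The code is as follows: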
    import requests
    import bs4
    from bs4 import BeautifulSoup
    import re

    def getHTMLText(url):
        # Fetch the page and return its text, or "" on any failure.
        try:
            r = requests.get(url,timrout = 30)
            r.raise_for_status()
            r.encoding = r.apparent_encoding
            return r.text
        except:
            return ""

    def fillUnivList(ulist, html):
        # Collect the first three cells of every table row into ulist.
        soup = BeautifulSoup(html, "html.parser")
        tbody = soup.find('tbody')
        for tr in soup.find('tbody').children:
            if isinstance(tr, bs4.element.Tag):
                tds = tr('td')
                ulist.append([tds[0].string,tds[1].string,tds[2].string])

    def printUnivList(ulist, num):
        # Print the first num entries as a centered, tab-separated table.
        print("{:^10}\t{:^10}\t{:^10}".format("Rank","University","Score"))
        for i in range(num):
            u = ulist[i]
            print("{:^10}\t{:^10}\t{:^10}".format(u[0],u[1],u[2]))

    def main():
        #num = int(raw_input("Enter the number of universities to query: "))
        unifo = []
        url = "http://www.gaokaopai.com/paihang-otype-2.html?f=1&ly=bd&city=&cate=&batch_type="
        html = getHTMLText(url)
        print(html)
        fillUnivList(unifo, html)
        printUnivList(unifo,10)

    main()

Running it produces this error:

for tr in soup.find('tbody').children:
AttributeError: 'NoneType' object has no attribute 'children'
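I debugged for a whole day and searched online a lot, but nothing fixed it. When I came back from playing ball in the afternoon and looked again, I suddenly realized how silly I had been. In the getHTMLText() function, the line
    r = requests.get(url,timrout = 30)
has timeout misspelled as timrout. But why would that cause this particular error? Frustrated, I kept searching on Baidu, and persistence paid off. Here is how I got to the answer: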
First, read the error message: roughly, it says that the object returned by soup.find() has no attribute named children.
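The 'NoneType' in the message is the key detail: BeautifulSoup's find() returns None when nothing matches, and None has no children. A minimal sketch of a guard against that case, assuming bs4 is installed; `find_rows` is a hypothetical helper name that does not appear in the original code, and unlike the original loop it uses find_all("tr"), which also looks at nested rows:

```python
from bs4 import BeautifulSoup


def find_rows(html):
    """Return the <tr> tags inside the first <tbody>, or [] if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.find("tbody")
    if tbody is None:
        # find() found no match, for example because html is an empty string
        return []
    return tbody.find_all("tr")


# An empty page has no <tbody>, so the guard returns an empty list
print(find_rows(""))

# A page that does contain a <tbody> yields its rows
print(len(find_rows("<table><tbody><tr><td>1</td></tr></tbody></table>")))
```

With a check like this, an empty or unexpected page gives an empty result instead of an AttributeError, which makes it easier to see that the problem lies in the input.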
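With the problem narrowed down to soup.find(), I looked up find() online and saw that the object it returns does have a children attribute, so that seemed fine. Then I wondered whether the 'tbody' I was searching for had no child nodes,
but the page source showed that it did. I was stumped. Could there be another cause? After thinking for an hour, it occurred to me that perhaps the page content had never been fetched at all. I tried printing the page source and, sure enough, nothing came out. So I went back to getHTMLText and found the mistake. The misspelled keyword makes requests.get() raise a TypeError, the bare except swallows it and returns an empty string, and with no HTML to parse there is no tbody, so soup.find('tbody') returns None.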
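The root cause, a bare except hiding a programming error, can be reproduced without any network access. The sketch below is only an illustration: `fake_get`, `fetch_swallowing` and `fetch_narrow` are hypothetical names, and `fake_get` stands in for an HTTP call that accepts only a `timeout` keyword. With the real library, the narrow handler would more likely catch `requests.RequestException` rather than `OSError`.

```python
def fake_get(url, timeout=None):
    """Hypothetical stand-in for an HTTP call; it accepts only `timeout`."""
    return "<html>page for {}</html>".format(url)


def fetch_swallowing(url):
    # Same shape as getHTMLText: a bare except hides every error,
    # including the TypeError caused by the misspelled keyword.
    try:
        return fake_get(url, timrout=30)
    except:
        return ""


def fetch_narrow(url):
    # Catch only the failures we expect (assumed here to be OSError);
    # programming errors such as TypeError propagate to the caller.
    try:
        return fake_get(url, timrout=30)
    except OSError:
        return ""


# The bare except returns an empty string and gives no hint of the typo
print(repr(fetch_swallowing("http://example.com")))

# The narrow except lets the TypeError surface, naming the bad keyword
try:
    fetch_narrow("http://example.com")
except TypeError as exc:
    print("TypeError:", exc)
```

Catching only the expected exception type means a typo like this fails loudly on the first run, at the line where it occurs.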
