Fetching the document content
import requests
response = requests.request('get', 'http://www.5173.com/')
# HTTP status code
status_code = response.status_code
# Encoding that requests uses to decode the body
encoding = response.encoding
# Body decoded to text
text = response.text
# Raw body as bytes
content = response.content
print(text)
Output: the HTML source of the page is printed.
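The difference between the decoded text and the raw bytes comes down to the encoding: the text is the bytes decoded with that codec. A minimal sketch of this without a network call, assuming a made-up byte string in place of a real response body (requests falls back to ISO-8859-1 for text responses that declare no charset, which is why the wrong-codec case below uses latin-1):

```python
# Hypothetical raw bytes standing in for a response body (not fetched from the site).
raw = '<title>標題</title>'.encode('utf-8')

# Decoding with the correct codec recovers the original text.
decoded = raw.decode('utf-8')

# Decoding with the wrong codec does not raise for latin-1, but the text is garbled.
wrong = raw.decode('latin-1')

# The garbled text can be repaired by re-encoding and decoding with the right codec.
repaired = wrong.encode('latin-1').decode('utf-8')

print(decoded)
print(wrong == decoded)
print(repaired == decoded)
```

If the printed page looks garbled, setting the encoding on the response before reading its text is the usual fix.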
Extracting page content (XPath)
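import lxml.html
# Parse the HTML text fetched above into an element tree
root = lxml.html.fromstring(text)
# xpath() returns a list of matches; take the first title text node
title = root.xpath('//title/text()')
print(title[0])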
Output: the text of the page's <title> element is printed.
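Since xpath() returns a list, indexing [0] fails when nothing matches. A small sketch that guards against an empty result and also pulls attribute values, assuming a hypothetical HTML snippet in place of the live page:

```python
import lxml.html

# Hypothetical HTML used in place of a live page.
html = """
<html>
  <head><title>Example page</title></head>
  <body>
    <a href="/a">First</a>
    <a href="/b">Second</a>
  </body>
</html>
"""

root = lxml.html.fromstring(html)

# xpath() returns a list; fall back to None when the page has no <title>.
titles = root.xpath('//title/text()')
title = titles[0].strip() if titles else None

# Attribute values can be selected directly with @name.
links = root.xpath('//a/@href')

print(title)
print(links)
```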
There are many ways to extract page content: bs4, re, xpath, …
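For comparison, here is a rough sketch of the re approach for the same title extraction, again assuming a hypothetical HTML string. A regular expression is fine for a simple, well-formed tag like this, but it is not a real HTML parser and breaks easily on messy markup.

```python
import re

# Hypothetical HTML used in place of a live page.
html = '<html><head><title>Example page</title></head><body></body></html>'

# Non-greedy match of everything between <title ...> and </title>.
match = re.search(r'<title[^>]*>(.*?)</title>', html, re.IGNORECASE | re.DOTALL)
title = match.group(1).strip() if match else None

print(title)
```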