Python 3 Crawler, Part 5: Several Ways to Implement a Web Page Downloader [Simulating a CSDN Login with Cookies in Python]

Original

2020-02-20 15:46

(1) Direct request

from urllib import request

# Target URL
url = "http://www.zhihu.com"


# Send the request directly
response = request.urlopen(url)


# Check the status code (200 means success),
# then read the response body (returned as bytes)
if response.getcode() == 200:
    print(response.read())
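
One detail the snippet above does not show: `urlopen` raises `urllib.error.HTTPError` for 4xx/5xx responses, so the `getcode() == 200` check alone never sees a failure. The sketch below wraps the direct request in a small helper that decodes the body and handles errors. The helper name `fetch_text`, the `DemoHandler` class and the local test server are my own illustrative assumptions, added only so the example runs without touching a real site.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import error, request


def fetch_text(url, opener=None, timeout=10):
    """Return the decoded body of url, or None when the request fails."""
    open_url = opener.open if opener is not None else request.urlopen
    try:
        with open_url(url, timeout=timeout) as response:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except error.HTTPError as exc:
        # Raised for 4xx/5xx status codes.
        print("server returned an error status:", exc.code)
    except error.URLError as exc:
        # Raised when the server cannot be reached at all.
        print("could not reach the server:", exc.reason)
    return None


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local page, used only so the sketch runs offline."""

    def do_GET(self):
        if self.path == "/":
            body = "hello, crawler".encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # An empty ProxyHandler keeps the local demo away from any system proxy.
    opener = request.build_opener(request.ProxyHandler({}))
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        return fetch_text(base + "/", opener), fetch_text(base + "/missing", opener)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Against a real site you would call `fetch_text("http://www.zhihu.com")` without the `opener` argument; the local server only stands in for it here.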

(2) Using Request to add data, HTTP headers, and other information

from urllib import request

# Target URL
url = "http://www.zhihu.com"

# Headers to send with the request
header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"}

# Create the Request object
req = request.Request(url, headers=header)

# Send the request
res = request.urlopen(req)

# Check the status code (200 means success),
# then read the response body (returned as bytes)
if res.getcode() == 200:
    print(res.read())
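
The example above only adds a header. As a rough sketch of the "data" half of this section, the snippet below builds a `Request` with form data as well and inspects it without sending anything. The helper name `build_request`, the shortened User-Agent string and the form field `q` are placeholders of mine, not something the target site is known to accept.

```python
from urllib import parse, request

# Shortened placeholder; any browser-like User-Agent string works the same way.
USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"


def build_request(url, form=None):
    """Build a Request with a User-Agent header and optional form data."""
    # data must be bytes, so the urlencoded string is encoded explicitly.
    data = parse.urlencode(form).encode("utf-8") if form else None
    return request.Request(url, data=data, headers={"User-Agent": USER_AGENT})


get_req = build_request("http://www.zhihu.com")
post_req = build_request("http://www.zhihu.com", {"q": "python crawler"})

# Without data the request is a GET; with data urllib switches to POST.
print(get_req.get_method())
print(post_req.get_method())
print(post_req.data)

# Request stores header names capitalized as "User-agent",
# so that is the spelling get_header() expects.
print(post_req.get_header("User-agent"))
```

Passing either object to `request.urlopen(...)` sends it exactly as in the example above.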

(3) Using cookies to simulate logging in to my CSDN blog

import re
import http.cookiejar
import urllib.parse
from urllib import request

# Target URL (the login page)
url = 'https://passport.csdn.net'

# Create a container that stores the cookies
cookie = http.cookiejar.CookieJar()

# Create an opener that saves and resends cookies automatically
opener = request.build_opener(request.HTTPCookieProcessor(cookie))

# Add the HTTP header
opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36')]

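# Fetch the login page and extract the hidden form fields the POST must send back
h = opener.open(url).read().decode("utf8")
pattern1 = re.compile(r'name="lt" value="(.*?)"')
pattern2 = re.compile(r'name="execution" value="(.*?)"')
b1 = pattern1.search(h)
b2 = pattern2.search(h)
if b1 is None or b2 is None:
    raise RuntimeError("hidden fields 'lt' / 'execution' not found on the login page")

# Form data to submit (replace *** with your own account)
post_data = {
    'username': '***',
    'password': '***',
    'lt': b1.group(1),
    'execution': b2.group(1),
    '_eventId': 'submit',
}
post_data = urllib.parse.urlencode(post_data).encode('utf-8')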

# Submit the login form through the cookie-aware opener
res = opener.open(url, post_data)
# text = res.read().decode('utf-8')
# print(text)

# Request a page that requires login; the opener sends the stored cookies along
res2 = opener.open('http://my.csdn.net/my/mycsdn')
text2 = res2.read().decode('utf-8')
print(text2)
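
To try the same flow without a real account, here is a minimal offline sketch: fetch the form, extract a hidden field, POST the credentials, then reuse the cookie. The local server, its `/login` and `/my` paths, the `LT-123` token and the `session=ok` cookie are all made up for illustration and say nothing about how CSDN's login actually behaves.

```python
import re
import threading
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import parse, request

LOGIN_FORM = b'<form><input type="hidden" name="lt" value="LT-123" /></form>'


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a login site."""

    def _send(self, body, cookie=None):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        if cookie:
            self.send_header("Set-Cookie", cookie)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/my":
            # The protected page only greets clients that send the cookie back.
            logged_in = "session=ok" in (self.headers.get("Cookie") or "")
            self._send(b"welcome" if logged_in else b"please log in")
        else:
            self._send(LOGIN_FORM)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        form = parse.parse_qs(self.rfile.read(length).decode("utf-8"))
        if form.get("lt") == ["LT-123"]:
            self._send(b"ok", cookie="session=ok; Path=/")
        else:
            self._send(b"bad token")

    def log_message(self, *args):
        pass  # keep the demo output quiet


def login_and_fetch(base_url):
    jar = CookieJar()
    # The empty ProxyHandler keeps the local demo away from any system proxy.
    opener = request.build_opener(request.ProxyHandler({}),
                                  request.HTTPCookieProcessor(jar))
    html = opener.open(base_url + "/login").read().decode("utf-8")
    match = re.search(r'name="lt" value="(.*?)"', html)
    if match is None:
        raise RuntimeError("hidden field 'lt' not found")
    data = parse.urlencode({"username": "demo", "password": "demo",
                            "lt": match.group(1)}).encode("utf-8")
    opener.open(base_url + "/login", data).read()  # server sets the cookie here
    page = opener.open(base_url + "/my").read().decode("utf-8")
    return page, [c.name for c in jar]


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return login_and_fetch("http://127.0.0.1:%d" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The point of the design is that the same `opener` object is used for all three requests: `HTTPCookieProcessor` stores the cookie from the POST response in the jar and attaches it to the later GET, which is what makes the second page see a logged-in client.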

行者小朱
