Web Scraping Study Notes (18): Simulated Login, 2020.5.22

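Preface

This section covers simulated login.

Differences between cookies and sessions:

  • Cookie data is stored in the client's browser; session data is stored on the server.
  • Cookies are not very secure: anyone with access to the locally stored cookies can analyze them and use them to forge requests (cookie spoofing). Where security matters, prefer sessions.
  • Sessions increase the load on the server, because it has to keep state for each active user.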
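The split between the two can be sketched with a toy model. This is only an illustration: `SESSION_STORE`, `login` and `current_user` are hypothetical names, and a real site would keep sessions in a database or cache rather than a dict. The point is that the browser holds nothing but an opaque ID, while the actual user data stays on the server.

```python
import secrets

# Hypothetical in-memory session store standing in for the server side.
SESSION_STORE = {}

def login(username):
    """Create a server-side session and return the cookie the browser would keep."""
    session_id = secrets.token_hex(16)
    SESSION_STORE[session_id] = {"user": username}  # stays on the server
    return {"session_id": session_id}               # sent to the client as a cookie

def current_user(cookies):
    """Look up the user from the session ID carried in the request cookies."""
    session = SESSION_STORE.get(cookies.get("session_id"))
    return session["user"] if session else None

browser_cookies = login("germey")
print(current_user(browser_cookies))           # germey
print(current_user({"session_id": "forged"}))  # None
```

A made-up session ID finds nothing in the store, which is why guessing the cookie is much harder than editing a cookie that carries the data itself.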

1. POST requests

import requests
data = {'name': 'germey', 'age': '22'}
r = requests.post("http://httpbin.org/post", data=data)
print(r.text)
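When `data` is a dict, requests sends it as a form-encoded body. As a rough offline preview of what that body looks like, the standard library's `urlencode` can be applied to the same dict; this is a sketch for inspection only and sends nothing over the network.

```python
from urllib.parse import urlencode, parse_qs

data = {'name': 'germey', 'age': '22'}

# Form-encode the dict the way an HTML form submission would be encoded.
body = urlencode(data)
print(body)            # name=germey&age=22

# Decoding the body gives back the fields (each value comes back as a list).
print(parse_qs(body))  # {'name': ['germey'], 'age': ['22']}
```

The `form` field in the httpbin.org response should echo the same two fields.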

2. Simulated login to Mafengwo

import requests
from lxml import etree

# A Session keeps the cookies set by the login response and sends them on later requests.
session = requests.Session()
phone_number = input('Phone number: ')
password = input('Password: ')
data = {'passport': phone_number, 'password': password}
header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
# Submit the login form.
r = session.post("https://passport.mafengwo.cn/login/", headers=header, data=data)
print(r.status_code)
# Request a page that requires login, through the same session.
logined_url = 'http://www.mafengwo.cn/friend/index/follow?uid=70360114'
response = session.get(logined_url, headers=header)
print(response.status_code)
# Extract the friend names from the page.
tree = etree.HTML(response.text)
friends = tree.xpath('//div[@class="name"]/a/text()')
print(friends)
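Why the second request has to go through the same session can be checked offline. The sketch below prepares two requests without sending them: one through a `Session` that already holds a cookie, one on its own. The cookie value `abc123` is a made-up stand-in for whatever the real login response would set.

```python
import requests

url = 'http://www.mafengwo.cn/friend/index/follow?uid=70360114'

session = requests.Session()
# Stand-in for a cookie that a real login response would have set (hypothetical value).
session.cookies.set('PHPSESSID', 'abc123')

# Prepared through the session: the stored cookie is attached automatically.
with_session = session.prepare_request(requests.Request('GET', url))
# Prepared on its own: nothing knows about the login cookie.
without_session = requests.Request('GET', url).prepare()

print(with_session.headers.get('Cookie'))     # PHPSESSID=abc123
print(without_session.headers.get('Cookie'))  # None
```

Nothing is transmitted here; the requests are only built so their headers can be inspected.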

3. Logging in to Mafengwo with cookies

import requests
from lxml import etree

session = requests.Session()
# Placeholders: fill in your own account details.
phone_number = 'YOUR_PHONE_NUMBER'
password = 'YOUR_PASSWORD'
data = {'passport': phone_number, 'password': password}
header = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
# Log in through the session so that it collects the login cookies.
r = session.post("https://passport.mafengwo.cn/login/", headers=header, data=data)
cookies = session.cookies
# print(cookies)
print(cookies.get_dict())
logined_url = 'http://www.mafengwo.cn/friend/index/follow?uid=70360114'
# Use a plain requests.get this time and pass the cookies explicitly.
response = requests.get(logined_url, headers=header, cookies=cookies)
print(response.status_code)
tree = etree.HTML(response.text)
friends = tree.xpath('//div[@class="name"]/a/text()')
print(friends)
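Since `cookies.get_dict()` returns a plain name-to-value dict, one possible follow-up is to save it to disk and reuse it in a later run instead of logging in again. The helper names `save_cookies` and `load_cookies`, the file name, and the sample values below are assumptions made for illustration; note that a plain dict does not keep each cookie's domain, path or expiry.

```python
import json
import os
import tempfile

def save_cookies(cookie_dict, path):
    """Write a name-to-value cookie dict to a JSON file."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(cookie_dict, f)

def load_cookies(path):
    """Read the cookie dict back from the JSON file."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)

# Stand-in for session.cookies.get_dict() after a real login (hypothetical values).
cookie_dict = {'PHPSESSID': 'abc123', 'uol_throttle': '70360114'}
path = os.path.join(tempfile.gettempdir(), 'mafengwo_cookies.json')

save_cookies(cookie_dict, path)
restored = load_cookies(path)
print(restored == cookie_dict)  # True
```

The restored dict can be passed as `cookies=restored` to `requests.get`, in the same way the cookie jar is passed above. It only works for as long as the server still accepts the saved session.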

4. Logging in to Mafengwo with hand-built cookies

import requests
from lxml import etree
# Cookie header string copied from a logged-in browser session.
cookie_str = 'mfw_uuid=5bcfcc20-b235-fbbe-c1d6-ae01e1f68d82; _r=baidu; _rp=a%3A2%3A%7Bs%3A1%3A%22p%22%3Bs%3A19%3A%22www.baidu.com%2Fbaidu%22%3Bs%3A1%3A%22t%22%3Bi%3A1540344864%3B%7D; __mfwlv=1544535825; __mfwvn=6; __mfwlt=1544537333; uva=s%3A190%3A%22a%3A4%3A%7Bs%3A13%3A%22host_pre_time%22%3Bs%3A10%3A%222018-10-24%22%3Bs%3A2%3A%22lt%22%3Bi%3A1540344865%3Bs%3A10%3A%22last_refer%22%3Bs%3A64%3A%22https%3A%2F%2Fwww.baidu.com%2Fbaidu%3Ftn%3Dmonline_3_dg%26ie%3Dutf-8%26wd%3Dmafengwo%22%3Bs%3A5%3A%22rhost%22%3Bs%3A13%3A%22www.baidu.com%22%3B%7D%22%3B; __mfwurd=a%3A3%3A%7Bs%3A6%3A%22f_time%22%3Bi%3A1540344865%3Bs%3A9%3A%22f_rdomain%22%3Bs%3A13%3A%22www.baidu.com%22%3Bs%3A6%3A%22f_host%22%3Bs%3A3%3A%22www%22%3B%7D; __mfwuuid=5bcfcc20-b235-fbbe-c1d6-ae01e1f68d82; UM_distinctid=166adc88c9b1a0-0e2508b4f33a838-4a506a-1fa400-166adc88c9c6e7; CNZZDATA30065558=cnzz_eid%3D2019380672-1540512335-null%26ntime%3D1540512335; oad_n=a%3A3%3A%7Bs%3A3%3A%22oid%22%3Bi%3A1029%3Bs%3A2%3A%22dm%22%3Bs%3A20%3A%22passport.mafengwo.cn%22%3Bs%3A2%3A%22ft%22%3Bs%3A19%3A%222018-12-11+21%3A43%3A36%22%3B%7D; __today_login=1; uol_throttle=70360114; PHPSESSID=sou2c85ea8lhrirrq8dmaflhf3'
# Split the string into "key=value" pairs.
str_list = cookie_str.split(';')
print(str_list)
cookies = {}
for item in str_list:
    # print(item)
    # Split on the first '=' only, so a value that itself contains '=' stays intact.
    key, value = item.split('=', 1)
    cookies[key.strip()] = value.strip()
print(cookies)
header = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                      '(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
    }
logined_url = 'http://www.mafengwo.cn/friend/index/follow?uid=70360114'
response = requests.get(logined_url, headers=header, cookies=cookies)
print(response.status_code)
tree = etree.HTML(response.text)
friends = tree.xpath('//div[@class="name"]/a/text()')
print(friends)
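As an alternative to splitting the string by hand, the standard library's `http.cookies.SimpleCookie` can parse a Cookie header string. The sketch below uses a short made-up string rather than the full one above; `parse_cookie_header` is a hypothetical helper name. `SimpleCookie` can be stricter than a browser about unusual characters, so it is worth checking that every expected key came through.

```python
from http.cookies import SimpleCookie

def parse_cookie_header(header_value):
    """Parse a 'k1=v1; k2=v2' Cookie header string into a plain dict."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}

# Shortened, made-up example string in the same format as the one above.
cookies = parse_cookie_header('__today_login=1; uol_throttle=70360114; PHPSESSID=abc123')
print(cookies)
```

The resulting dict has the same shape as the hand-built one and can be passed to `requests.get` through the `cookies` argument.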

Conclusion

Try simulating a login yourself.
For logins that involve a CAPTCHA, see:
CAPTCHA login
