python crawler - Simulating a form login with Session and downloading the logged-in user's avatar (demo)

Original post

2020-07-02 04:02

Site to log in to: https://www.1point3acres.com/bbs/
Find the action attribute of the login form to see the URL the form is submitted to:
https://www.1point3acres.com/bbs/member.php?mod=logging&action=login&loginsubmit=yes&infloat=yes&lssubmit=yes&inajax=1
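
To see what the action URL carries, its query string can be split into individual parameters. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original script.

```python
from urllib.parse import urlsplit, parse_qsl

login_url = (
    'https://www.1point3acres.com/bbs/member.php'
    '?mod=logging&action=login&loginsubmit=yes'
    '&infloat=yes&lssubmit=yes&inajax=1'
)

parts = urlsplit(login_url)
# parse_qsl returns the parameters in order as (name, value) pairs
login_params = dict(parse_qsl(parts.query))

print(parts.path)  # /bbs/member.php
for name, value in login_params.items():
    print(name, '=', value)
```

With requests, the same parameters could also be passed through the `params=` argument together with the bare `member.php` URL instead of being written into the URL string.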

After logging in, inspect the submitted form data and use it as the request parameters:

Finally, locate the avatar in the page:

Use BeautifulSoup to find the div first, then look inside it for the img and read its src attribute.

import requests
from bs4 import BeautifulSoup

header = {
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36'
}

form_data = {
    'username' : 'dave_lzw2020',
    'password' : "Password123456.",
    'quickforward' : 'yes',
    'handlekey' : 'ls'
}

session = requests.Session()

html = session.post(
    'https://www.1point3acres.com/bbs/member.php?mod=logging&action=login&loginsubmit=yes&infloat=yes&lssubmit=yes&inajax=1',
    headers=header,
    data=form_data
)

# print(html.text)

resp = session.get('https://www.1point3acres.com/bbs/',headers=header).text

# print(resp)

ht = BeautifulSoup(resp,'lxml')

div_node = ht.find('div',{'class':'avt y'})

print(div_node)
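if div_node is None:
    # The avatar div is missing, which in this demo suggests the login did not succeed
    raise SystemExit('avatar div not found; check the login response')

# Collect the src of every img inside the div; find_all only returns tags, so text nodes are skipped
img_src = [img['src'] for img in div_node.find_all('img') if img.get('src')]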

print(img_src)

for src in img_src:
    img_content = session.get(src,headers=header,verify=False).content
    # Drop the scheme by splitting on '://'; lstrip('https://') strips a character set, not a prefix
    src = src.split('://', 1)[-1].replace('/', '-')
    print(src)
    with open('{src}.jpg'.format_map(vars()) , 'wb+') as f :
        f.write(img_content)

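# vars(): returns a dict of an object's attributes and their values; called with no argument, it returns the names and values at the current call site, similar to locals()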
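
The file name for each image is derived from its URL. Here is a sketch of doing that with `urllib.parse` rather than string stripping, since `str.lstrip` treats its argument as a set of characters and can eat the start of the host name. `url_to_filename` and the example URL are hypothetical, chosen only for illustration.

```python
import posixpath
from urllib.parse import urlsplit


def url_to_filename(url, default_ext='.jpg'):
    """Build a flat file name from an image URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Join host and path, then replace path separators so the result is one name
    name = (parts.netloc + parts.path).replace('/', '-')
    # Append an extension only when the URL path does not already have one
    if not posixpath.splitext(parts.path)[1]:
        name += default_ext
    return name


example = 'https://static.example.com/avatar/12/34_avatar_middle.jpg'
print(url_to_filename(example))    # static.example.com-avatar-12-34_avatar_middle.jpg
print(example.lstrip('https://'))  # atic.example.com/avatar/12/34_avatar_middle.jpg
```

The second print shows the pitfall: the leading "st" of the host name is removed because both letters occur in 'https://'.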

Errors and notes:

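1. Make sure form_data is filled in correctly. Otherwise the login fails, and every request to the user page returns
"Access denied | www.1point3acres.com used Cloudflare to restrict", which sent me looking for ways to bypass Cloudflare.
Only after printing the page returned by the POST did I find that the password was wrong and the login had never succeeded.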
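
The lesson from note 1 can be turned into an explicit check that runs before any parsing. This is a minimal sketch: `login_problem` is a hypothetical helper, and it assumes that the 'avt y' avatar class appears only on pages served to a logged-in user, which this post does not verify beyond its own demo.

```python
def login_problem(post_body, home_body, avatar_marker='avt y'):
    """Return a short diagnosis string, or None if the session looks logged in.

    avatar_marker is an assumption: the class of the avatar div used in this
    demo, taken to be present only when the user is logged in.
    """
    if avatar_marker in home_body:
        return None
    # Show the start of the POST response, where the real reason was found in note 1
    hint = post_body[:200]
    if 'Access denied' in home_body:
        return 'access-denied page returned; login response was: ' + hint
    return 'avatar not found; login response was: ' + hint


# Usage with canned strings instead of real responses
print(login_problem('wrong password', 'Access denied | example'))
print(login_problem('ok', '<div class="avt y"><img src="a.jpg"></div>'))  # None
```

In the script above this would be called as `login_problem(html.text, resp)` right after the two requests.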

2. Error: [SSL: CERTIFICATE_VERIFY_FAILED]. Adding verify=False to the get call, which skips certificate verification for that request, works around it:
img_content = session.get(src,headers=header,verify=False).content
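
Because verify=False turns off certificate checking, one option is to attempt the verified request first and fall back only when it fails. This is a sketch under that assumption; `fetch_bytes` is a hypothetical helper, not part of requests.

```python
import requests


def fetch_bytes(session, url, **kwargs):
    """Fetch url, retrying without certificate verification only on an SSL error."""
    try:
        return session.get(url, **kwargs).content
    except requests.exceptions.SSLError:
        # Certificate check failed: retry this single request with verify=False
        return session.get(url, **{**kwargs, 'verify': False}).content
```

In the download loop this would be used as `img_content = fetch_bytes(session, src, headers=header)`.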
