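When it comes to pretty girls, the first place that comes to mind is a dating site, where they gather in droves. So today's material is a dating site, and we are going to scrape its profile photos.
1. Preparation
First you need an account on a dating site. I picked “我主良緣” here; just register and log in:
After logging in, the page looks roughly like the above. Fill in some filter criteria and click 搜緣分 (search), and you get the results we want. Our goal, though, is to scrape the photos in those results, so right-click -> Inspect -> Network, then click 搜緣分 again. A new request shows up in the list:
Click it and look at what is in its headers:
We can ignore everything else and go straight to the request URL. It is an API, and, haha, it is exactly the photo API we are after. The API is: http://www.7799520.com/api/user/pc/list/search?startage=21&endage=30&gender=2&startheight=151&endheight=160&marry=1&salary=2&page=1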
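If you would rather not pick the query string apart by eye, the standard library can do it. The snippet below is only a sketch: it splits the URL quoted above into the endpoint and its parameters, and sends no request. The variable names (`api_url`, `endpoint`, `params`) are my own and do not come from the site.

```python
from urllib.parse import urlsplit, parse_qsl

# The search URL copied from the browser's Network tab
api_url = (
    "http://www.7799520.com/api/user/pc/list/search"
    "?startage=21&endage=30&gender=2&startheight=151"
    "&endheight=160&marry=1&salary=2&page=1"
)

parts = urlsplit(api_url)
# Everything before the "?" is the endpoint
endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
# Everything after the "?" becomes a name -> value dict (values stay strings)
params = dict(parse_qsl(parts.query))

print(endpoint)
for name, value in params.items():
    print(f"{name} = {value}")
```

Note that `parse_qsl` returns every value as a string, so `params["page"]` is `"1"`, not `1`.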
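2. Parsing the API
We can read the API's parameters straight from the URL. The main ones are listed below (the URL also carries a marry parameter, which the table leaves out):
Parameter | Type |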
---|---|
startage | int |
endage | int |
gender | int |
startheight | int |
endheight | int |
salary | int |
page | int |
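Which of these parameters are required and which are optional is something you can work out by trial and error, and the same goes for the values each one accepts. Since I am fairly young, today's test covers women aged 21-30 with a height of 151-160. Feel free to change these to suit your own taste 😄.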
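To make that trial and error less tedious, you can build the URL from a dict instead of editing the string by hand. This is a rough sketch, not anything the site documents: `build_search_url` and `without_param` are hypothetical helpers I made up, and the default values are simply the ones from the URL above. It only builds URLs; paste the output into a browser or pass it to `requests.get` to see how the API reacts.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.7799520.com/api/user/pc/list/search"

# Values taken from the URL captured in the browser
DEFAULT_PARAMS = {
    "startage": 21, "endage": 30, "gender": 2,
    "startheight": 151, "endheight": 160,
    "marry": 1, "salary": 2, "page": 1,
}

def build_search_url(**overrides):
    """Return the search URL with any parameter overridden (hypothetical helper)."""
    params = {**DEFAULT_PARAMS, **overrides}
    return f"{BASE_URL}?{urlencode(params)}"

def without_param(name):
    """Return the search URL with one parameter left out, to test if it is required."""
    params = {k: v for k, v in DEFAULT_PARAMS.items() if k != name}
    return f"{BASE_URL}?{urlencode(params)}"

print(build_search_url(page=2))   # same filters, second page
print(without_param("salary"))    # does the API still answer without salary?
```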
3. Analyzing the JSON data
Testing shows that the API above returns JSON. The response looks like this:
{ "data": { "list": [ { "avatar": "http://img.7799520.com/2019-11-27-1574867191-MXUdY0Fc.png", "birthdayyear": "1994", "city": "上海", "education": "初中", "gender": "2", "height": "159", "marry": "未婚", "monolog": "願得一人心,白首不相離", "monologflag": "1", "province": "上海", "salary": "5千-1萬", "userid": "3018330", "username": "單身笑山嵐" }, { "avatar": "http://img.7799520.com/FhTV65n3mQ-X-PjfR3W9OpsFs5SO", "birthdayyear": "1991", "city": "北京", "education": "本科", "gender": "2", "height": "160", "marry": "未婚", "monolog": "土生土長北京人一枚,91年底小天蠍~lxt1103程序猿,高薪資,沒房有車小有存款~胖胖噠還不高,唉:-(喜愛旅遊,美食,旅遊喫美食~想找個喜歡運動的小哥哥陪我減肥,或者不介意胖姑娘的男生哦~男孩子最好也是北京的,這樣共同話題多,不能離北京太遠了,趕春運也很痛苦的希望你是個逗比或者心思靈巧的藍孩紙,在一起開心快樂聊得來就很幸福了", "monologflag": "-1", "province": "北京", "salary": "2萬-5萬", "userid": "3018171", "username": "桐桐桐桐桐" }, { "avatar": "http://img.7799520.com/00d0ba6e-5807-44fd-88af-eb379b325835", "birthdayyear": "1991", "city": "深圳", "education": "高中", "gender": "2", "height": "155", "marry": "未婚", "monolog": "如果真心實意可以加微信you02457本人對年齡要求30--35", "monologflag": "-1", "province": "廣東", "salary": "1萬-2萬", "userid": "3017206", "username": "(坦誠相待)" }, { "avatar": "http://img.7799520.com/2019-11-27-1574817016-6JBhbUyU.png", "birthdayyear": "1989", "city": "西安", "education": "大專", "gender": "2", "height": "160", "marry": "未婚", "monolog": "再晚也要嫁給愛情", "monologflag": "-2", "province": "陝西", "salary": "2千-5千", "userid": "3015509", "username": "Best媛" }, { "avatar": "http://img.7799520.com/0e1ed4fa3b5ca22ed120bf08a452506b53c0da49-2019-11-27-15748275951574827595051-hSw85JrS.png", "birthdayyear": "1995", "city": "上海", "education": "碩士", "gender": "2", "height": "155", "marry": "未婚", "monolog": "這個真的不知道咋寫哇......爹媽每天催婚.....算是獨白嗎...", "monologflag": "1", "province": "上海", "salary": "2萬-5萬", "userid": "3014896", "username": "。。。123" }, { "avatar": "http://img.7799520.com/f9e573e4-728a-4a05-8abd-9688c6d1c156", "birthdayyear": "1997", "city": "寧波", "education": "初中", "gender": "2", "height": "160", "marry": "未婚", "monolog": 
"願得一人心,白首不分離,15058276626", "monologflag": "-1", "province": "浙江", "salary": "2千-5千", "userid": "3014476", "username": "季節嬌氣" }, { "avatar": "http://img.7799520.com/8c328b6a-f34a-4d91-a869-10f6e47627e9", "birthdayyear": "1992", "city": "深圳", "education": "初中", "gender": "2", "height": "158", "marry": "未婚", "monolog": "願得一人心,白首不分離我微信號chen123456qing", "monologflag": "-1", "province": "廣東", "salary": "5千-1萬", "userid": "3013067", "username": "音響回眸勤奮" }, { "avatar": "http://img.7799520.com/9f74fb99444547a1408575c346008f22ac4bb1f7-2019-11-25-15746785901574678589876-kHZrSfnc.png", "birthdayyear": "1992", "city": "濟南", "education": "大專", "gender": "2", "height": "160", "marry": "未婚", "monolog": "也許我很平凡,但是我絕不缺乏生活的熱情和生命的夢想,也許我會孤單,但是我會一路找尋你的蹤跡。遇見你,將是我生命中最絢爛的時刻。", "monologflag": "1", "province": "山東", "salary": "5千-1萬", "userid": "3009076", "username": "驕傲的貓大王" }, { "avatar": "http://img.7799520.com/7da0c781-3115-467f-9fcc-d46d2aa1bb4a", "birthdayyear": "1994", "city": "國外", "education": "高中", "gender": "2", "height": "155", "marry": "未婚", "monolog": "我有一壺酒,足以慰風塵", "monologflag": "1", "province": "國外", "salary": "2千-5千", "userid": "3007139", "username": "墨染." 
}, { "avatar": "http://img.7799520.com/2019-11-24-1574575893-JYE0Y9nz.png", "birthdayyear": "1994", "city": "北海", "education": "大專", "gender": "2", "height": "157", "marry": "未婚", "monolog": "願得一人心,白首不相離,非會員哦,所以很多信息都看不到呢,抱歉", "monologflag": "1", "province": "廣西", "salary": "5千-1萬", "userid": "3006914", "username": "蔓鯨" }, { "avatar": "http://img.7799520.com/2019-11-24-1574565615-2p6Q37YC.png", "birthdayyear": "1995", "city": "廣州", "education": "本科", "gender": "2", "height": "160", "marry": "未婚", "monolog": "如果在一起是因爲合適,那希望是合適一輩子。", "monologflag": "1", "province": "廣東", "salary": "5千-1萬", "userid": "3006237", "username": "長頸鹿向淡淡" }, { "avatar": "http://img.7799520.com/4c69af45f1f9763bc33b7322cd025c90157a93b9-2019-11-23-15745152791574515278714-5F2a7dhi.png", "birthdayyear": "1997", "city": "上海", "education": "大專", "gender": "2", "height": "158", "marry": "未婚", "monolog": "好看的皮囊千篇一律,有趣的靈魂萬里挑一。。。", "monologflag": "1", "province": "上海", "salary": "1萬-2萬", "userid": "3004596", "username": "solely" }, { "avatar": "http://img.7799520.com/aaf297dd-af30-48de-8027-5c7e57ec2cdc", "birthdayyear": "1993", "city": "深圳", "education": "高中", "gender": "2", "height": "155", "marry": "未婚", "monolog": "在現在快節奏的社會,忙碌的工作之餘,希望有個知心人陪伴,偶爾逛街,看電影喫飯,一起旅遊,運動,分享彼此的喜怒哀樂,希望相互欣賞,包容,理解。我認爲最好的愛情莫過於爲彼此成爲最好的自己,成爲最默契的搭檔,一起發現這個世界的美好。", "monologflag": "1", "province": "廣東", "salary": "5千-1萬", "userid": "3003499", "username": "一木木" }, { "avatar": "http://img.7799520.com/2019-11-22-1574436265-oOHCA0Pi.png", "birthdayyear": "1991", "city": "上海", "education": "高中", "gender": "2", "height": "153", "marry": "未婚", "monolog": "愛喫西瓜的跳舞女少年?", "monologflag": "-2", "province": "上海", "salary": "5千-1萬", "userid": "3001594", "username": "西瓜西瓜瓜" }, { "avatar": "http://img.7799520.com/6351f7c2-734d-484f-95ae-7881b3b65132", "birthdayyear": "1996", "city": "南昌", "education": "中專", "gender": "2", "height": "158", "marry": "未婚", "monolog": "事事有迴應,漸漸有着落", "monologflag": "1", "province": "江西", "salary": "2千-5千", "userid": 
"2999190", "username": "977" }, { "avatar": "http://img.7799520.com/bc692905b97d0deeb6df0f73356d3de82b1d6261-2019-11-23-15745076641574507664470-zV4AFL8O.png", "birthdayyear": "1990", "city": "成都", "education": "大專", "gender": "2", "height": "156", "marry": "未婚", "monolog": "在成都的東北人!照片是很多年前的了。不喜歡拍照所以沒有現在的照片!我身高155體重42公斤。不喜歡:屬羊的男生,最好不抽菸不喝酒!我屬蛇天蠍座♏️", "monologflag": "1", "province": "四川", "salary": "2千-5千", "userid": "2998289", "username": "水壺苦戀無語" }, { "avatar": "http://img.7799520.com/7384954e-5c0d-4a5c-92c6-4493ba1be3d4", "birthdayyear": "1995", "city": "蘇州", "education": "大專", "gender": "2", "height": "160", "marry": "未婚", "monolog": "嗨 你好 能帶給我一份超大杯快樂嘛", "monologflag": "1", "province": "江蘇", "salary": "2千-5千", "userid": "2991868", "username": "小呀麼小靜靜" }, { "avatar": "http://img.7799520.com/FlUJTeR0REKbLhtoR5RNVeuOXRy1", "birthdayyear": "1992", "city": "蘇州", "education": "大專", "gender": "2", "height": "160", "marry": "未婚", "monolog": "愛好看動漫和小說,比較宅,做事喜歡有計劃,喜歡獨處,自在。理想伴侶就是要有穩定的工作。。。。", "monologflag": "1", "province": "江蘇", "salary": "2千-5千", "userid": "2989769", "username": "青一木" }, { "avatar": "http://img.7799520.com/edbb6516-2b07-401e-b56e-6aee6c2620ca", "birthdayyear": "1994", "city": "巴中", "education": "高中", "gender": "2", "height": "156", "marry": "未婚", "monolog": "我是找對象的,感覺我還行的,可以加JC718829", "monologflag": "-1", "province": "四川", "salary": "5千-1萬", "userid": "2989629", "username": "星願回首悲涼" }, { "avatar": "http://img.7799520.com/e648d317faacffb4f03b1ca31fdbed2b4c6ec5e4-2019-11-25-15746776401574677640473-C6Z1QX0K.png", "birthdayyear": "1990", "city": "深圳", "education": "初中", "gender": "2", "height": "159", "marry": "未婚", "monolog": "願得一人心,白首不相離", "monologflag": "1", "province": "廣東", "salary": "2千-5千", "userid": "2988102", "username": "蘭瑪珊蒂" } ], "num": 20, "page": 1 }, "error_code": 0}
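We can work through this structure to pull out what we need: the user records sit in the array at data.list, and each record's avatar field holds the photo URL.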
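Here is a small sketch of walking that structure. The sample payload is a made-up, cut-down stand-in with placeholder values (not real site data), and `extract_avatars` is a hypothetical helper. Treating any non-zero `error_code` as a failure is my assumption; the response above only shows that a successful call returns 0.

```python
import json

# Made-up, trimmed-down payload with the same shape as the real response
sample_response = """
{
  "data": {
    "list": [
      {"avatar": "http://img.example.com/a.png", "username": "user_a"},
      {"avatar": "http://img.example.com/b.png", "username": "user_b"}
    ],
    "num": 2,
    "page": 1
  },
  "error_code": 0
}
"""

def extract_avatars(payload):
    """Return (username, avatar URL) pairs from a parsed search response."""
    # Assumption: anything other than error_code == 0 means the call failed
    if payload.get("error_code") != 0:
        return []
    users = payload.get("data", {}).get("list", [])
    # Skip records that have no avatar at all
    return [(u["username"], u["avatar"]) for u in users if u.get("avatar")]

payload = json.loads(sample_response)
for username, avatar in extract_avatars(payload):
    print(username, avatar)
```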
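4. Code walkthrough
Anyone who has written a crawler will tell you that scraping with Python is very simple. As the title says, it takes just 10 lines of code: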
import requests  # import the requests package
save_dir = 'C:/Users/zaxwz/Desktop/xqImg/'  # folder where the images are saved (must already exist)
# API URL; the page value is left off the end so it can be appended per page
url = 'http://www.7799520.com/api/user/pc/list/search?startage=21&endage=30&gender=2&startheight=151&endheight=160&marry=1&salary=3&page='
# loop over 40 pages of results
for i in range(40):
    # the response is JSON, so parse it straight into a dict
    jsonData = requests.get(url + str(i + 1)).json()
    # jsonData['data']['list'] is the list of users
    for j in jsonData['data']['list']:
        # j['avatar'] is the image URL
        imgUrl = j['avatar']
        # request the image
        resp = requests.get(imgUrl)
        # create the image file and write the response bytes to it; "with" closes the file
        with open(save_dir + j['username'] + '.jpg', 'wb') as img:
            img.write(resp.content)
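One practical caveat about the file name: a username can contain characters that Windows does not allow in file names, and the sample response shows that some avatars are .png files while others have no extension at all. The helper below is an optional sketch of one way to handle that; `avatar_filename` is a hypothetical function of mine and is not part of the 10-line script. It only computes a name and touches neither the network nor the disk.

```python
import os
import re
from urllib.parse import urlsplit

KNOWN_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}

def avatar_filename(username, avatar_url, default_ext=".jpg"):
    """Build a file name that is safe on Windows from a username and image URL."""
    # Replace characters Windows forbids in file names, plus control characters
    safe_name = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", username)
    # Windows also rejects names that end in a space or a dot
    safe_name = safe_name.strip(" .")
    if not safe_name:
        safe_name = "unnamed"
    # Reuse the extension from the URL path when it is a known image type
    ext = os.path.splitext(urlsplit(avatar_url).path)[1].lower()
    if ext not in KNOWN_EXTENSIONS:
        ext = default_ext
    return safe_name + ext

print(avatar_filename("a/b?", "http://img.example.com/x.png"))
print(avatar_filename("name.", "http://img.example.com/no-extension"))
```

In the script you would pass `j['username']` and `j['avatar']` to it in place of `j['username'] + '.jpg'`. Two users with the same username would still overwrite each other; appending `j['userid']` to the name is one way around that.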
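With that, scraping the photos is done; strip the comments from the short script and it is exactly 10 lines of code. The scraped images look like this: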