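Python crawler: overall approach to scraping Baidu Map migration data:
1. Get the mapping between Baidu Map city codes and city names
2. Pick the dates you want to scrape
3. Send the requests with urllib's request module
4. Write the responses to JSON files
Unfortunately, Baidu Map only gives each city's inbound and outbound shares (percentages), not actual head counts.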
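Steps 1 to 3 can be sketched on their own before looking at the full script. This is only a minimal sketch: the endpoint and query parameters are the ones used in the script in this post, but the helper names (`parse_city_lines`, `dates_between`, `build_url`) and the sample city lines are made up for illustration, and `dates_between` is a stand-in for my `getBetweenDay` helper, not its actual implementation.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE = "http://huiyan.baidu.com/migration/"
CALLBACK = "jsonp_1584195671576_1286958"  # callback name copied from the script


def parse_city_lines(lines):
    """Split 'cityId,cityName' lines into (id, name) tuples, skipping blank lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if line:
            city_id, city_name = line.split(",", 1)
            pairs.append((city_id, city_name))
    return pairs


def dates_between(start, end):
    """Return YYYYMMDD strings for every day from start to end, inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=k)).strftime("%Y%m%d") for k in range(days + 1)]


def build_url(endpoint, city_id, move_type, day):
    """Build one request URL; endpoint is 'cityrank' or 'provincerank'."""
    query = urlencode({
        "dt": "city",
        "id": city_id,
        "type": move_type,  # 'move_in' or 'move_out'
        "date": day,
        "callback": CALLBACK,
    })
    return BASE + endpoint + ".jsonp?" + query


if __name__ == "__main__":
    # Placeholder lines for illustration; the real file has one city per line
    cities = parse_city_lines(["110000,CityA", "310000,CityB"])
    for day in dates_between(date(2020, 4, 1), date(2020, 4, 2)):
        for city_id, city_name in cities:
            print(city_name, build_url("cityrank", city_id, "move_in", day))
```

Using `urlencode` instead of string concatenation keeps the query string correctly escaped if a parameter ever contains special characters.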
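import os
import time
from datetime import datetime
from urllib import request

from utils.read_write import readTXT, writeOneJSON
from utils.time_change import getBetweenDay

# Working directory where the JSON files are saved
os.chdir(r'D:\data\據\json_328_0602\\')
# Column headers (inbound city, current city, share, outbound city, current city, share,
# then the same for inbound and outbound provinces)
row0 = [u'遷入城市',u'所在城市',u'佔比',u'遷出城市',u'所在城市',u'佔比',u'遷入省份',u'所在城市',u'佔比',u'遷出省份',u'所在城市',u'佔比']
# Read the txt file into a list of strings, one "cityId,cityName" line per city
lines = readTXT(r'D:\project\jianguiyuan\data\BaiduMap_cityCode_1102.txt')

MAX_RETRIES = 3  # attempts per URL before giving up


# Send the request and return the raw response bytes.
# i (city index) and riqi (date) are only used for logging.
def requerts_url(url, i, riqi):
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return request.urlopen(url, timeout=30).read()
        except Exception as err:
            # Log when and where the request failed
            print(datetime.now(), i, riqi, url, err)
            if attempt == MAX_RETRIES:
                raise
            time.sleep(5)  # pause before retrying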


# Download the data and save it as JSON files
def city_range(n, riqi):
    base = "http://huiyan.baidu.com/migration/"
    callback = "jsonp_1584195671576_1286958"
    # (endpoint, move type, file-name prefix):
    # city inbound, city outbound, province inbound, province outbound
    targets = [
        ("cityrank", "move_in", "城市遷入_"),
        ("cityrank", "move_out", "城市遷出_"),
        ("provincerank", "move_in", "省份遷入_"),
        ("provincerank", "move_out", "省份遷出_"),
    ]
    for i in range(n, 327):
        print(i)
        # Split the line into city id and city name
        obj = lines[i].split(',')
        print(obj[1])
        for endpoint, move_type, prefix in targets:
            url = (base + endpoint + ".jsonp?dt=city&id=" + obj[0]
                   + "&type=" + move_type + "&date=" + riqi
                   + "&callback=" + callback)
            data = requerts_url(url, i, riqi)
            # The response is bytes; decode the \uXXXX escapes into Chinese text
            data = data.decode("unicode_escape")
            # Write the JSON file
            writeOneJSON(data, prefix + obj[1] + "_" + riqi + ".json")


def date_change(date):
    date_list = getBetweenDay(date)
    for riqi in date_list:
        # The API takes dates as YYYYMMDD, so drop the hyphens
        riqi = riqi.replace('-', '')
        print(riqi)
        city_range(1, riqi)
    print("Winner winner, chicken dinner!")


if __name__ == '__main__':
    date_change('2020-04-01')
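Because every URL passes a `callback` parameter, the response body presumably comes back wrapped as `jsonp_...( ... )` rather than as bare JSON. Here is a minimal sketch for unwrapping such a payload before further processing. The helper name `strip_jsonp` and the sample payload are invented for illustration, and the real field names may differ.

```python
import json
import re

# Matches callback_name( ... ) with an optional trailing semicolon
_JSONP_RE = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def strip_jsonp(text):
    """Return the parsed JSON value inside a JSONP wrapper."""
    match = _JSONP_RE.match(text)
    if match is None:
        raise ValueError("not a JSONP payload")
    return json.loads(match.group(1))


if __name__ == "__main__":
    # Invented sample payload, not an actual Baidu response
    sample = 'jsonp_1584195671576_1286958({"errno": 0, "data": {"list": [{"city_name": "A", "value": 12.5}]}})'
    payload = strip_jsonp(sample)
    print(payload["data"]["list"][0]["value"])
```

One nice side effect: `json.loads` already turns `\uXXXX` escapes into the actual characters, so if you parse the raw response this way there is no need for a separate `unicode_escape` step.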
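I already have all the data from January to June of this year, including a matrix-form version. Send me a private message if you need it.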
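If you need help scraping the actual number of people migrating between provinces and cities nationwide, send me a private message. I have a way to do it.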