Contents
- 【1x00】Preface
- 【2x00】Mind map
- 【3x00】Data structure analysis
- 【4x00】Main function main()
- 【5x00】Data acquisition module data_get
- 【5x01】Initialization function init()
- 【5x02】China totals china_total_data()
- 【5x03】Global totals global_total_data()
- 【5x04】China daily data china_daily_data()
- 【5x05】Overseas daily data foreign_daily_data()
- 【6x00】Word cloud module data_wordcloud
- 【7x00】Map module data_map
- 【7x01】China cumulative confirmed map china_total_map()
- 【7x02】Global cumulative confirmed map global_total_map()
- 【7x03】China daily line chart china_daily_map()
- 【7x04】Overseas daily line chart foreign_daily_map()
- 【8x00】Result screenshots
- 【9x00】Full code
This article was originally published on CSDN by TRHX.
Blog home: https://itrhx.blog.csdn.net/
Article link: https://itrhx.blog.csdn.net/article/details/107140534
Reproduction without permission is prohibited.
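【1x00】Preface
I had meant to build a live COVID-19 data display two or three months ago, but kept putting it off for one reason or another, and only recently found time to write it. The data is scraped with Python from Baidu's real-time epidemic big-data report: requests sends the request, XPath parses the page, the wordcloud library draws the word clouds, pyecharts draws the maps and line charts, and the data is stored in Excel workbooks that are read and written with openpyxl.
The program displays cumulative confirmed-case maps and line charts of the daily figures. Fetching and displaying more data can be added to the program. If it is deployed on a server and run on a schedule, the display stays up to date, and the pyecharts charts can also be integrated into a web framework such as Django or Flask.
Note: the data distinguishes between "global" and "overseas". Global includes China; overseas does not. Four charts are drawn later: the China cumulative confirmed map, the global cumulative confirmed map (including China), the China daily line chart, and the overseas daily line chart (excluding China).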
- pyecharts documentation: https://pyecharts.org/
- openpyxl documentation: https://openpyxl.readthedocs.io/
- wordcloud documentation: http://amueller.github.io/word_cloud/
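【2x00】Mind map
【3x00】Data structure analysis
Inspecting Baidu's epidemic data page reveals a lot of neatly arranged data, presumably the epidemic figures. After saving the page and formatting it, it is easy to see that all of the data sits inside <script type="application/json" id="captain-config"></script>. The title fields contain Unicode escape sequences; converting them to Chinese makes the different data categories easier to identify.
Because there is so much data, it helps to extract the main body, remove duplicates and other clutter, and keep only the rough position of each kind of data, so that the structure can be analyzed for the later extraction step. After this processing, the data looks roughly like this: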
<script type="application/json" id="captain-config">
{
"component": [
{
"mapLastUpdatedTime": "2020.07.05 16:13", // time of the last update to the China data
"caseList": [ // caseList: a list in which each element is a dict
{
"confirmed": "1", // each dict holds all the epidemic figures for one Chinese province
"died": "0",
"crued": "1",
"relativeTime": "1593792000",
"confirmedRelative": "0",
"diedRelative": "0",
"curedRelative": "0",
"curConfirm": "0",
"curConfirmRelative": "0",
"icuDisable": "1",
"area": "西藏",
"subList": [ // subList: a list in which each element is a dict
{
"city": "拉薩", // each dict holds the figures for one city in that province
"confirmed": "1",
"died": "0",
"crued": "1",
"confirmedRelative": "0",
"curConfirm": "0",
"cityCode": "100"
}
]
}
],
"caseOutsideList": [ // caseOutsideList: a list in which each element is a dict
{
"confirmed": "241419", // each dict holds all the epidemic figures for one country
"died": "34854",
"crued": "191944",
"relativeTime": "1593792000",
"confirmedRelative": "223",
"curConfirm": "14621",
"icuDisable": "1",
"area": "意大利",
"subList": [ // subList: a list in which each element is a dict
{
"city": "倫巴第", // each dict holds the figures for one city or region of that country
"confirmed": "94318",
"died": "16691",
"crued": "68201",
"curConfirm": "9426"
}
]
}
],
"summaryDataIn": { // summaryDataIn: overall figures for China
"confirmed": "85307",
"died": "4648",
"cured": "80144",
"asymptomatic": "99",
"asymptomaticRelative": "7",
"unconfirmed": "7",
"relativeTime": "1593792000",
"confirmedRelative": "19",
"unconfirmedRelative": "1",
"curedRelative": "27",
"diedRelative": "0",
"icu": "6",
"icuRelative": "0",
"overseasInput": "1931",
"unOverseasInputCumulative": "83375",
"overseasInputRelative": "6",
"unOverseasInputNewAdd": "13",
"curConfirm": "515",
"curConfirmRelative": "-8",
"icuDisable": "1"
},
"summaryDataOut": { // summaryDataOut: overall figures outside China
"confirmed": "11302569",
"died": "528977",
"curConfirm": "4410601",
"cured": "6362991",
"confirmedRelative": "206165",
"curedRelative": "190018",
"diedRelative": "4876",
"curConfirmRelative": "11271",
"relativeTime": "1593792000"
},
"trend": { // trend: a dict holding the daily figures for China
"updateDate": [], // dates
"list": [ // list: each metric and its values
{
"name": "確診",
"data": []
},
{
"name": "疑似",
"data": []
},
{
"name": "治癒",
"data": []
},
{
"name": "死亡",
"data": []
},
{
"name": "新增確診",
"data": []
},
{
"name": "新增疑似",
"data": []
},
{
"name": "新增治癒",
"data": []
},
{
"name": "新增死亡",
"data": []
},
{
"name": "累計境外輸入",
"data": []
},
{
"name": "新增境外輸入",
"data": []
}
]
},
"foreignLastUpdatedTime": "2020.07.05 16:13", // time of the last update to the overseas data
"globalList": [ // globalList: a list in which each element is a dict
{
"area": "亞洲", // grouped by continent
"subList": [ // subList: figures for each country in the continent
{
"died": "52",
"confirmed": "6159",
"crued": "4809",
"curConfirm": "1298",
"confirmedRelative": "0",
"relativeTime": "1593792000",
"country": "塔吉克斯坦"
}
],
"died": "56556", // totals for the continent
"crued": "1625562",
"confirmed": "2447873",
"curConfirm": "765755",
"confirmedRelative": "60574"
},
{
"area": "其他", // figures for other special areas
"subList": [
{
"died": "13",
"confirmed": "712",
"crued": "651",
"curConfirm": "48",
"confirmedRelative": "0",
"relativeTime": "1593792000",
"country": "鑽石公主號郵輪"
}
],
"died": "13", // totals for the other special areas
"crued": "651",
"confirmed": "712",
"curConfirm": "48",
"confirmedRelative": "0"
},
{
"area": "熱門", // figures for the "hot" (most-watched) countries
"subList": [
{
"died": "5206",
"confirmed": "204610",
"crued": "179492",
"curConfirm": "19912",
"confirmedRelative": "1172",
"relativeTime": "1593792000",
"country": "土耳其"
}
],
"died": "528967", // totals for the hot countries
"crued": "6362924",
"confirmed": "11302357",
"confirmedRelative": "216478",
"curConfirm": "4410466"
}],
"allForeignTrend": { // allForeignTrend: a dict holding the daily overseas figures
"updateDate": [], // dates
"list": [ // list: each metric and its values
{
"name": "累計確診",
"data": []
},
{
"name": "治癒",
"data": []
},
{
"name": "死亡",
"data": []
},
{
"name": "現有確診",
"data": []
},
{
"name": "新增確診",
"data": []
}
]
},
"topAddCountry": [ // countries with the largest increase in confirmed cases
{
"name": "美國",
"value": 53162
}
],
"topOverseasInput": [ // provinces with the most imported cases
{
"name": "黑龍江",
"value": 386
}
]
}
]
}
</script>
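To make the structure above concrete, here is a minimal sketch of decoding the Unicode escapes and walking the nested dicts with the standard library. The sample string is hand-made for illustration, not real data, and the position of the "title" field inside "component" is an assumption; the article only says that title fields hold Unicode escapes.

```python
import json

# Hand-made sample that mimics the structure above (illustrative only).
# Placing "title" at this level is an assumption made for the example.
RAW = r'''
{"component": [{
    "title": "\u75ab\u60c5",
    "caseList": [
        {"area": "\u897f\u85cf", "confirmed": "1", "crued": "1", "died": "0",
         "subList": [{"city": "A", "confirmed": "1"}]}
    ],
    "summaryDataIn": {"confirmed": "85307", "cured": "80144"}
}]}
'''


def load_component(raw_json):
    """Parse the JSON text and return the first element of "component"."""
    return json.loads(raw_json)["component"][0]


component = load_component(RAW)
province = component["caseList"][0]

# json.loads has already turned the \uXXXX escapes into Chinese characters.
print(component["title"], province["area"])

# Counts are stored as strings, so convert them before doing arithmetic.
province_confirmed = int(province["confirmed"])
china_confirmed = int(component["summaryDataIn"]["confirmed"])
print(province_confirmed, china_confirmed)
```

Note that the per-province and per-country records spell the cured count "crued", while the summary dicts spell it "cured"; the extraction code has to use whichever key each record actually carries.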
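【4x00】Main function main()
Data acquisition, word cloud drawing, and map drawing are written in three separate files: data_get.py, data_wordcloud.py, and data_map.py. A main file, main.py, then calls the functions in these three modules.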
import data_get
import data_wordcloud
import data_map
data_dict = data_get.init()
data_get.china_total_data(data_dict)
data_get.global_total_data(data_dict)
data_get.china_daily_data(data_dict)
data_get.foreign_daily_data(data_dict)
data_wordcloud.china_wordcloud()
data_wordcloud.global_wordcloud()
data_map.all_map()
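【5x00】Data acquisition module data_get
【5x01】Initialization function init()
The XPath expression //script[@id="captain-config"]/text() extracts the content of that tag, and json.loads converts it into a dict that the other functions can work with.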
# Imports needed by the data_get module
import json

import openpyxl
import requests
from lxml import etree


def init():
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.13 Safari/537.36'
    }
    url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/'
    response = requests.get(url=url, headers=headers)
    tree = etree.HTML(response.text)
    # xpath() returns a list of text nodes; the first one is the JSON string
    dict1 = tree.xpath('//script[@id="captain-config"]/text()')
    dict2 = json.loads(dict1[0])
    return dict2
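The extraction step can be tried offline without a network request. The sketch below is not the author's method: it swaps lxml's XPath for the standard library's html.parser so that it runs with no third-party packages, and the class name ScriptById, the helper extract_config, and the PAGE string are all hypothetical names made up for this example.

```python
import json
from html.parser import HTMLParser


class ScriptById(HTMLParser):
    """Collect the text content of the <script> tag with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == self.target_id:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_config(html_text, script_id="captain-config"):
    """Return the JSON inside <script id=script_id> as a dict."""
    parser = ScriptById(script_id)
    parser.feed(html_text)
    parser.close()
    text = "".join(parser.chunks)
    if not text.strip():
        # The page layout may have changed, or the request was blocked.
        raise ValueError("script tag %r not found or empty" % script_id)
    return json.loads(text)


# Hand-made page used only to exercise the function.
PAGE = (
    '<html><body><script type="application/json" id="captain-config">'
    '{"component": [{"mapLastUpdatedTime": "2020.07.05 16:13"}]}'
    '</script></body></html>'
)
config = extract_config(PAGE)
print(config["component"][0]["mapLastUpdatedTime"])
```

Raising an explicit error when the tag is missing gives a clearer failure than an IndexError on an empty result list, which helps when the program runs unattended on a schedule.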
【5x02】China totals china_total_data()
def china_total_data(data):
    """
    1. Epidemic data for China's provinces / municipalities / autonomous regions / administrative regions
    Province-level region: area
    Current confirmed: curConfirm
    Cumulative confirmed: confirmed
    Cumulative cured: crued (spelled this way in the data shown in 【3x00】)
    Cumulative deaths: died
    Change in current confirmed: curConfirmRelative
    Change in cumulative confirmed: confirmedRelative
    Change in cumulative cured: curedRelative
    Change in cumulative deaths: diedRelative
    """
    wb = openpyxl.Workbook()  # create a workbook
    ws_china = wb.active  # get the active worksheet
    ws_china.title = "中國省份疫情數據"  # name the worksheet
    ws_china.append(['省/直轄市/自治區/行政區', '現有確診', '累計確診', '累計治癒',
                     '累計死亡', '現有確診增量', '累計確診增量',
                     '累計治癒增量', '累計死亡增量'])
    china = data['component'][0]['caseList']
    for province in china:
        ws_china.append([province['area'],
                         province['curConfirm'],
                         province['confirmed'],
                         province['crued'],
                         province['died'],
                         province['curConfirmRelative'],
                         province['confirmedRelative'],
                         province['curedRelative'],
                         province['diedRelative']])
    """
    2. Epidemic data for Chinese cities
    City: city
    Current confirmed: curConfirm
    Cumulative confirmed: confirmed
    Cumulative cured: crued
    Cumulative deaths: died
    Change in cumulative confirmed: confirmedRelative
    """
    ws_city = wb.create_sheet('中國城市疫情數據')
    ws_city.append(['城市', '現有確診', '累計確診',
                    '累計治癒', '累計死亡', '累計確診增量'])
    for province in china:
        for city in province['subList']:
            # Some cities have no curConfirm value, so default it to '0';
            # empty crued and died values are also replaced with '0'
            if 'curConfirm' not in city:
                city['curConfirm'] = '0'
            if city['crued'] == '':
                city['crued'] = '0'
            if city['died'] == '':
                city['died'] = '0'
            # Write the (possibly defaulted) curConfirm value rather than a literal '0'
            ws_city.append([city['city'], city['curConfirm'], city['confirmed'],
                            city['crued'], city['died'], city['confirmedRelative']])
    """
    3. Time of the last update to the China data: mapLastUpdatedTime
    """
    time_domestic = data['component'][0]['mapLastUpdatedTime']
    ws_time = wb.create_sheet('中國疫情數據更新時間')
    ws_time.column_dimensions['A'].width = 22  # adjust the column width
    ws_time.append(['中國疫情數據更新時間'])
    ws_time.append([time_domestic])
    wb.save('COVID-19-China.xlsx')
    print('中國疫情數據已保存至 COVID-19-China.xlsx!')
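The handling of missing and empty city fields can also be pulled out into one small helper. This is only a sketch: normalize_city and its default field list are hypothetical names, and it assumes that any missing or empty count should be treated as '0', which goes slightly beyond the three fields the function above checks.

```python
def normalize_city(city, fields=("curConfirm", "confirmed", "crued",
                                 "died", "confirmedRelative")):
    """Return a copy of a city record in which missing or empty counts are '0'."""
    cleaned = dict(city)  # copy, so the parsed page data is left untouched
    for field in fields:
        if cleaned.get(field, "") == "":
            cleaned[field] = "0"
    return cleaned


# Hand-made record with one missing key and one empty value.
raw_city = {"city": "X", "confirmed": "5", "crued": "", "died": "0"}
clean_city = normalize_city(raw_city)
print(clean_city)
```

Working on a copy keeps the dict returned by init() unchanged, so other functions that receive the same data still see the original values.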
【5x03】Global totals global_total_data()
After the global totals had been extracted, drawing the map showed that China's figures were missing, so China's row has to be inserted into the Excel file separately when the global data is written.
def global_total_data(data):
    """
    1. Epidemic data for every country
    Country: country
    Current confirmed: curConfirm
    Cumulative confirmed: confirmed
    Cumulative cured: crued
    Cumulative deaths: died
    Change in cumulative confirmed: confirmedRelative
    """
    wb = openpyxl.Workbook()
    ws_global = wb.active
    ws_global.title = "全球各國疫情數據"
    # Save the data by country
    countries = data['component'][0]['caseOutsideList']
    ws_global.append(['國家', '現有確診', '累計確診', '累計治癒', '累計死亡', '累計確診增量'])
    for country in countries:
        ws_global.append([country['area'],
                          country['curConfirm'],
                          country['confirmed'],
                          country['crued'],
                          country['died'],
                          country['confirmedRelative']])
    # Save the data by continent, one worksheet per continent
    continent = data['component'][0]['globalList']
    for area in continent:
        ws_foreign = wb.create_sheet(area['area'] + '疫情數據')
        ws_foreign.append(['國家', '現有確診', '累計確診', '累計治癒', '累計死亡', '累計確診增量'])
        for country in area['subList']:
            ws_foreign.append([country['country'],
                               country['curConfirm'],
                               country['confirmed'],
                               country['crued'],
                               country['died'],
                               country['confirmedRelative']])
    # Add China's figures to the "全球各國疫情數據" and "亞洲疫情數據" worksheets
    ws1, ws2 = wb['全球各國疫情數據'], wb['亞洲疫情數據']
    original_data = data['component'][0]['summaryDataIn']
    add_china_data = ['中國',
                      original_data['curConfirm'],
                      original_data['confirmed'],
                      original_data['cured'],  # the summary dict spells this key "cured"
                      original_data['died'],
                      original_data['confirmedRelative']]
    ws1.append(add_china_data)
    ws2.append(add_china_data)
    """
    2. Time of the last update to the global data: foreignLastUpdatedTime
    """
    time_foreign = data['component'][0]['foreignLastUpdatedTime']
    ws_time = wb.create_sheet('全球疫情數據更新時間')
    ws_time.column_dimensions['A'].width = 22  # adjust the column width
    ws_time.append(['全球疫情數據更新時間'])
    ws_time.append([time_foreign])
    wb.save('COVID-19-Global.xlsx')
    print('全球疫情數據已保存至 COVID-19-Global.xlsx!')
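The relationship "global = overseas + China" can be shown in isolation, without Excel. The following is a sketch under stated assumptions: country_row and build_global_rows are hypothetical helpers, and the two records reuse figures from the structure listing in 【3x00】.

```python
def country_row(record, name_key="area", cured_key="crued"):
    """Build one spreadsheet row from a country-like record."""
    return [record[name_key],
            record["curConfirm"],
            record["confirmed"],
            record[cured_key],
            record["died"],
            record["confirmedRelative"]]


def build_global_rows(case_outside_list, summary_data_in):
    """Overseas rows plus one extra row for China taken from summaryDataIn."""
    rows = [country_row(country) for country in case_outside_list]
    # summaryDataIn has no "area" key and spells the cured count "cured".
    china = dict(summary_data_in, area="中國")
    rows.append(country_row(china, cured_key="cured"))
    return rows


italy = {"area": "意大利", "curConfirm": "14621", "confirmed": "241419",
         "crued": "191944", "died": "34854", "confirmedRelative": "223"}
summary_in = {"curConfirm": "515", "confirmed": "85307", "cured": "80144",
              "died": "4648", "confirmedRelative": "19"}

rows = build_global_rows([italy], summary_in)
for row in rows:
    print(row)
```

Passing the key name as a parameter makes the "crued" versus "cured" difference explicit instead of leaving it buried in two nearly identical blocks.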
【5x04】China daily data china_daily_data()
def china_daily_data(data):
    """
    i_dict = data['component'][0]['trend']
    i_dict['updateDate']: dates
    i_dict['list'][0]: confirmed
    i_dict['list'][1]: suspected
    i_dict['list'][2]: cured
    i_dict['list'][3]: deaths
    i_dict['list'][4]: new confirmed
    i_dict['list'][5]: new suspected
    i_dict['list'][6]: new cured
    i_dict['list'][7]: new deaths
    i_dict['list'][8]: cumulative imported cases
    i_dict['list'][9]: new imported cases
    """
    ccd_dict = data['component'][0]['trend']
    update_date = ccd_dict['updateDate']  # dates
    china_confirmed = ccd_dict['list'][0]['data']  # daily cumulative confirmed
    china_crued = ccd_dict['list'][2]['data']  # daily cumulative cured
    china_died = ccd_dict['list'][3]['data']  # daily cumulative deaths
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')
    # Write the daily cumulative confirmed figures
    ws_china_confirmed = wb.create_sheet('中國每日累計確診數據')
    ws_china_confirmed.append(['日期', '數據'])
    for row in zip(update_date, china_confirmed):  # "row" avoids shadowing the "data" argument
        ws_china_confirmed.append(row)
    # Write the daily cumulative cured figures
    ws_china_crued = wb.create_sheet('中國每日累計治癒數據')
    ws_china_crued.append(['日期', '數據'])
    for row in zip(update_date, china_crued):
        ws_china_crued.append(row)
    # Write the daily cumulative death figures
    ws_china_died = wb.create_sheet('中國每日累計死亡數據')
    ws_china_died.append(['日期', '數據'])
    for row in zip(update_date, china_died):
        ws_china_died.append(row)
    wb.save('COVID-19-China.xlsx')
    print('中國每日累計確診/治癒/死亡數據已保存至 COVID-19-China.xlsx!')
【5x05】Overseas daily data foreign_daily_data()
def foreign_daily_data(data):
    """
    te_dict = data['component'][0]['allForeignTrend']
    te_dict['updateDate']: dates
    te_dict['list'][0]: cumulative confirmed
    te_dict['list'][1]: cured
    te_dict['list'][2]: deaths
    te_dict['list'][3]: current confirmed
    te_dict['list'][4]: new confirmed
    """
    te_dict = data['component'][0]['allForeignTrend']
    update_date = te_dict['updateDate']  # dates
    foreign_confirmed = te_dict['list'][0]['data']  # daily cumulative confirmed
    foreign_crued = te_dict['list'][1]['data']  # daily cumulative cured
    foreign_died = te_dict['list'][2]['data']  # daily cumulative deaths
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    # Write the daily cumulative confirmed figures
    ws_foreign_confirmed = wb.create_sheet('境外每日累計確診數據')
    ws_foreign_confirmed.append(['日期', '數據'])
    for row in zip(update_date, foreign_confirmed):  # "row" avoids shadowing the "data" argument
        ws_foreign_confirmed.append(row)
    # Write the daily cumulative cured figures
    ws_foreign_crued = wb.create_sheet('境外每日累計治癒數據')
    ws_foreign_crued.append(['日期', '數據'])
    for row in zip(update_date, foreign_crued):
        ws_foreign_crued.append(row)
    # Write the daily cumulative death figures
    ws_foreign_died = wb.create_sheet('境外每日累計死亡數據')
    ws_foreign_died.append(['日期', '數據'])
    for row in zip(update_date, foreign_died):
        ws_foreign_died.append(row)
    wb.save('COVID-19-Global.xlsx')
    print('境外每日累計確診/治癒/死亡數據已保存至 COVID-19-Global.xlsx!')
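Both daily-data functions pick each series by its position in the list, which breaks silently if the page ever reorders its metrics. One possible alternative, sketched here with hypothetical helper names and a hand-made trend dict, is to look the series up by its "name" field and then pair it with the dates.

```python
def series_by_name(trend):
    """Map each metric name in a trend dict to its list of values."""
    return {item["name"]: item["data"] for item in trend["list"]}


def dated_rows(trend, name):
    """Pair every date with the value of the named metric.

    zip() stops at the shorter input, so a series with fewer points than
    dates is truncated rather than padded.
    """
    values = series_by_name(trend)[name]
    return list(zip(trend["updateDate"], values))


# Hand-made sample in the shape of "trend" / "allForeignTrend".
sample_trend = {
    "updateDate": ["1.20", "1.21", "1.22"],
    "list": [
        {"name": "確診", "data": [1, 2, 3]},
        {"name": "治癒", "data": [0, 1]},
    ],
}

confirmed_rows = dated_rows(sample_trend, "確診")
cured_rows = dated_rows(sample_trend, "治癒")
print(confirmed_rows)
print(cured_rows)
```

Each tuple returned by dated_rows has the same (date, value) shape that the functions above pass to the worksheet's append method.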
【6x00】Word cloud module data_wordcloud
【6x01】China cumulative confirmed word cloud china_wordcloud()
# Imports needed by the data_wordcloud module
import openpyxl
import wordcloud


def china_wordcloud():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')  # open the existing xlsx file
    ws_china = wb['中國省份疫情數據']  # get the China provinces worksheet
    ws_china.delete_rows(1)  # drop the header row (in memory only; the file is not saved)
    china_dict = {}  # province name -> cumulative confirmed count
    for row in ws_china.values:
        china_dict[row[0]] = int(row[2])
    # font_path points at a Windows font; change it on other systems
    word_cloud = wordcloud.WordCloud(font_path='C:/Windows/Fonts/simsun.ttc',
                                     background_color='#CDC9C9',
                                     min_font_size=15,
                                     width=900, height=500)
    word_cloud.generate_from_frequencies(china_dict)
    word_cloud.to_file('WordCloud-China.png')
    print('中國省份疫情詞雲圖繪製完畢!')
【6x02】Global cumulative confirmed word cloud global_wordcloud()
def global_wordcloud():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_global = wb['全球各國疫情數據']
    ws_global.delete_rows(1)  # drop the header row (in memory only)
    global_dict = {}  # country name -> cumulative confirmed count
    for row in ws_global.values:
        global_dict[row[0]] = int(row[2])
    # font_path points at a Windows font; change it on other systems
    word_cloud = wordcloud.WordCloud(font_path='C:/Windows/Fonts/simsun.ttc',
                                     background_color='#CDC9C9',
                                     width=900, height=500)
    word_cloud.generate_from_frequencies(global_dict)
    word_cloud.to_file('WordCloud-Global.png')
    print('全球各國疫情詞雲圖繪製完畢!')
【7x00】Map module data_map
【7x01】China cumulative confirmed map china_total_map()
# Imports needed by the data_map module
import openpyxl
from pyecharts import options as opts
from pyecharts.charts import Line, Map


def china_total_map():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')  # open the existing xlsx file
    ws_time = wb['中國疫情數據更新時間']  # worksheet holding the update time
    ws_data = wb['中國省份疫情數據']  # worksheet holding the province data
    ws_data.delete_rows(1)  # drop the header row (in memory only)
    province = []  # province names
    confirmed = []  # cumulative confirmed (third column)
    for row in ws_data.values:
        province.append(row[0])
        confirmed.append(row[2])
    time_china = ws_time['A2'].value  # update time
    # Color scale, one entry per range
    pieces = [
        {'max': 0, 'min': 0, 'label': '0', 'color': '#FFFFFF'},
        {'max': 9, 'min': 1, 'label': '1-9', 'color': '#FFE5DB'},
        {'max': 99, 'min': 10, 'label': '10-99', 'color': '#FF9985'},
        {'max': 999, 'min': 100, 'label': '100-999', 'color': '#F57567'},
        {'max': 9999, 'min': 1000, 'label': '1000-9999', 'color': '#E64546'},
        {'max': 99999, 'min': 10000, 'label': '≧10000', 'color': '#B80909'}
    ]
    # Draw the map
    ct_map = (
        Map()
        .add(series_name='累計確診人數', data_pair=[list(z) for z in zip(province, confirmed)], maptype="china")
        .set_global_opts(
            title_opts=opts.TitleOpts(title="中國疫情數據(累計確診)",
                                      subtitle='數據更新至:' + time_china + '\n\n來源:百度疫情實時大數據報告'),
            visualmap_opts=opts.VisualMapOpts(max_=300, is_piecewise=True, pieces=pieces)
        )
    )
    return ct_map
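The pieces list assigns each value to the range that contains it. The sketch below is a plain-Python model of that lookup, not pyecharts' own logic, and find_piece is a hypothetical helper. It shows one property worth knowing: a last range capped at 99999 leaves larger values unmatched, whereas ECharts also accepts a piece with only a min, which makes the range open-ended.

```python
def find_piece(value, pieces):
    """Return the label of the first piece whose [min, max] range contains value."""
    for piece in pieces:
        low = piece.get('min', float('-inf'))
        high = piece.get('max', float('inf'))
        if low <= value <= high:
            return piece['label']
    return None  # the value falls outside every range


BOUNDED = [
    {'max': 0, 'min': 0, 'label': '0'},
    {'max': 9, 'min': 1, 'label': '1-9'},
    {'max': 99, 'min': 10, 'label': '10-99'},
    {'max': 999, 'min': 100, 'label': '100-999'},
    {'max': 9999, 'min': 1000, 'label': '1000-9999'},
    {'max': 99999, 'min': 10000, 'label': '≧10000'},
]
# Same table, but the last piece has no upper bound.
OPEN_ENDED = BOUNDED[:-1] + [{'min': 10000, 'label': '≧10000'}]

for count in (0, 515, 68139, 150000):
    print(count, find_piece(count, BOUNDED), find_piece(count, OPEN_ENDED))
```

For the province map the capped range is enough as long as no province exceeds 99999 cases; the open-ended form simply removes that assumption.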
【7x02】Global cumulative confirmed map global_total_map()
def global_total_map():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_time = wb['全球疫情數據更新時間']
    ws_data = wb['全球各國疫情數據']
    ws_data.delete_rows(1)  # drop the header row (in memory only)
    country = []  # country names
    confirmed = []  # cumulative confirmed (third column)
    for row in ws_data.values:
        country.append(row[0])
        confirmed.append(row[2])
    time_global = ws_time['A2'].value  # update time
    # Mapping from the map's English country names to the Chinese names used in the data
    name_map = {
        "Somalia": "索馬里",
        "Liechtenstein": "列支敦士登",
        "Morocco": "摩洛哥",
        "W. Sahara": "西撒哈拉",
        "Serbia": "塞爾維亞",
        "Afghanistan": "阿富汗",
        "Angola": "安哥拉",
        "Albania": "阿爾巴尼亞",
        "Andorra": "安道爾共和國",
        "United Arab Emirates": "阿拉伯聯合酋長國",
        "Argentina": "阿根廷",
        "Armenia": "亞美尼亞",
        "Australia": "澳大利亞",
        "Austria": "奧地利",
        "Azerbaijan": "阿塞拜疆",
        "Burundi": "布隆迪",
        "Belgium": "比利時",
        "Benin": "貝寧",
        "Burkina Faso": "布基納法索",
        "Bangladesh": "孟加拉國",
        "Bulgaria": "保加利亞",
        "Bahrain": "巴林",
        "Bahamas": "巴哈馬",
        "Bosnia and Herz.": "波斯尼亞和黑塞哥維那",
        "Belarus": "白俄羅斯",
        "Belize": "伯利茲",
        "Bermuda": "百慕大",
        "Bolivia": "玻利維亞",
        "Brazil": "巴西",
        "Barbados": "巴巴多斯",
        "Brunei": "文萊",
        "Bhutan": "不丹",
        "Botswana": "博茨瓦納",
        "Central African Rep.": "中非共和國",
        "Canada": "加拿大",
        "Switzerland": "瑞士",
        "Chile": "智利",
        "China": "中國",
        "Côte d'Ivoire": "科特迪瓦",
        "Cameroon": "喀麥隆",
        "Dem. Rep. Congo": "剛果(金)",
        "Congo": "剛果(布)",
        "Colombia": "哥倫比亞",
        "Cape Verde": "佛得角",
        "Costa Rica": "哥斯達黎加",
        "Cuba": "古巴",
        "N. Cyprus": "北塞浦路斯",
        "Cyprus": "塞浦路斯",
        "Czech Rep.": "捷克",
        "Germany": "德國",
        "Djibouti": "吉布提",
        "Denmark": "丹麥",
        "Dominican Rep.": "多米尼加",
        "Algeria": "阿爾及利亞",
        "Ecuador": "厄瓜多爾",
        "Egypt": "埃及",
        "Eritrea": "厄立特里亞",
        "Spain": "西班牙",
        "Estonia": "愛沙尼亞",
        "Ethiopia": "埃塞俄比亞",
        "Finland": "芬蘭",
        "Fiji": "斐濟",
        "France": "法國",
        "Gabon": "加蓬",
        "United Kingdom": "英國",
        "Georgia": "格魯吉亞",
        "Ghana": "加納",
        "Guinea": "幾內亞",
        "Gambia": "岡比亞",
        "Guinea-Bissau": "幾內亞比紹",
        "Eq. Guinea": "赤道幾內亞",
        "Greece": "希臘",
        "Grenada": "格林納達",
        "Greenland": "格陵蘭島",
        "Guatemala": "危地馬拉",
        "Guam": "關島",
        "Guyana": "圭亞那合作共和國",
        "Honduras": "洪都拉斯",
        "Croatia": "克羅地亞",
        "Haiti": "海地",
        "Hungary": "匈牙利",
        "Indonesia": "印度尼西亞",
        "India": "印度",
        "Br. Indian Ocean Ter.": "英屬印度洋領土",
        "Ireland": "愛爾蘭",
        "Iran": "伊朗",
        "Iraq": "伊拉克",
        "Iceland": "冰島",
        "Israel": "以色列",
        "Italy": "意大利",
        "Jamaica": "牙買加",
        "Jordan": "約旦",
        "Japan": "日本",
        "Siachen Glacier": "錫亞琴冰川",
        "Kazakhstan": "哈薩克斯坦",
        "Kenya": "肯尼亞",
        "Kyrgyzstan": "吉爾吉斯斯坦",
        "Cambodia": "柬埔寨",
        "Korea": "韓國",
        "Kuwait": "科威特",
        "Lao PDR": "老撾",
        "Lebanon": "黎巴嫩",
        "Liberia": "利比里亞",
        "Libya": "利比亞",
        "Sri Lanka": "斯里蘭卡",
        "Lesotho": "萊索托",
        "Lithuania": "立陶宛",
        "Luxembourg": "盧森堡",
        "Latvia": "拉脫維亞",
        "Moldova": "摩爾多瓦",
        "Madagascar": "馬達加斯加",
        "Mexico": "墨西哥",
        "Macedonia": "馬其頓",
        "Mali": "馬里",
        "Malta": "馬耳他",
        "Myanmar": "緬甸",
        "Montenegro": "黑山",
        "Mongolia": "蒙古國",
        "Mozambique": "莫桑比克",
        "Mauritania": "毛里塔尼亞",
        "Mauritius": "毛里求斯",
        "Malawi": "馬拉維",
        "Malaysia": "馬來西亞",
        "Namibia": "納米比亞",
        "New Caledonia": "新喀里多尼亞",
        "Niger": "尼日爾",
        "Nigeria": "尼日利亞",
        "Nicaragua": "尼加拉瓜",
        "Netherlands": "荷蘭",
        "Norway": "挪威",
        "Nepal": "尼泊爾",
        "New Zealand": "新西蘭",
        "Oman": "阿曼",
        "Pakistan": "巴基斯坦",
        "Panama": "巴拿馬",
        "Peru": "祕魯",
        "Philippines": "菲律賓",
        "Papua New Guinea": "巴布亞新幾內亞",
        "Poland": "波蘭",
        "Puerto Rico": "波多黎各",
        "Dem. Rep. Korea": "朝鮮",
        "Portugal": "葡萄牙",
        "Paraguay": "巴拉圭",
        "Palestine": "巴勒斯坦",
        "Qatar": "卡塔爾",
        "Romania": "羅馬尼亞",
        "Russia": "俄羅斯",
        "Rwanda": "盧旺達",
        "Saudi Arabia": "沙特阿拉伯",
        "Sudan": "蘇丹",
        "S. Sudan": "南蘇丹",
        "Senegal": "塞內加爾",
        "Singapore": "新加坡",
        "Solomon Is.": "所羅門羣島",
        "Sierra Leone": "塞拉利昂",
        "El Salvador": "薩爾瓦多",
        "Suriname": "蘇里南",
        "Slovakia": "斯洛伐克",
        "Slovenia": "斯洛文尼亞",
        "Sweden": "瑞典",
        "Swaziland": "斯威士蘭",
        "Seychelles": "塞舌爾",
        "Syria": "敘利亞",
        "Chad": "乍得",
        "Togo": "多哥",
        "Thailand": "泰國",
        "Tajikistan": "塔吉克斯坦",
        "Turkmenistan": "土庫曼斯坦",
        "Timor-Leste": "東帝汶",
        "Tonga": "湯加",
        "Trinidad and Tobago": "特立尼達和多巴哥",
        "Tunisia": "突尼斯",
        "Turkey": "土耳其",
        "Tanzania": "坦桑尼亞",
        "Uganda": "烏干達",
        "Ukraine": "烏克蘭",
        "Uruguay": "烏拉圭",
        "United States": "美國",
        "Uzbekistan": "烏茲別克斯坦",
        "Venezuela": "委內瑞拉",
        "Vietnam": "越南",
        "Vanuatu": "瓦努阿圖",
        "Yemen": "也門",
        "South Africa": "南非",
        "Zambia": "贊比亞",
        "Zimbabwe": "津巴布韋",
        "Aland": "奧蘭羣島",
        "American Samoa": "美屬薩摩亞",
        "Fr. S. Antarctic Lands": "南極洲",
        "Antigua and Barb.": "安提瓜和巴布達",
        "Comoros": "科摩羅",
        "Curaçao": "庫拉索島",
        "Cayman Is.": "開曼羣島",
        "Dominica": "多米尼克",
        "Falkland Is.": "福克蘭羣島馬爾維納斯",
        "Faeroe Is.": "法羅羣島",
        "Micronesia": "密克羅尼西亞",
        "Heard I. and McDonald Is.": "赫德島和麥克唐納羣島",
        "Isle of Man": "曼島",
        "Jersey": "澤西島",
        "Kiribati": "基里巴斯",
        "Saint Lucia": "聖盧西亞",
        "N. Mariana Is.": "北馬里亞納羣島",
        "Montserrat": "蒙特塞拉特",
        "Niue": "紐埃",
        "Palau": "帕勞",
        "Fr. Polynesia": "法屬波利尼西亞",
        "S. Geo. and S. Sandw. Is.": "南喬治亞島和南桑威奇羣島",
        "Saint Helena": "聖赫勒拿",
        "St. Pierre and Miquelon": "聖皮埃爾和密克隆羣島",
        "São Tomé and Principe": "聖多美和普林西比",
        "Turks and Caicos Is.": "特克斯和凱科斯羣島",
        "St. Vin. and Gren.": "聖文森特和格林納丁斯",
        "U.S. Virgin Is.": "美屬維爾京羣島",
        "Samoa": "薩摩亞"
    }
    # Color scale, one entry per range
    pieces = [
        {'max': 0, 'min': 0, 'label': '0', 'color': '#FFFFFF'},
        {'max': 49, 'min': 1, 'label': '1-49', 'color': '#FFE5DB'},
        {'max': 99, 'min': 50, 'label': '50-99', 'color': '#FFC4B3'},
        {'max': 999, 'min': 100, 'label': '100-999', 'color': '#FF9985'},
        {'max': 9999, 'min': 1000, 'label': '1000-9999', 'color': '#F57567'},
        {'max': 99999, 'min': 10000, 'label': '10000-99999', 'color': '#E64546'},
        {'max': 999999, 'min': 100000, 'label': '100000-999999', 'color': '#B80909'},
        {'max': 9999999, 'min': 1000000, 'label': '≧1000000', 'color': '#8A0808'}
    ]
    gt_map = (
        Map()
        .add(series_name='累計確診人數', data_pair=[list(z) for z in zip(country, confirmed)], maptype="world", name_map=name_map, is_map_symbol_show=False)
        .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
        .set_global_opts(
            title_opts=opts.TitleOpts(title="全球疫情數據(累計確診)",
                                      subtitle='數據更新至:' + time_global + '\n\n來源:百度疫情實時大數據報告'),
            visualmap_opts=opts.VisualMapOpts(max_=300, is_piecewise=True, pieces=pieces),
        )
    )
    return gt_map
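A hand-maintained name table of this size is easy to get subtly wrong, for example by giving two English names the same Chinese name, so that one country's figures could be drawn on two places on the map. A quick sanity check could look like the sketch below; duplicate_values is a hypothetical helper and the sample mapping is made up for the demonstration.

```python
from collections import defaultdict


def duplicate_values(mapping):
    """Return {value: [keys...]} for every value shared by more than one key."""
    keys_by_value = defaultdict(list)
    for key, value in mapping.items():
        keys_by_value[value].append(key)
    return {value: keys for value, keys in keys_by_value.items() if len(keys) > 1}


# Made-up sample: two English names share one Chinese name.
sample_map = {
    "Dominican Rep.": "多米尼加",
    "Dominica": "多米尼加",
    "Italy": "意大利",
}

clashes = duplicate_values(sample_map)
print(clashes)
```

Running the same check on the full name_map before drawing takes a single call and returns an empty dict when every Chinese name is unique.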
【7x03】China daily line chart china_daily_map()
def china_daily_map():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')
    ws_china_confirmed = wb['中國每日累計確診數據']
    ws_china_crued = wb['中國每日累計治癒數據']
    ws_china_died = wb['中國每日累計死亡數據']
    # Drop the header row of each worksheet (in memory only)
    ws_china_confirmed.delete_rows(1)
    ws_china_crued.delete_rows(1)
    ws_china_died.delete_rows(1)
    x_date = []  # dates
    y_china_confirmed = []  # daily cumulative confirmed
    y_china_crued = []  # daily cumulative cured
    y_china_died = []  # daily cumulative deaths
    for china_confirmed in ws_china_confirmed.values:
        y_china_confirmed.append(china_confirmed[1])
    for china_crued in ws_china_crued.values:
        x_date.append(china_crued[0])  # all three worksheets share the same dates
        y_china_crued.append(china_crued[1])
    for china_died in ws_china_died.values:
        y_china_died.append(china_died[1])
    fi_map = (
        Line(init_opts=opts.InitOpts(height='420px'))
        .add_xaxis(xaxis_data=x_date)
        .add_yaxis(
            series_name="中國累計確診趨勢",
            y_axis=y_china_confirmed,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="中國累計治癒趨勢",
            y_axis=y_china_crued,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="中國累計死亡趨勢",
            y_axis=y_china_died,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .set_global_opts(
            title_opts=opts.TitleOpts(title="中國每日累計確診/治癒/死亡趨勢"),
            legend_opts=opts.LegendOpts(pos_bottom="bottom", orient='horizontal'),
            tooltip_opts=opts.TooltipOpts(trigger="axis"),
            yaxis_opts=opts.AxisOpts(
                type_="value",
                axistick_opts=opts.AxisTickOpts(is_show=True),
                splitline_opts=opts.SplitLineOpts(is_show=True),
            ),
            xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False),
        )
    )
    return fi_map
【7x04】Overseas daily line chart foreign_daily_map()
def foreign_daily_map():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_foreign_confirmed = wb['境外每日累計確診數據']
    ws_foreign_crued = wb['境外每日累計治癒數據']
    ws_foreign_died = wb['境外每日累計死亡數據']
    # Drop the header row of each worksheet (in memory only)
    ws_foreign_confirmed.delete_rows(1)
    ws_foreign_crued.delete_rows(1)
    ws_foreign_died.delete_rows(1)
    x_date = []  # dates
    y_foreign_confirmed = []  # cumulative confirmed
    y_foreign_crued = []  # cumulative cured
    y_foreign_died = []  # cumulative deaths
    for foreign_confirmed in ws_foreign_confirmed.values:
        y_foreign_confirmed.append(foreign_confirmed[1])
    for foreign_crued in ws_foreign_crued.values:
        x_date.append(foreign_crued[0])  # all three worksheets share the same dates
        y_foreign_crued.append(foreign_crued[1])
    for foreign_died in ws_foreign_died.values:
        y_foreign_died.append(foreign_died[1])
    fte_map = (
        Line(init_opts=opts.InitOpts(height='420px'))
        .add_xaxis(xaxis_data=x_date)
        .add_yaxis(
            series_name="境外累計確診趨勢",
            y_axis=y_foreign_confirmed,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="境外累計治癒趨勢",
            y_axis=y_foreign_crued,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .add_yaxis(
            series_name="境外累計死亡趨勢",
            y_axis=y_foreign_died,
            label_opts=opts.LabelOpts(is_show=False),
        )
        .set_global_opts(
            title_opts=opts.TitleOpts(title="境外每日累計確診/治癒/死亡趨勢"),
            legend_opts=opts.LegendOpts(pos_bottom="bottom", orient='horizontal'),
            tooltip_opts=opts.TooltipOpts(trigger="axis"),
            yaxis_opts=opts.AxisOpts(
                type_="value",
                axistick_opts=opts.AxisTickOpts(is_show=True),
                splitline_opts=opts.SplitLineOpts(is_show=True),
            ),
            xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False),
        )
    )
    return fte_map
【8x00】Result screenshots
【8x01】Data stored in Excel
【8x02】Word clouds
【8x03】Maps and line charts
【9x00】Full code
Full code (a star is appreciated): https://github.com/TRHX/Python3-Spider-Practice/tree/master/COVID-19
Other crawler practice code (continuously updated): https://github.com/TRHX/Python3-Spider-Practice
Crawler practice column (continuously updated): https://itrhx.blog.csdn.net/article/category/9351278