Real-Time Monitoring of COVID-19 Pneumonia Data (Python Crawler + pyecharts Data Visualization + wordcloud Word Clouds)



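This is a block of anti-scraping text; readers may ignore it.
This article was originally published on CSDN by TRHX.
Blog home page: https://itrhx.blog.csdn.net/
Article link: https://itrhx.blog.csdn.net/article/details/107140534
Reproduction without permission is prohibited. Malicious reposters bear the consequences. Respect original work and stay away from plagiarism.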

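【1x00】Preface

I meant to build a real-time display of epidemic data two or three months ago, but kept putting it off for one unavoidable reason after another. I finally found the time to write one. The data is scraped with Python from Baidu's real-time epidemic big-data report: requests sends the HTTP request, XPath parses the page, the wordcloud library draws the word clouds, pyecharts draws the maps and line charts, and the data is stored in Excel workbooks that are handled with openpyxl.

The program displays a map of cumulative confirmed cases and line charts of the daily changes; fetching and displaying further data can be added within the program. It can be deployed on a server and scheduled to run periodically so that the display stays up to date, and the pyecharts plotting module can also be integrated into a web framework (Django, Flask, etc.).

Note: the data distinguishes two concepts, global and foreign. Global includes China; foreign does not. The four charts drawn later are the China cumulative confirmed map, the global cumulative confirmed map (including China), the China daily data line chart, and the foreign daily data line chart (excluding China).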

【2x00】Mind Map

[Figure 01: mind map]

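【3x00】Data Structure Analysis

Looking at the source of Baidu's epidemic data page, you can see a large amount of neatly organized data, which is presumably the epidemic data. Save the page and format it, and it is easy to see that all of the data sits inside <script type="application/json" id="captain-config"></script>. The title fields there contain Unicode escape sequences; converting them to Chinese makes the different data categories easier to identify.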
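
As a small illustration of the point about Unicode escapes: json.loads decodes \uXXXX sequences by itself, so the titles become readable as soon as the script content is parsed. The JSON fragment below is a made-up sample in the same style, not real page content.

```python
import json

# Made-up fragment in the style of the page data (an assumption, not real content).
raw = '{"title": "\\u56fd\\u5185\\u75ab\\u60c5", "confirmed": "85307"}'

parsed = json.loads(raw)      # \uXXXX escapes are decoded while parsing
print(parsed["title"])        # readable Chinese text instead of escape codes
```

No separate decoding step is needed once the whole script body has gone through json.loads.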

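[Figure 02]

Because there is so much data, extract the main body, remove duplicates and other clutter, and keep only the approximate position of each piece of data so that the structure can be analyzed for later extraction. After this processing, the data looks roughly like this: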

<script type="application/json" id="captain-config">
    {
        "component": [
            {
                "mapLastUpdatedTime": "2020.07.05 16:13",        // time the domestic data was last updated
                "caseList": [                                    // caseList: a list in which every element is a dict
                    {
                        "confirmed": "1",                        // each dict holds every statistic for one Chinese province
                        "died": "0",
                        "crued": "1",
                        "relativeTime": "1593792000",
                        "confirmedRelative": "0",
                        "diedRelative": "0",
                        "curedRelative": "0",
                        "curConfirm": "0",
                        "curConfirmRelative": "0",
                        "icuDisable": "1",
                        "area": "西藏",
                        "subList": [                            // subList: a list in which every element is a dict
                            {
                                "city": "拉薩",                 // each dict holds the data for one city in that province
                                "confirmed": "1",
                                "died": "0",
                                "crued": "1",
                                "confirmedRelative": "0",
                                "curConfirm": "0",
                                "cityCode": "100"
                            }
                        ]
                    }
                ],
                "caseOutsideList": [                           // caseOutsideList: a list in which every element is a dict
                    {
                        "confirmed": "241419",                 // each dict holds every statistic for one country
                        "died": "34854",
                        "crued": "191944",
                        "relativeTime": "1593792000",
                        "confirmedRelative": "223",
                        "curConfirm": "14621",
                        "icuDisable": "1",
                        "area": "意大利",
                        "subList": [                          // subList: a list in which every element is a dict
                            {
                                "city": "倫巴第",              // each dict holds the data for one city in that country
                                "confirmed": "94318",
                                "died": "16691",
                                "crued": "68201",
                                "curConfirm": "9426"
                            }
                        ]
                    }
                ],
                "summaryDataIn": {                           // summaryDataIn: overall domestic totals
                    "confirmed": "85307",
                    "died": "4648",
                    "cured": "80144",
                    "asymptomatic": "99",
                    "asymptomaticRelative": "7",
                    "unconfirmed": "7",
                    "relativeTime": "1593792000",
                    "confirmedRelative": "19",
                    "unconfirmedRelative": "1",
                    "curedRelative": "27",
                    "diedRelative": "0",
                    "icu": "6",
                    "icuRelative": "0",
                    "overseasInput": "1931",
                    "unOverseasInputCumulative": "83375",
                    "overseasInputRelative": "6",
                    "unOverseasInputNewAdd": "13",
                    "curConfirm": "515",
                    "curConfirmRelative": "-8",
                    "icuDisable": "1"
                },
                "summaryDataOut": {                           // summaryDataOut: overall foreign totals
                    "confirmed": "11302569",
                    "died": "528977",
                    "curConfirm": "4410601",
                    "cured": "6362991",
                    "confirmedRelative": "206165",
                    "curedRelative": "190018",
                    "diedRelative": "4876",
                    "curConfirmRelative": "11271",
                    "relativeTime": "1593792000"
                },
                "trend": {                                    // trend: a dict with the daily domestic data
                    "updateDate": [],                         // dates
                    "list": [                                 // list: each data series and its values
                        {
                            "name": "確診",
                            "data": []
                        },
                        {
                            "name": "疑似",
                            "data": []
                        },
                        {
                            "name": "治癒",
                            "data": []
                        },
                        {
                            "name": "死亡",
                            "data": []
                        },
                        {
                            "name": "新增確診",
                            "data": []
                        },
                        {
                            "name": "新增疑似",
                            "data": []
                        },
                        {
                            "name": "新增治癒",
                            "data": []
                        },
                        {
                            "name": "新增死亡",
                            "data": []
                        },
                        {
                            "name": "累計境外輸入",
                            "data": []
                        },
                        {
                            "name": "新增境外輸入",
                            "data": []
                        }
                    ]
                },
                "foreignLastUpdatedTime": "2020.07.05 16:13",       // time the foreign data was last updated
                "globalList": [                                     // globalList: a list in which every element is a dict
                    {
                        "area": "亞洲",                              // grouped by continent
                        "subList": [                                // subList: data for each country on that continent
                            {
                                "died": "52",
                                "confirmed": "6159",
                                "crued": "4809",
                                "curConfirm": "1298",
                                "confirmedRelative": "0",
                                "relativeTime": "1593792000",
                                "country": "塔吉克斯坦"
                            }
                        ],
                        "died": "56556",                            // totals for the continent
                        "crued": "1625562",
                        "confirmed": "2447873",
                        "curConfirm": "765755",
                        "confirmedRelative": "60574"
                    },
                    {
                        "area": "其他",                             // data for other special areas
                        "subList": [
                            {
                                "died": "13",
                                "confirmed": "712",
                                "crued": "651",
                                "curConfirm": "48",
                                "confirmedRelative": "0",
                                "relativeTime": "1593792000",
                                "country": "鑽石公主號郵輪"
                            }
                        ],
                        "died": "13",                              // totals for the other special areas
                        "crued": "651",
                        "confirmed": "712",
                        "curConfirm": "48",
                        "confirmedRelative": "0"
                    },
                    {
                        "area": "熱門",                            // data for the "hot" (most-watched) countries
                        "subList": [
                            {
                                "died": "5206",
                                "confirmed": "204610",
                                "crued": "179492",
                                "curConfirm": "19912",
                                "confirmedRelative": "1172",
                                "relativeTime": "1593792000",
                                "country": "土耳其"
                            }
                        ],
                        "died": "528967",                         // totals for the hot countries
                        "crued": "6362924",
                        "confirmed": "11302357",
                        "confirmedRelative": "216478",
                        "curConfirm": "4410466"
                    }],
                "allForeignTrend": {                            // allForeignTrend: a dict with the daily foreign data
                        "updateDate": [],                       // dates
                        "list": [                               // list: each data series and its values
                            {
                                "name": "累計確診",
                                "data": []
                            },
                            {
                                "name": "治癒",
                                "data": []
                            },
                            {
                                "name": "死亡",
                                "data": []
                            },
                            {
                                "name": "現有確診",
                                "data": []
                            },
                            {
                                "name": "新增確診",
                                "data": []
                            }
                        ]
                    },
                "topAddCountry": [                    // countries with the largest increase in confirmed cases
                        {
                            "name": "美國",
                            "value": 53162
                        }
                    ],
                "topOverseasInput": [                // provinces with the most imported cases
                    {
                        "name": "黑龍江",
                        "value": 386
                    }
                ]
            }
        ]
    }
</script>
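
To make the outline concrete, here is a minimal sketch that walks a trimmed copy of it. It assumes the parsed data has exactly the shape shown above; the sample values are copied from the outline, and real pages carry many more entries.

```python
# Trimmed sample copied from the outline above.
sample = {
    "component": [{
        "caseList": [{
            "area": "西藏", "confirmed": "1", "crued": "1", "died": "0",
            "subList": [{"city": "拉薩", "confirmed": "1", "crued": "1", "died": "0"}],
        }],
        "summaryDataIn": {"confirmed": "85307", "cured": "80144", "died": "4648"},
    }]
}

component = sample["component"][0]

# Province-level rows: the key is spelled "crued" in caseList ...
provinces = [(p["area"], int(p["confirmed"]), int(p["crued"]))
             for p in component["caseList"]]

# ... but "cured" in the summary block, so the two spellings must not be mixed up.
summary_cured = int(component["summaryDataIn"]["cured"])

# City-level rows sit one level deeper, under each province's subList.
cities = [(p["area"], c["city"])
          for p in component["caseList"] for c in p["subList"]]

print(provinces, summary_cured, cities)
```

All numbers arrive as strings, which is why the sketch converts them with int() before using them as numbers.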

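【4x00】Main Function main()

Data fetching, word cloud drawing, and map drawing are written in three separate files: data_get.py, data_wordcloud.py, and data_map.py. A main file, main.py, then calls the functions in these three files.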

import data_get
import data_wordcloud
import data_map

data_dict = data_get.init()
data_get.china_total_data(data_dict)
data_get.global_total_data(data_dict)
data_get.china_daily_data(data_dict)
data_get.foreign_daily_data(data_dict)

data_wordcloud.china_wordcloud()
data_wordcloud.global_wordcloud()

data_map.all_map()

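【5x00】Data Fetching Module data_get

【5x01】Initialization Function init()

Use the XPath expression //script[@id="captain-config"]/text() to extract the content of the script element, then convert it to a dict with json.loads so that the other functions can use it.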

import json

import openpyxl
import requests
from lxml import etree


def init():
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.13 Safari/537.36'
    }
    url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/'
    response = requests.get(url=url, headers=headers)
    tree = etree.HTML(response.text)
    # xpath() returns a list of text nodes; the first one holds the JSON string
    dict1 = tree.xpath('//script[@id="captain-config"]/text()')
    # parse the JSON string into a dict
    dict2 = json.loads(dict1[0])
    return dict2
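
If the page layout changes, dict1[0] above fails with a bare IndexError. The sketch below separates the parsing step from the network request so that it can be tried on a saved page, and it raises a clearer error when the script element is missing. It uses the standard library's html.parser instead of lxml purely to stay self-contained; extract_captain_config is a hypothetical helper name, not part of the original program.

```python
import json
from html.parser import HTMLParser


class _ConfigScriptParser(HTMLParser):
    """Collects the text inside <script id="captain-config">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "captain-config":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_captain_config(html_text):
    """Hypothetical helper: return the page data as a dict, or raise ValueError."""
    parser = _ConfigScriptParser()
    parser.feed(html_text)
    parser.close()
    payload = "".join(parser.chunks).strip()
    if not payload:
        raise ValueError("captain-config script not found; the page layout may have changed")
    return json.loads(payload)


# Tiny stand-in for a saved page (made-up content).
page = ('<html><body><script type="application/json" id="captain-config">'
        '{"component": [{"caseList": []}]}</script></body></html>')
config = extract_captain_config(page)
print(config["component"][0]["caseList"])
```

With this split, init() would only need to download the page and hand response.text to the helper.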

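【5x02】China Totals china_total_data()

def china_total_data(data):

    """
    1. Data for China's provinces / municipalities / autonomous regions / administrative regions
    Province / municipality / autonomous region / administrative region: area
    Current confirmed:              curConfirm
    Cumulative confirmed:           confirmed
    Cumulative cured:               crued
    Cumulative deaths:              died
    Change in current confirmed:    curConfirmRelative
    Change in cumulative confirmed: confirmedRelative
    Change in cumulative cured:     curedRelative
    Change in cumulative deaths:    diedRelative
    """

    wb = openpyxl.Workbook()            # create the workbook
    ws_china = wb.active                # get the active worksheet
    ws_china.title = "中國省份疫情數據"   # name the worksheet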
    ws_china.append(['省/直轄市/自治區/行政區', '現有確診', '累計確診', '累計治癒',
                     '累計死亡', '現有確診增量', '累計確診增量',
                     '累計治癒增量', '累計死亡增量'])
    china = data['component'][0]['caseList']
    for province in china:
        ws_china.append([province['area'],
                        province['curConfirm'],
                        province['confirmed'],
                        province['crued'],
                        province['died'],
                        province['curConfirmRelative'],
                        province['confirmedRelative'],
                        province['curedRelative'],
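                        province['diedRelative']])

    """
    2. Data for Chinese cities
    City: city
    Current confirmed: curConfirm
    Cumulative confirmed: confirmed
    Cumulative cured: crued
    Cumulative deaths: died
    Change in cumulative confirmed: confirmedRelative
    """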

    ws_city = wb.create_sheet('中國城市疫情數據')
    ws_city.append(['城市', '現有確診', '累計確診',
                    '累計治癒', '累計死亡', '累計確診增量'])
    for province in china:
        for city in province['subList']:
            # Some cities have no curConfirm value, so set it to '0'; empty crued and died values are replaced with '0' too
            if 'curConfirm' not in city:
                city['curConfirm'] = '0'
            if city['crued'] == '':
                city['crued'] = '0'
            if city['died'] == '':
                city['died'] = '0'
            ws_city.append([city['city'], city['curConfirm'], city['confirmed'],
                           city['crued'], city['died'], city['confirmedRelative']])

    """
    3. Time the China data was last updated: mapLastUpdatedTime
    """

    time_domestic = data['component'][0]['mapLastUpdatedTime']
    ws_time = wb.create_sheet('中國疫情數據更新時間')
    ws_time.column_dimensions['A'].width = 22  # adjust the column width
    ws_time.append(['中國疫情數據更新時間'])
    ws_time.append([time_domestic])

    wb.save('COVID-19-China.xlsx')
    print('中國疫情數據已保存至 COVID-19-China.xlsx!')
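
The city loop above patches three fields one by one. A more general version of the same idea is sketched below, assuming it is acceptable to treat every missing or empty field as '0'; normalize_city is a hypothetical helper, not part of the original program.

```python
def normalize_city(city, fields=("curConfirm", "confirmed", "crued", "died", "confirmedRelative")):
    """Hypothetical helper: return a copy in which missing or empty fields become '0'."""
    cleaned = dict(city)                      # copy, so the parsed page data stays untouched
    for field in fields:
        # dict.get returns None for a missing key; `or` also catches the empty string
        cleaned[field] = cleaned.get(field) or "0"
    return cleaned


# Made-up city record with one empty and two missing fields.
city = {"city": "拉薩", "confirmed": "1", "died": "", "crued": "1"}
cleaned_city = normalize_city(city)
print(cleaned_city)
```

Working on a copy keeps the original record intact, which helps when the same data dict is passed on to other functions.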

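【5x03】Global Totals global_total_data()

After the global totals were extracted, drawing the map revealed that China's data was missing, so China's data has to be inserted into the Excel file separately when the global data is written.

def global_total_data(data):

    """
    1. Data for every country
    Country: country
    Current confirmed: curConfirm
    Cumulative confirmed: confirmed
    Cumulative cured: crued
    Cumulative deaths: died
    Change in cumulative confirmed: confirmedRelative
    """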

    wb = openpyxl.Workbook()
    ws_global = wb.active
    ws_global.title = "全球各國疫情數據"

    # save the data by country
    countries = data['component'][0]['caseOutsideList']
    ws_global.append(['國家', '現有確診', '累計確診', '累計治癒', '累計死亡', '累計確診增量'])
    for country in countries:
        ws_global.append([country['area'],
                          country['curConfirm'],
                          country['confirmed'],
                          country['crued'],
                          country['died'],
                          country['confirmedRelative']])

    # save the data by continent
    continent = data['component'][0]['globalList']
    for area in continent:
        ws_foreign = wb.create_sheet(area['area'] + '疫情數據')
        ws_foreign.append(['國家', '現有確診', '累計確診', '累計治癒', '累計死亡', '累計確診增量'])
        for country in area['subList']:
            ws_foreign.append([country['country'],
                               country['curConfirm'],
                               country['confirmed'],
                               country['crued'],
                               country['died'],
                               country['confirmedRelative']])

    # write China's data into both the '全球各國疫情數據' (all countries) and '亞洲疫情數據' (Asia) sheets
    ws1, ws2 = wb['全球各國疫情數據'], wb['亞洲疫情數據']
    original_data = data['component'][0]['summaryDataIn']
    add_china_data = ['中國',
                      original_data['curConfirm'],
                      original_data['confirmed'],
                      original_data['cured'],
                      original_data['died'],
                      original_data['confirmedRelative']]
    ws1.append(add_china_data)
    ws2.append(add_china_data)

    """
    2. Time the global data was last updated: foreignLastUpdatedTime
    """

    time_foreign = data['component'][0]['foreignLastUpdatedTime']
    ws_time = wb.create_sheet('全球疫情數據更新時間')
    ws_time.column_dimensions['A'].width = 22  # adjust the column width
    ws_time.append(['全球疫情數據更新時間'])
    ws_time.append([time_foreign])

    wb.save('COVID-19-Global.xlsx')
    print('全球疫情數據已保存至 COVID-19-Global.xlsx!')

【5x04】China Daily Data china_daily_data()

def china_daily_data(data):

    """
    ccd_dict = data['component'][0]['trend']
    ccd_dict['updateDate']: dates
    ccd_dict['list'][0]: confirmed
    ccd_dict['list'][1]: suspected
    ccd_dict['list'][2]: cured
    ccd_dict['list'][3]: deaths
    ccd_dict['list'][4]: new confirmed
    ccd_dict['list'][5]: new suspected
    ccd_dict['list'][6]: new cured
    ccd_dict['list'][7]: new deaths
    ccd_dict['list'][8]: cumulative imported cases
    ccd_dict['list'][9]: new imported cases
    """

    ccd_dict = data['component'][0]['trend']
    update_date = ccd_dict['updateDate']              # dates
    china_confirmed = ccd_dict['list'][0]['data']     # daily cumulative confirmed
    china_crued = ccd_dict['list'][2]['data']         # daily cumulative cured
    china_died = ccd_dict['list'][3]['data']          # daily cumulative deaths
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')

    # write the daily cumulative confirmed data
    ws_china_confirmed = wb.create_sheet('中國每日累計確診數據')
    ws_china_confirmed.append(['日期', '數據'])
    for row in zip(update_date, china_confirmed):    # `row` avoids shadowing the `data` parameter
        ws_china_confirmed.append(row)

    # write the daily cumulative cured data
    ws_china_crued = wb.create_sheet('中國每日累計治癒數據')
    ws_china_crued.append(['日期', '數據'])
    for row in zip(update_date, china_crued):
        ws_china_crued.append(row)

    # write the daily cumulative deaths data
    ws_china_died = wb.create_sheet('中國每日累計死亡數據')
    ws_china_died.append(['日期', '數據'])
    for row in zip(update_date, china_died):
        ws_china_died.append(row)

    wb.save('COVID-19-China.xlsx')
    print('中國每日累計確診/治癒/死亡數據已保存至 COVID-19-China.xlsx!')
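
The function above picks each series by its position in the list, which silently returns the wrong numbers if the page ever reorders the entries. One possible alternative, sketched here with made-up numbers, is to look the series up by name. The names are written as they appear in the outline in section 3x00; the live page may spell them differently (for example in Simplified Chinese), so treat the keys as an assumption. series_by_name is a hypothetical helper.

```python
def series_by_name(trend):
    """Hypothetical helper: map each series name to its list of values."""
    return {item["name"]: item["data"] for item in trend["list"]}


# Made-up numbers in the shape of data['component'][0]['trend'].
trend = {
    "updateDate": ["7.3", "7.4"],
    "list": [
        {"name": "確診", "data": [85287, 85307]},
        {"name": "治癒", "data": [80117, 80144]},
    ],
}

series = series_by_name(trend)
rows = list(zip(trend["updateDate"], series["確診"]))   # (date, value) pairs ready for ws.append
print(rows)
```

A missing name then raises a KeyError at once instead of producing a chart with mislabeled data.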

【5x05】Foreign Daily Data foreign_daily_data()

def foreign_daily_data(data):

    """
    te_dict = data['component'][0]['allForeignTrend']
    te_dict['updateDate']: dates
    te_dict['list'][0]: cumulative confirmed
    te_dict['list'][1]: cured
    te_dict['list'][2]: deaths
    te_dict['list'][3]: current confirmed
    te_dict['list'][4]: new confirmed
    """

    te_dict = data['component'][0]['allForeignTrend']
    update_date = te_dict['updateDate']                # dates
    foreign_confirmed = te_dict['list'][0]['data']     # daily cumulative confirmed
    foreign_crued = te_dict['list'][1]['data']         # daily cumulative cured
    foreign_died = te_dict['list'][2]['data']          # daily cumulative deaths
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')

    # write the daily cumulative confirmed data
    ws_foreign_confirmed = wb.create_sheet('境外每日累計確診數據')
    ws_foreign_confirmed.append(['日期', '數據'])
    for row in zip(update_date, foreign_confirmed):    # `row` avoids shadowing the `data` parameter
        ws_foreign_confirmed.append(row)

    # write the daily cumulative cured data
    ws_foreign_crued = wb.create_sheet('境外每日累計治癒數據')
    ws_foreign_crued.append(['日期', '數據'])
    for row in zip(update_date, foreign_crued):
        ws_foreign_crued.append(row)

    # write the daily cumulative deaths data
    ws_foreign_died = wb.create_sheet('境外每日累計死亡數據')
    ws_foreign_died.append(['日期', '數據'])
    for row in zip(update_date, foreign_died):
        ws_foreign_died.append(row)

    wb.save('COVID-19-Global.xlsx')
    print('境外每日累計確診/治癒/死亡數據已保存至 COVID-19-Global.xlsx!')

【6x00】Word Cloud Module data_wordcloud

【6x01】China Cumulative Confirmed Word Cloud china_wordcloud()

import openpyxl
import wordcloud


def china_wordcloud():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')  # open the existing xlsx file
    ws_china = wb['中國省份疫情數據']                     # get the provincial data sheet
    ws_china.delete_rows(1)                             # drop the header row (in memory only)
    china_dict = {}                                     # province -> cumulative confirmed
    for data in ws_china.values:
        china_dict[data[0]] = int(data[2])
    word_cloud = wordcloud.WordCloud(font_path='C:/Windows/Fonts/simsun.ttc',
                                     background_color='#CDC9C9',
                                     min_font_size=15,
                                     width=900, height=500)
    word_cloud.generate_from_frequencies(china_dict)
    word_cloud.to_file('WordCloud-China.png')
    print('中國省份疫情詞雲圖繪製完畢!')

【6x02】Global Cumulative Confirmed Word Cloud global_wordcloud()

def global_wordcloud():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_global = wb['全球各國疫情數據']
    ws_global.delete_rows(1)
    global_dict = {}
    for data in ws_global.values:
        global_dict[data[0]] = int(data[2])
    word_cloud = wordcloud.WordCloud(font_path='C:/Windows/Fonts/simsun.ttc',
                                     background_color='#CDC9C9',
                                     width=900, height=500)
    word_cloud.generate_from_frequencies(global_dict)
    word_cloud.to_file('WordCloud-Global.png')
    print('全球各國疫情詞雲圖繪製完畢!')
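
Both word cloud functions remove the header row from the loaded sheet before reading it. A sketch of an alternative that leaves the sheet untouched and simply skips the first row is shown below. It works on a plain list of rows so it can be tried without an Excel file; with openpyxl the argument would be list(ws_china.values). frequencies_from_rows is a hypothetical helper and the numbers are illustrative only.

```python
def frequencies_from_rows(rows, name_col=0, value_col=2):
    """Hypothetical helper: build {name: count} from sheet rows, skipping the header row."""
    freq = {}
    for row in rows[1:]:                          # rows[0] is the header
        freq[row[name_col]] = int(row[value_col])  # values are stored as strings in the sheet
    return freq


# Stand-in for list(ws.values); the counts are illustrative, not real data.
rows = [
    ("省/直轄市/自治區/行政區", "現有確診", "累計確診"),
    ("西藏", "0", "1"),
    ("湖北", "0", "100"),
]
freq = frequencies_from_rows(rows)
print(freq)
```

The resulting dict has the shape that generate_from_frequencies expects in the functions above.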

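【7x00】Map Drawing Module data_map

【7x01】China Cumulative Confirmed Map china_total_map()

import openpyxl
from pyecharts import options as opts
from pyecharts.charts import Line, Map


def china_total_map():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')  # open the existing xlsx file
    ws_time = wb['中國疫情數據更新時間']                   # sheet holding the China update time
    ws_data = wb['中國省份疫情數據']                      # sheet holding the provincial data
    ws_data.delete_rows(1)                              # drop the header row
    province = []                                       # provinces
    curconfirm = []                                     # cumulative confirmed (third column)
    for data in ws_data.values:
        province.append(data[0])
        curconfirm.append(data[2])
    time_china = ws_time['A2'].value                    # update time

    # colors for each value range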
    pieces = [
        {'max': 0, 'min': 0, 'label': '0', 'color': '#FFFFFF'},
        {'max': 9, 'min': 1, 'label': '1-9', 'color': '#FFE5DB'},
        {'max': 99, 'min': 10, 'label': '10-99', 'color': '#FF9985'},
        {'max': 999, 'min': 100, 'label': '100-999', 'color': '#F57567'},
        {'max': 9999, 'min': 1000, 'label': '1000-9999', 'color': '#E64546'},
        {'max': 99999, 'min': 10000, 'label': '≧10000', 'color': '#B80909'}
    ]

    # draw the map
    ct_map = (
        Map()
        .add(series_name='累計確診人數', data_pair=[list(z) for z in zip(province, curconfirm)], maptype="china")
        .set_global_opts(
            title_opts=opts.TitleOpts(title="中國疫情數據(累計確診)",
                                      subtitle='數據更新至:' + time_china + '\n\n來源:百度疫情實時大數據報告'),
            visualmap_opts=opts.VisualMapOpts(max_=300, is_piecewise=True, pieces=pieces)
        )
    )
    return ct_map
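
To see how the pieces list partitions the values, the sketch below mimics the range check in plain Python. It is only an illustration of the min/max ranges, not pyecharts' own implementation; find_piece is a hypothetical helper, and the colors are left out because they do not affect the lookup.

```python
def find_piece(value, pieces):
    """Return the label of the first piece whose [min, max] range contains value, else None."""
    for piece in pieces:
        if piece["min"] <= value <= piece["max"]:
            return piece["label"]
    return None


# The ranges of the China map above, without the colors.
pieces = [
    {'max': 0, 'min': 0, 'label': '0'},
    {'max': 9, 'min': 1, 'label': '1-9'},
    {'max': 99, 'min': 10, 'label': '10-99'},
    {'max': 999, 'min': 100, 'label': '100-999'},
    {'max': 9999, 'min': 1000, 'label': '1000-9999'},
    {'max': 99999, 'min': 10000, 'label': '≧10000'},
]

for value in (0, 7, 515, 85307, 100000):
    print(value, find_piece(value, pieces))
```

Note that the last range is labeled ≧10000 but stops at 99999, so a value of 100000 or more matches no range in this check; raising that max would be one way to keep the label accurate.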

【7x02】Global Cumulative Confirmed Map global_total_map()

def global_total_map():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_time = wb['全球疫情數據更新時間']
    ws_data = wb['全球各國疫情數據']
    ws_data.delete_rows(1)
    country = []                        # countries
    curconfirm = []                     # cumulative confirmed (third column)
    for data in ws_data.values:
        country.append(data[0])
        curconfirm.append(data[2])
    time_global = ws_time['A2'].value   # update time

    # mapping from English country names to Chinese names
    name_map = {
          "Somalia": "索馬里",
          "Liechtenstein": "列支敦士登",
          "Morocco": "摩洛哥",
          "W. Sahara": "西撒哈拉",
          "Serbia": "塞爾維亞",
          "Afghanistan": "阿富汗",
          "Angola": "安哥拉",
          "Albania": "阿爾巴尼亞",
          "Andorra": "安道爾共和國",
          "United Arab Emirates": "阿拉伯聯合酋長國",
          "Argentina": "阿根廷",
          "Armenia": "亞美尼亞",
          "Australia": "澳大利亞",
          "Austria": "奧地利",
          "Azerbaijan": "阿塞拜疆",
          "Burundi": "布隆迪",
          "Belgium": "比利時",
          "Benin": "貝寧",
          "Burkina Faso": "布基納法索",
          "Bangladesh": "孟加拉國",
          "Bulgaria": "保加利亞",
          "Bahrain": "巴林",
          "Bahamas": "巴哈馬",
          "Bosnia and Herz.": "波斯尼亞和黑塞哥維那",
          "Belarus": "白俄羅斯",
          "Belize": "伯利茲",
          "Bermuda": "百慕大",
          "Bolivia": "玻利維亞",
          "Brazil": "巴西",
          "Barbados": "巴巴多斯",
          "Brunei": "文萊",
          "Bhutan": "不丹",
          "Botswana": "博茨瓦納",
          "Central African Rep.": "中非共和國",
          "Canada": "加拿大",
          "Switzerland": "瑞士",
          "Chile": "智利",
          "China": "中國",
          "Côte d'Ivoire": "科特迪瓦",
          "Cameroon": "喀麥隆",
          "Dem. Rep. Congo": "剛果(金)",
          "Congo": "剛果(布)",
          "Colombia": "哥倫比亞",
          "Cape Verde": "佛得角",
          "Costa Rica": "哥斯達黎加",
          "Cuba": "古巴",
          "N. Cyprus": "北塞浦路斯",
          "Cyprus": "塞浦路斯",
          "Czech Rep.": "捷克",
          "Germany": "德國",
          "Djibouti": "吉布提",
          "Denmark": "丹麥",
          "Dominican Rep.": "多米尼加",
          "Algeria": "阿爾及利亞",
          "Ecuador": "厄瓜多爾",
          "Egypt": "埃及",
          "Eritrea": "厄立特里亞",
          "Spain": "西班牙",
          "Estonia": "愛沙尼亞",
          "Ethiopia": "埃塞俄比亞",
          "Finland": "芬蘭",
          "Fiji": "斐濟",
          "France": "法國",
          "Gabon": "加蓬",
          "United Kingdom": "英國",
          "Georgia": "格魯吉亞",
          "Ghana": "加納",
          "Guinea": "幾內亞",
          "Gambia": "岡比亞",
          "Guinea-Bissau": "幾內亞比紹",
          "Eq. Guinea": "赤道幾內亞",
          "Greece": "希臘",
          "Grenada": "格林納達",
          "Greenland": "格陵蘭島",
          "Guatemala": "危地馬拉",
          "Guam": "關島",
          "Guyana": "圭亞那合作共和國",
          "Honduras": "洪都拉斯",
          "Croatia": "克羅地亞",
          "Haiti": "海地",
          "Hungary": "匈牙利",
          "Indonesia": "印度尼西亞",
          "India": "印度",
          "Br. Indian Ocean Ter.": "英屬印度洋領土",
          "Ireland": "愛爾蘭",
          "Iran": "伊朗",
          "Iraq": "伊拉克",
          "Iceland": "冰島",
          "Israel": "以色列",
          "Italy": "意大利",
          "Jamaica": "牙買加",
          "Jordan": "約旦",
          "Japan": "日本",
          "Siachen Glacier": "錫亞琴冰川",
          "Kazakhstan": "哈薩克斯坦",
          "Kenya": "肯尼亞",
          "Kyrgyzstan": "吉爾吉斯斯坦",
          "Cambodia": "柬埔寨",
          "Korea": "韓國",
          "Kuwait": "科威特",
          "Lao PDR": "老撾",
          "Lebanon": "黎巴嫩",
          "Liberia": "利比里亞",
          "Libya": "利比亞",
          "Sri Lanka": "斯里蘭卡",
          "Lesotho": "萊索托",
          "Lithuania": "立陶宛",
          "Luxembourg": "盧森堡",
          "Latvia": "拉脫維亞",
          "Moldova": "摩爾多瓦",
          "Madagascar": "馬達加斯加",
          "Mexico": "墨西哥",
          "Macedonia": "馬其頓",
          "Mali": "馬裏",
          "Malta": "馬耳他",
          "Myanmar": "緬甸",
          "Montenegro": "黑山",
          "Mongolia": "蒙古國",
          "Mozambique": "莫桑比克",
          "Mauritania": "毛里塔尼亞",
          "Mauritius": "毛里求斯",
          "Malawi": "馬拉維",
          "Malaysia": "馬來西亞",
          "Namibia": "納米比亞",
          "New Caledonia": "新喀里多尼亞",
          "Niger": "尼日爾",
          "Nigeria": "尼日利亞",
          "Nicaragua": "尼加拉瓜",
          "Netherlands": "荷蘭",
          "Norway": "挪威",
          "Nepal": "尼泊爾",
          "New Zealand": "新西蘭",
          "Oman": "阿曼",
          "Pakistan": "巴基斯坦",
          "Panama": "巴拿馬",
          "Peru": "祕魯",
          "Philippines": "菲律賓",
          "Papua New Guinea": "巴布亞新幾內亞",
          "Poland": "波蘭",
          "Puerto Rico": "波多黎各",
          "Dem. Rep. Korea": "朝鮮",
          "Portugal": "葡萄牙",
          "Paraguay": "巴拉圭",
          "Palestine": "巴勒斯坦",
          "Qatar": "卡塔爾",
          "Romania": "羅馬尼亞",
          "Russia": "俄羅斯",
          "Rwanda": "盧旺達",
          "Saudi Arabia": "沙特阿拉伯",
          "Sudan": "蘇丹",
          "S. Sudan": "南蘇丹",
          "Senegal": "塞內加爾",
          "Singapore": "新加坡",
          "Solomon Is.": "所羅門羣島",
          "Sierra Leone": "塞拉利昂",
          "El Salvador": "薩爾瓦多",
          "Suriname": "蘇里南",
          "Slovakia": "斯洛伐克",
          "Slovenia": "斯洛文尼亞",
          "Sweden": "瑞典",
          "Swaziland": "斯威士蘭",
          "Seychelles": "塞舌爾",
          "Syria": "敘利亞",
          "Chad": "乍得",
          "Togo": "多哥",
          "Thailand": "泰國",
          "Tajikistan": "塔吉克斯坦",
          "Turkmenistan": "土庫曼斯坦",
          "Timor-Leste": "東帝汶",
          "Tonga": "湯加",
          "Trinidad and Tobago": "特立尼達和多巴哥",
          "Tunisia": "突尼斯",
          "Turkey": "土耳其",
          "Tanzania": "坦桑尼亞",
          "Uganda": "烏干達",
          "Ukraine": "烏克蘭",
          "Uruguay": "烏拉圭",
          "United States": "美國",
          "Uzbekistan": "烏茲別克斯坦",
          "Venezuela": "委內瑞拉",
          "Vietnam": "越南",
          "Vanuatu": "瓦努阿圖",
          "Yemen": "也門",
          "South Africa": "南非",
          "Zambia": "贊比亞",
          "Zimbabwe": "津巴布韋",
          "Aland": "奧蘭羣島",
          "American Samoa": "美屬薩摩亞",
          "Fr. S. Antarctic Lands": "南極洲",
          "Antigua and Barb.": "安提瓜和巴布達",
          "Comoros": "科摩羅",
          "Curaçao": "庫拉索島",
          "Cayman Is.": "開曼羣島",
          "Dominica": "多米尼加",
          "Falkland Is.": "福克蘭羣島馬爾維納斯",
          "Faeroe Is.": "法羅羣島",
          "Micronesia": "密克羅尼西亞",
          "Heard I. and McDonald Is.": "赫德島和麥克唐納羣島",
          "Isle of Man": "曼島",
          "Jersey": "澤西島",
          "Kiribati": "基里巴斯",
          "Saint Lucia": "聖盧西亞",
          "N. Mariana Is.": "北马里亞納羣島",
          "Montserrat": "蒙特塞拉特",
          "Niue": "紐埃",
          "Palau": "帕勞",
          "Fr. Polynesia": "法屬波利尼西亞",
          "S. Geo. and S. Sandw. Is.": "南喬治亞島和南桑威奇羣島",
          "Saint Helena": "聖赫勒拿",
          "St. Pierre and Miquelon": "聖皮埃爾和密克隆羣島",
          "São Tomé and Principe": "聖多美和普林西比",
          "Turks and Caicos Is.": "特克斯和凱科斯羣島",
          "St. Vin. and Gren.": "聖文森特和格林納丁斯",
          "U.S. Virgin Is.": "美屬維爾京羣島",
          "Samoa": "薩摩亞"
        }

    pieces = [
        {'max': 0, 'min': 0, 'label': '0', 'color': '#FFFFFF'},
        {'max': 49, 'min': 1, 'label': '1-49', 'color': '#FFE5DB'},
        {'max': 99, 'min': 50, 'label': '50-99', 'color': '#FFC4B3'},
        {'max': 999, 'min': 100, 'label': '100-999', 'color': '#FF9985'},
        {'max': 9999, 'min': 1000, 'label': '1000-9999', 'color': '#F57567'},
        {'max': 99999, 'min': 10000, 'label': '10000-99999', 'color': '#E64546'},
        {'max': 999999, 'min': 100000, 'label': '100000-999999', 'color': '#B80909'},
        {'max': 9999999, 'min': 1000000, 'label': '≧1000000', 'color': '#8A0808'}
    ]

    gt_map = (
        Map()
        .add(series_name='累計確診人數', data_pair=[list(z) for z in zip(country, curconfirm)], maptype="world", name_map=name_map, is_map_symbol_show=False)
        .set_series_opts(label_opts=opts.LabelOpts(is_show=False))
        .set_global_opts(
            title_opts=opts.TitleOpts(title="全球疫情數據(累計確診)",
                                      subtitle='數據更新至:' + time_global + '\n\n來源:百度疫情實時大數據報告'),
            visualmap_opts=opts.VisualMapOpts(max_=300, is_piecewise=True, pieces=pieces),
        )
    )
    return gt_map
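
Because the data rows carry Chinese names while the world map uses English region names, a name that is missing from name_map cannot be matched to a region, and two English names that share one Chinese name compete for the same data row. The sketch below shows two small checks for this. Both helpers are hypothetical; the excerpt reuses the two entries from the map above that share the name 多米尼加.

```python
from collections import Counter


def unmapped_names(data_names, name_map):
    """Hypothetical helper: names present in the data but absent from name_map's values."""
    known = set(name_map.values())
    return sorted(set(data_names) - known)


def duplicated_targets(name_map):
    """Hypothetical helper: Chinese names that more than one English name maps to."""
    counts = Counter(name_map.values())
    return sorted(name for name, n in counts.items() if n > 1)


# Small excerpt of the mapping above.
name_map = {
    "Italy": "意大利",
    "Turkey": "土耳其",
    "Dominican Rep.": "多米尼加",
    "Dominica": "多米尼加",
}
# Names as they might appear in the sheet (taken from the outline in section 3x00).
data_names = ["意大利", "土耳其", "鑽石公主號郵輪"]

missing = unmapped_names(data_names, name_map)
duplicates = duplicated_targets(name_map)
print(missing)
print(duplicates)
```

Entries such as the cruise ship have no region on a world map, so showing up in the first list is expected for them; real countries in that list point to a gap in name_map.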

【7x03】China Daily Data Line Chart china_daily_map()

def china_daily_map():
    wb = openpyxl.load_workbook('COVID-19-China.xlsx')
    ws_china_confirmed = wb['中國每日累計確診數據']
    ws_china_crued = wb['中國每日累計治癒數據']
    ws_china_died = wb['中國每日累計死亡數據']

    ws_china_confirmed.delete_rows(1)
    ws_china_crued.delete_rows(1)
    ws_china_died.delete_rows(1)

    x_date = []               # dates
    y_china_confirmed = []    # daily cumulative confirmed
    y_china_crued = []        # daily cumulative cured
    y_china_died = []         # daily cumulative deaths

    for china_confirmed in ws_china_confirmed.values:
        y_china_confirmed.append(china_confirmed[1])
    for china_crued in ws_china_crued.values:
        x_date.append(china_crued[0])
        y_china_crued.append(china_crued[1])
    for china_died in ws_china_died.values:
        y_china_died.append(china_died[1])

    fi_map = (
        Line(init_opts=opts.InitOpts(height='420px'))
            .add_xaxis(xaxis_data=x_date)
            .add_yaxis(
            series_name="中國累計確診數據",
            y_axis=y_china_confirmed,
            label_opts=opts.LabelOpts(is_show=False),
        )
            .add_yaxis(
            series_name="中國累計治癒趨勢",
            y_axis=y_china_crued,
            label_opts=opts.LabelOpts(is_show=False),
        )
            .add_yaxis(
            series_name="中國累計死亡趨勢",
            y_axis=y_china_died,
            label_opts=opts.LabelOpts(is_show=False),
        )
            .set_global_opts(
            title_opts=opts.TitleOpts(title="中國每日累計確診/治癒/死亡趨勢"),
            legend_opts=opts.LegendOpts(pos_bottom="bottom", orient='horizontal'),
            tooltip_opts=opts.TooltipOpts(trigger="axis"),
            yaxis_opts=opts.AxisOpts(
                type_="value",
                axistick_opts=opts.AxisTickOpts(is_show=True),
                splitline_opts=opts.SplitLineOpts(is_show=True),
            ),
            xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False),
        )
    )
    return fi_map
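
The three loops above turn (date, value) rows into separate x and y lists. A compact way to express the same step is to transpose the rows with zip, as in this sketch. split_columns is a hypothetical helper and the rows are made-up stand-ins for list(ws.values).

```python
def split_columns(rows):
    """Hypothetical helper: turn [(date, value), ...] with a header row into two lists."""
    body = rows[1:]                    # skip the header row instead of deleting it
    if not body:
        return [], []
    dates, values = zip(*body)         # transpose rows into columns
    return list(dates), list(values)


# Made-up rows in the shape written by china_daily_data().
rows = [("日期", "數據"), ("7.3", 85287), ("7.4", 85307)]
x_date, y_values = split_columns(rows)
print(x_date, y_values)
```

The returned lists can be passed straight to add_xaxis and add_yaxis.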

【7x04】Foreign Daily Data Line Chart foreign_daily_map()

def foreign_daily_map():
    wb = openpyxl.load_workbook('COVID-19-Global.xlsx')
    ws_foreign_confirmed = wb['境外每日累計確診數據']
    ws_foreign_crued = wb['境外每日累計治癒數據']
    ws_foreign_died = wb['境外每日累計死亡數據']

    ws_foreign_confirmed.delete_rows(1)
    ws_foreign_crued.delete_rows(1)
    ws_foreign_died.delete_rows(1)

    x_date = []                # dates
    y_foreign_confirmed = []   # cumulative confirmed
    y_foreign_crued = []       # cumulative cured
    y_foreign_died = []        # cumulative deaths

    for foreign_confirmed in ws_foreign_confirmed.values:
        y_foreign_confirmed.append(foreign_confirmed[1])
    for foreign_crued in ws_foreign_crued.values:
        x_date.append(foreign_crued[0])
        y_foreign_crued.append(foreign_crued[1])
    for foreign_died in ws_foreign_died.values:
        y_foreign_died.append(foreign_died[1])

    fte_map = (
        Line(init_opts=opts.InitOpts(height='420px'))
            .add_xaxis(xaxis_data=x_date)
            .add_yaxis(
            series_name="境外累計確診趨勢",
            y_axis=y_foreign_confirmed,
            label_opts=opts.LabelOpts(is_show=False),
        )
            .add_yaxis(
            series_name="境外累計治癒趨勢",
            y_axis=y_foreign_crued,
            label_opts=opts.LabelOpts(is_show=False),
        )
            .add_yaxis(
            series_name="境外累計死亡趨勢",
            y_axis=y_foreign_died,
            label_opts=opts.LabelOpts(is_show=False),
        )
            .set_global_opts(
            title_opts=opts.TitleOpts(title="境外每日累計確診/治癒/死亡趨勢"),
            legend_opts=opts.LegendOpts(pos_bottom="bottom", orient='horizontal'),
            tooltip_opts=opts.TooltipOpts(trigger="axis"),
            yaxis_opts=opts.AxisOpts(
                type_="value",
                axistick_opts=opts.AxisTickOpts(is_show=True),
                splitline_opts=opts.SplitLineOpts(is_show=True),
            ),
            xaxis_opts=opts.AxisOpts(type_="category", boundary_gap=False),
        )
    )
    return fte_map

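【8x00】Result Screenshots

【8x01】Data Stored in Excel

[Figure 03]

[Figure 04]

【8x02】Word Clouds

[Figure 05]

[Figure 06]

【8x03】Maps + Line Charts

[Figure 07]

【9x00】Complete Code

Preview: http://cov.itrhx.com/

Complete code (a star brings a buff): https://github.com/TRHX/Python3-Spider-Practice/tree/master/COVID-19

Collection of other crawler practice code (continuously updated): https://github.com/TRHX/Python3-Spider-Practice

Crawler practice column (continuously updated): https://itrhx.blog.csdn.net/article/category/9351278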

