The Python concurrent.futures module

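To run code on multiple threads or processes in Python, most people reach for the threading and multiprocessing modules in the standard library.
Since Python 3.2, however, the standard library has also provided the concurrent.futures module. It offers two classes, ThreadPoolExecutor and ProcessPoolExecutor, which add a further layer of abstraction over threading and multiprocessing, so that a few lines of code are enough to run tasks concurrently or in parallel.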

Reference: the Python 3.8 documentation

As the hierarchy below shows, Executor is an abstract base class:

Executor (Abstract Base Class)
│
├── ThreadPoolExecutor
│
│   │A concrete subclass of the Executor class to
│   │manage I/O bound tasks with threading underneath
│
├── ProcessPoolExecutor
│
│   │A concrete subclass of the Executor class to
│   │manage CPU bound tasks with multiprocessing underneath
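
Because both pools derive from Executor, code written against the base class should work with either one. The following is a minimal sketch of that idea, not part of the original article: square() and run_all() are hypothetical names chosen for illustration, and only the thread pool is exercised here.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def square(n):
    """Hypothetical task used only for illustration."""
    return n * n


def run_all(executor: Executor, numbers):
    """Accept any Executor subclass; map() is part of the shared interface.

    executor.map() yields results in the order of the inputs,
    regardless of which task finishes first.
    """
    return list(executor.map(square, numbers))


# Both concrete pools are subclasses of the abstract Executor.
assert issubclass(ThreadPoolExecutor, Executor)
assert issubclass(ProcessPoolExecutor, Executor)

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=3) as pool:
        print(run_all(pool, [1, 2, 3]))
```

Swapping in ProcessPoolExecutor would only require changing the class used in the with statement, provided the task function and its arguments can be pickled.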

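The four most commonly used pieces of the API are:

  1. The Executor class. Calling executor.submit(func) schedules func() to run and returns a Future instance that you can query later.
  2. Future.done() reports whether the operation has finished: True if the call completed or was cancelled, False otherwise. Note that done() is non-blocking and returns immediately. The related add_done_callback(fn) registers fn to be called, with the future as its only argument, once the future completes.
  3. Future.result() returns the value produced by the call, or re-raises the exception the call raised. If the future has not finished yet, result() blocks until it does (or until the optional timeout expires).
  4. as_completed(fs) takes an iterable of futures fs and returns an iterator that yields each future as soon as it completes, in completion order rather than submission order.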
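
The four calls above can be tried without any network access. The sketch below is an illustration under assumptions: gated_square() and demo() are hypothetical helpers, and a threading.Event is used only to keep the task from finishing early so that the difference between done() and result() is visible.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def demo():
    gate = threading.Event()  # keeps the task waiting until we release it
    log = []                  # filled in by the done-callback

    def gated_square(n):
        # Hypothetical task: wait for the gate, then return n squared.
        gate.wait(timeout=5)
        return n * n

    with ThreadPoolExecutor(max_workers=2) as executor:
        future = executor.submit(gated_square, 3)       # returns a Future at once
        future.add_done_callback(lambda f: log.append(f.result()))

        before = future.done()   # non-blocking; the task is still waiting
        gate.set()               # let the task finish
        value = future.result()  # blocks until the result is available
        after = future.done()    # the future has completed by now

        # as_completed() yields futures in completion order, so sort
        # the values if a stable order is needed.
        futures = [executor.submit(pow, 2, k) for k in range(4)]
        results = sorted(f.result() for f in as_completed(futures))

    # Leaving the with block waits for the workers, so the callback has run.
    return before, value, after, log, results


if __name__ == "__main__":
    print(demo())
```

The callback runs in the worker thread that completed the future, which is why the log is read only after the with block has exited.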

Example:
Single-threaded version that fetches a set of web pages

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import time
import requests


def download_one(url):
    resp = requests.get(url)
    print('Read {} from {}'.format(len(resp.content), url))


def download_all(sites):
    for site in sites:
        download_one(site)


def main():
    sites = [
         'http://www.zongheng.com/category/1.html',
         'http://www.zongheng.com/category/3.html',
         'http://www.zongheng.com/category/6.html',
         'http://www.zongheng.com/category/9.html',
         'http://www.zongheng.com/category/15.html',
         'http://www.zongheng.com/category/18.html',
         'http://www.zongheng.com/category/21.html',
         'http://www.zongheng.com/category/40.html',
         'http://www.zongheng.com/rank.html',
         'http://book.zongheng.com/store/c0/c0/b0/u0/p1/v9/s1/t0/u0/i1/ALL.html',
         'http://search.zongheng.com/s?keyword=%E9%9B%AA%E4%B8%AD%E6%82%8D%E5%88%80%E8%A1%8C'
    ]
    start_time = time.perf_counter()
    download_all(sites)
    end_time = time.perf_counter()
    print('Download {} sites in {} seconds'.format(len(sites), end_time - start_time))


if __name__ == '__main__':
    main()

Multi-threaded version

import time
import requests
import concurrent.futures


def download_one(url):
    resp = requests.get(url)  # thread-safe here: calls share no state, so no race condition
    print('Read {} from {}'.format(len(resp.content), url))


def download_all(sites):
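    """
    Option 1: multi-threaded, concurrent execution.
    If max_workers is omitted, Python 3.8 defaults to
    min(32, (os.cpu_count() or 1) + 4); with os.cpu_count() == 8 that is 12.
    More threads are not always better: creating, maintaining and
    destroying threads has a cost of its own.
    executor.map() calls download_one() concurrently for each element of sites.
    """
    # with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    #     executor.map(download_one, sites)
    """
    Option 2: multi-process, parallel execution.
    Parallelism mainly pays off for CPU-heavy work; this program is I/O bound,
    so the version below brings no gain over the thread pool.
    """
    # with concurrent.futures.ProcessPoolExecutor() as executor:
    #     executor.map(download_one, sites)
    """
    Option 3: submit() each task, then collect the futures with as_completed().
    """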
    with concurrent.futures.ThreadPoolExecutor() as executor:
        to_do = []
        for site in sites:
            future = executor.submit(download_one, site)
            to_do.append(future)
        for future in concurrent.futures.as_completed(to_do):
            future.result()



def main():
    sites = [
         'http://www.zongheng.com/category/1.html',
         'http://www.zongheng.com/category/3.html',
         'http://www.zongheng.com/category/6.html',
         'http://www.zongheng.com/category/9.html',
         'http://www.zongheng.com/category/15.html',
         'http://www.zongheng.com/category/18.html',
         'http://www.zongheng.com/category/21.html',
         'http://www.zongheng.com/category/40.html',
         'http://www.zongheng.com/rank.html',
         'http://book.zongheng.com/store/c0/c0/b0/u0/p1/v9/s1/t0/u0/i1/ALL.html',
         'http://search.zongheng.com/s?keyword=%E9%9B%AA%E4%B8%AD%E6%82%8D%E5%88%80%E8%A1%8C'
    ]
    start_time = time.perf_counter()
    download_all(sites)
    end_time = time.perf_counter()
    print('Download {} sites in {} seconds'.format(len(sites), end_time - start_time))


if __name__ == '__main__':
    main()
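
Since result() re-raises whatever exception the task raised, one failing URL would abort the loop in the third approach. A common way to handle this is to map each future back to its input and catch errors per task. The following is a sketch under assumptions, not the article's code: fake_download() is a hypothetical stand-in for download_one() that avoids network access, and the example.com-style URLs are placeholders.

```python
import concurrent.futures


def fake_download(url):
    """Hypothetical stand-in for download_one(); performs no network access."""
    if "bad" in url:
        raise ValueError("cannot fetch " + url)
    return len(url)  # pretend this is the size of the response body


def download_all_safely(urls):
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        # Remember which URL each future belongs to.
        future_to_url = {executor.submit(fake_download, u): u for u in urls}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()  # re-raises the task's exception
            except ValueError as exc:
                errors[url] = str(exc)
    return results, errors


if __name__ == "__main__":
    ok, failed = download_all_safely(["http://a.example/1", "http://bad.example/2"])
    print(ok)
    print(failed)
```

With real requests calls, the except clause would more plausibly catch requests.RequestException; ValueError is used here only because the stand-in raises it.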