concurrent.futures is a high-level library that operates at the "task" level, meaning you no longer have to manage synchronization or the threads/processes themselves. A Future is an extension of the producer-consumer model: in that model, the producer cares neither about when the consumer finishes processing the data nor about the result of that processing. You simply create a thread/process pool with a given "max_workers" count, then submit tasks and collect the results. Another advantage over using the threading and multiprocessing modules directly for multi-threaded/multi-process work is that frequently creating and destroying threads or processes is very expensive; concurrent.futures maintains its own thread/process pool, trading space for time.
concurrent.futures provides two executor classes: concurrent.futures.ThreadPoolExecutor (thread pool), usually used for I/O-bound work, and concurrent.futures.ProcessPoolExecutor (process pool), usually used for CPU-bound work. The reason for this split is Python's GIL: multiple threads within one process can only use a single CPU core at a time, so CPU-bound code gains nothing from threads; I won't belabor that here. The two classes are used in exactly the same way.
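Because the two executors share one interface, switching between them is a one-line change. A minimal sketch of this (the helper run_with and the task function square are illustrative names, not part of the library):

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor


def square(x):
    # Must be a module-level function so ProcessPoolExecutor can pickle it.
    return x * x


def run_with(executor_cls, data):
    # The same code drives both pools: only the executor class differs.
    with executor_cls(max_workers=4) as executor:
        return list(executor.map(square, data))


if __name__ == '__main__':
    print(run_with(ThreadPoolExecutor, range(5)))   # threads: I/O-bound work
    print(run_with(ProcessPoolExecutor, range(5)))  # processes: CPU-bound work
```

Note the `if __name__ == '__main__':` guard: ProcessPoolExecutor spawns worker processes that re-import the module, so the entry point must be protected.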
The commonly used methods of ThreadPoolExecutor/ProcessPoolExecutor are:
1. When constructing a ThreadPoolExecutor/ProcessPoolExecutor instance, pass the max_workers argument to set the maximum number of threads (or processes) that can run concurrently in the pool.
2. submit(fn, *args, **kwargs) submits a task (a callable and its arguments) to the pool and returns a Future, a handle to the task (much like a file handle). Note that submit() does not block; it returns immediately.
3. done() reports whether the task has finished.
4. cancel() attempts to cancel a submitted task; if the task is already running in the pool, it cannot be cancelled.
5. result() retrieves the task's return value. Looking at the implementation, this method blocks until the result is available.
6. wait(fs, timeout=None, return_when=ALL_COMPLETED) takes three arguments: fs is the sequence of futures to wait on; timeout is the maximum time to wait, after which wait() returns even if some tasks have not finished; return_when is the condition for returning, defaulting to ALL_COMPLETED (return only after all tasks complete).
7. map(fn, *iterables, timeout=None, chunksize=1): the first argument fn is the function each task runs; the next arguments are iterables of inputs; timeout works like the timeout of wait(), but because map() returns the tasks' results, a TimeoutError is raised if timeout is shorter than the execution time.
8. as_completed(fs, timeout=None) yields each task's future one by one as it completes, rather than returning all results at once. Its docstring reads:
    An iterator over the given futures that yields each as it completes.

    Args:
        fs: The sequence of Futures (possibly created by different Executors) to
            iterate over.
        timeout: The maximum number of seconds to wait. If None, then there
            is no limit on the wait time.

    Returns:
        An iterator that yields the given Futures as they complete (finished or
        cancelled). If any given Futures are duplicated, they will be returned
        once.

    Raises:
        TimeoutError: If the entire result iterator could not be generated
            before the given timeout.
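Putting the methods above together, a small sketch (slow_double is a made-up task function, and 0.1s is an arbitrary delay) shows submit(), done(), wait() and as_completed() in action:

```python
import time
from concurrent.futures import ALL_COMPLETED, ThreadPoolExecutor, as_completed, wait


def slow_double(x):
    time.sleep(0.1)  # simulate a slow task
    return x * 2


with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() returns immediately with a Future handle for each task.
    futures = [executor.submit(slow_double, i) for i in range(4)]

    # done() is a non-blocking status check; this soon after submission
    # it will most likely still print False.
    print(futures[0].done())

    # wait() blocks until the return_when condition holds
    # (ALL_COMPLETED is the default).
    done, not_done = wait(futures, return_when=ALL_COMPLETED)
    print(len(done), len(not_done))  # 4 0

    # as_completed() yields futures in completion order, not submission
    # order, which is why collected results can come out unordered.
    results = [f.result() for f in as_completed(futures)]
    print(sorted(results))  # [0, 2, 4, 6]
```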
The following compares the efficiency of ThreadPoolExecutor and ProcessPoolExecutor on a CPU-bound workload:
import time
from concurrent.futures import ThreadPoolExecutor, as_completed, ProcessPoolExecutor


def get_fib(num):
    if num < 3:
        return 1
    return get_fib(num - 1) + get_fib(num - 2)


def run_thread_pool(workers, fib_num):
    start_time = time.time()
    with ThreadPoolExecutor(workers) as thread_executor:
        tasks = [thread_executor.submit(get_fib, num) for num in range(fib_num)]
        results = [task.result() for task in as_completed(tasks)]
        print(results)
    print("ThreadPoolExecutor spend time: {}s".format(time.time() - start_time))


def run_process_pool(workers, fib_num):
    start_time = time.time()
    with ProcessPoolExecutor(workers) as process_executor:
        tasks = [process_executor.submit(get_fib, num) for num in range(fib_num)]
        results = [task.result() for task in as_completed(tasks)]
        print(results)
    print("ProcessPoolExecutor spend time: {}s".format(time.time() - start_time))


if __name__ == '__main__':
    # run_thread_pool(6, 38)
    run_process_pool(6, 38)
The results are as follows:
[5, 2, 1, 1, 3, 1, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 28657, 17711, 121393, 75025, 46368, 317811, 196418, 514229, 1346269, 832040, 2178309, 3524578, 9227465, 5702887, 14930352, 24157817]
ThreadPoolExecutor spend time: 24.460843086242676s
[1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 144, 89, 377, 610, 987, 233, 1597, 2584, 4181, 6765, 10946, 17711, 28657, 75025, 46368, 121393, 196418, 514229, 317811, 1346269, 832040, 2178309, 5702887, 3524578, 9227465, 14930352, 24157817]
ProcessPoolExecutor spend time: 15.908910274505615s