Three ways to implement the producer-consumer model in Python (processes, threads, coroutines)

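I have been using Kafka a lot recently, which got me interested in the producer-consumer model. That made me wonder: without setting up Kafka, how would you implement producer-consumer yourself?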

Appetizer

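Process: the smallest unit to which the operating system allocates resources; it is a running instance of a program. When a program starts, the system creates a process, allocates resources to it, and places it on the ready queue. When the scheduler selects it, it is given CPU time and the program actually begins to run. Each process has its own independent memory space, and different processes exchange data through inter-process communication (IPC).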
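
To make the "separate process, talk through IPC" point concrete, here is a minimal sketch. It assumes a Python interpreter is available at sys.executable; the child program and the variable names are made up for illustration. The child is a separate process with its own pid, and the only way its result reaches the parent is through a pipe (its standard output).

```python
import os
import subprocess
import sys

# The child is a brand-new interpreter process. It shares no variables with
# the parent, so it reports its pid back over a pipe (its standard output).
child_code = "import os; print(os.getpid())"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True, text=True, check=True,
)
child_pid = int(result.stdout.strip())

print("parent process id:", os.getpid())
print("child process id: ", child_pid)
```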

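Thread: the smallest unit of program execution. A thread is one flow of execution within a process and is the basic unit of CPU scheduling and dispatch. A process contains at least one main thread and may contain many threads. Threads share the resources of their process (memory, open files), while each thread has its own stack and local variables.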
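
A small sketch of what "shared resources, private stack" means in practice. The names shared_log and worker are hypothetical, and the barrier is only there so that all three threads are alive at the same time and therefore get distinct thread ids.

```python
import os
import threading

shared_log = []                      # one list, visible to every thread in the process
log_lock = threading.Lock()          # protects the shared list
all_started = threading.Barrier(3)   # keeps the 3 threads alive simultaneously


def worker(label):
    # `entry` is a local variable: each thread has its own copy on its own stack.
    entry = (label, threading.get_ident(), os.getpid())
    # The list is module-level state, so every thread appends to the same object.
    with log_lock:
        shared_log.append(entry)
    all_started.wait(timeout=5)


threads = [threading.Thread(target=worker, args=("worker-%d" % i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

for label, thread_id, pid in sorted(shared_log):
    print(label, "thread id:", thread_id, "process id:", pid)
```

All three entries should show the same process id but different thread ids.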

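Coroutine: also called a micro-thread, a lightweight user-space thread whose scheduling is controlled entirely by the program. A coroutine has its own register context and stack. On a switch, the register context and stack are saved elsewhere and restored when switching back. Because the switch happens in user space, there is essentially no kernel-switch overhead, so context switches are very fast. And because coroutines on the same thread only switch at explicit points, global variables can usually be accessed without locks. In Python, user-controlled scheduling was first implemented with the yield keyword; later the asyncio module was added to support coroutines directly.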
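
To see the "pause, hand over control, resume with state intact" behaviour on its own, here is a minimal generator-based sketch. The averager function is a made-up example, not part of the bun shop code: the caller pushes values in with send(), and the generator keeps its local variables between switches.

```python
def averager():
    """Generator-based coroutine: receives numbers, yields the running mean."""
    total = 0.0
    count = 0
    mean = None
    while True:
        # Execution pauses here until the caller calls send(); the local
        # variables above survive across the pause.
        value = yield mean
        total += value
        count += 1
        mean = total / count


co = averager()
next(co)                                     # prime: run up to the first yield
means = [co.send(v) for v in (10, 20, 60)]   # each send() resumes the generator once
print(means)                                 # [10.0, 15.0, 30.0]
co.close()
```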

How the three relate:
A process contains threads, and a thread can host many coroutines.
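
A quick way to check this nesting, sketched with asyncio (the function name where_am_i is hypothetical): several coroutines driven by one event loop all report the same process id and the same thread id.

```python
import asyncio
import os
import threading


async def where_am_i(label):
    await asyncio.sleep(0)   # hand control back to the event loop once
    return label, os.getpid(), threading.get_ident()


async def main():
    # gather() runs the three coroutines concurrently on the current event loop.
    return await asyncio.gather(*(where_am_i("coro-%d" % i) for i in range(3)))


locations = asyncio.run(main())
for label, pid, thread_id in locations:
    print(label, "process id:", pid, "thread id:", thread_id)
```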

Main course

Enough talk; here is the code.

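Assumed scenario: Lao Zhou's bun shop has three bun chefs, Zhang San, Li Si, and Wang Wu, and two customers, Xiao Hong and Xiao Huang, come in to eat buns. The board in the shop holds at most 15 buns; once 15 buns are on the board, the chefs stop making buns until there is room again.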

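Multithreaded version
Approach: create two thread pools, a producer pool with 3 threads and a consumer pool with 2 threads. Because threads share all of their process's resources, a single global queue can hold the messages.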

import os
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor

q = queue.Queue(maxsize=15)  # the shared board: holds at most 15 buns


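def Producer(name):
    count = 1
    print("Producer {}: thread id {}, process id {}".format(name, threading.get_ident(), os.getpid()))
    while True:
        baozi = 'bun #%s made by %s' % (count, name)
        print(baozi)
        # put() blocks while the board already holds 15 buns, so the chef
        # pauses automatically until a customer frees a slot.
        q.put(baozi)
        count += 1
        time.sleep(1)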


def Consumer(name):
    print("Consumer {}: thread id {}, process id {}".format(name, threading.get_ident(), os.getpid()))
    while True:
        i = q.get()  # blocks while the board is empty
        print("%s ate %s" % (name, i))
        print("%d buns left" % q.qsize())
        time.sleep(1)


def run():
    """Start 3 producer threads and 2 consumer threads; they run until interrupted."""
    # 3 chefs
    producer_pool = ThreadPoolExecutor(max_workers=3)
    producer_pool.map(Producer, ['Zhang San', 'Li Si', 'Wang Wu'])

    # 2 customers
    consumer_pool = ThreadPoolExecutor(max_workers=2)
    consumer_pool.map(Consumer, ['Xiao Hong', 'Xiao Huang'])


if __name__ == '__main__':
    run()
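
The version above runs forever, which makes the queue behaviour hard to observe. Below is a small terminating sketch of the same idea, with a deliberately tiny board. The DONE sentinel and the names board, cook, and customer are my own illustrative choices, not part of the code above. It relies only on documented queue.Queue behaviour: put() blocks on a full queue, get() blocks on an empty one, and put_nowait() raises queue.Full instead of blocking.

```python
import queue
import threading

board = queue.Queue(maxsize=2)   # tiny board so it fills up quickly
DONE = object()                  # hypothetical sentinel meaning "no more buns"
eaten = []


def cook(n):
    for i in range(1, n + 1):
        board.put("bun #%d" % i)  # blocks while the board already holds 2 buns
    board.put(DONE)


def customer():
    while True:
        item = board.get()        # blocks while the board is empty
        if item is DONE:
            break
        eaten.append(item)


# put_nowait() never blocks; on a full queue it raises queue.Full instead.
demo = queue.Queue(maxsize=1)
demo.put_nowait("bun")
try:
    demo.put_nowait("one bun too many")
    overflowed = False
except queue.Full:
    overflowed = True

workers = [threading.Thread(target=cook, args=(5,)), threading.Thread(target=customer)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(eaten, overflowed)
```

With one cook and one customer the buns come out in the order they were made, even though the board never holds more than two at a time.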

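Multiprocess version
The approach is much the same as the multithreaded version, but processes cannot share a plain global variable the way threads can, so a queue created through Manager() is used to pass buns between processes.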

from multiprocessing import Pool, Manager
import time
import os
import threading


def producer(producer_queue, name):
    count = 1
    print("Producer {}: thread id {}, process id {}".format(name, threading.get_ident(), os.getpid()))
    while True:
        baozi = 'bun #%s made by %s' % (count, name)
        print(baozi)
        # put() blocks while the board already holds 15 buns.
        producer_queue.put(baozi)
        count += 1
        time.sleep(1)


def consumer(consumer_queue, name):
    print("Consumer {}: thread id {}, process id {}".format(name, threading.get_ident(), os.getpid()))
    while True:
        i = consumer_queue.get()  # blocks while the board is empty
        print('%s ate %s' % (name, i))
        print('%d buns left' % consumer_queue.qsize())
        time.sleep(1)


def run():
    # Only call run() from under the `if __name__ == '__main__':` guard.
    # On platforms that spawn child processes (e.g. Windows), starting the
    # Manager at import time fails with a RuntimeError mentioning freeze_support().
    deal_message = Manager().Queue(maxsize=15)
    producer_pool = Pool(processes=3)
    consumer_pool = Pool(processes=2)
    producer_list = ['Zhang San', 'Li Si', 'Wang Wu']
    consumer_list = ['Xiao Hong', 'Xiao Huang']
    for name in producer_list:
        producer_pool.apply_async(producer, args=(deal_message, name))
    for name in consumer_list:
        consumer_pool.apply_async(consumer, args=(deal_message, name))

    producer_pool.close()
    consumer_pool.close()
    producer_pool.join()
    consumer_pool.join()


if __name__ == '__main__':
    run()

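Coroutine version with yield
Unlike the previous two versions, coroutine switching is controlled by the program itself, so this example is a simplified simulation with only two coroutines: one plays the producer and one plays the consumer. Because control is handed over explicitly after every bun, the board never holds more than one bun.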

import random
import time
import threading
import os

total_count = 0


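def consumer():
    consumer_list = ['Xiao Hong', 'Xiao Huang']
    global total_count
    status = True
    while True:
        baozi = yield status  # pause here until the producer sends a bun
        name = random.choice(consumer_list)
        print("Consumer {}: thread id {}, process id {}".format(name, threading.get_ident(), os.getpid()))
        print('%s ate %s' % (name, baozi))
        total_count -= 1
        print('%d buns left' % total_count)
        # Report False once the board is full. That never happens here,
        # because every bun is eaten right after it is made.
        status = total_count < 15
        time.sleep(1)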


def producer(consumer):
    global total_count
    consumer.send(None)  # prime the consumer generator: run it to its first yield
    producer_dict = {"Zhang San": 0, "Li Si": 0, "Wang Wu": 0}
    while True:
        producer_name = random.choice(list(producer_dict.keys()))
        print("Producer {}: thread id {}, process id {}".format(producer_name, threading.get_ident(), os.getpid()))

        baozi = 'bun #%s made by %s' % (producer_dict[producer_name] + 1, producer_name)
        print(baozi)
        producer_dict[producer_name] = producer_dict[producer_name] + 1
        total_count += 1
        print('There are now {} buns in total'.format(total_count))
        time.sleep(1)
        # Hand the bun to the consumer and pass its status on to the caller.
        yield consumer.send(baozi)


if __name__ == '__main__':
    c = consumer()
    p = producer(c)
    for status in p:
        if not status:
            print('The board is full; pausing production')
            time.sleep(1)

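Coroutine version with asyncio
The coroutines above are driven by the yield keyword. However, asyncio's support for generator-based coroutines was deprecated in Python 3.8 and removed in Python 3.11 (the removal was originally scheduled for 3.10). Inspecting the functions above with the inspect module shows that they are in fact plain generator functions, not coroutine functions: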

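import inspect
# producer and consumer are the functions from the yield version above
print('Is producer a generator function:', inspect.isgeneratorfunction(producer))
print('Is producer a coroutine function:', inspect.iscoroutinefunction(producer))
print('Is consumer a generator function:', inspect.isgeneratorfunction(consumer))
print('Is consumer a coroutine function:', inspect.iscoroutinefunction(consumer))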

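So I implemented it once more with asyncio. The code now looks much like the thread and process versions, and both the producer and consumer functions are genuine coroutine functions.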

import asyncio
import threading, os

async def producer(name, queue):
    print("Producer {}: thread id {}, process id {}".format(name, threading.get_ident(), os.getpid()))
    count = 0
    while True:
        count += 1
        baozi = 'bun #%s made by %s' % (count, name)
        print('%s made bun #%s' % (name, count))
        await queue.put(baozi)  # waits while the board already holds 15 buns
        await asyncio.sleep(1)


async def consumer(name, queue):
    print("Consumer {}: thread id {}, process id {}".format(name, threading.get_ident(), os.getpid()))

    while True:
        baozi = await queue.get()  # waits while the board is empty
        print('%s ate %s' % (name, baozi))
        print('%d buns left' % queue.qsize())
        await asyncio.sleep(1)


async def main():
    queue = asyncio.Queue(maxsize=15)
    await asyncio.gather(
        producer("Zhang San", queue),
        producer("Li Si", queue),
        producer("Wang Wu", queue),
        consumer("Xiao Hong", queue),
        consumer("Xiao Huang", queue),
    )

if __name__ == '__main__':
    asyncio.run(main())
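
Like the other versions, the program above never stops by itself. If you want a run that finishes, one possible pattern is sketched below. It assumes each producer makes a fixed number of buns (the parameter n and the eaten list are additions for illustration) and uses queue.task_done() with queue.join() to know when every bun has been handled, then cancels the consumers, which are still waiting in get().

```python
import asyncio


async def producer(name, queue, n):
    for i in range(1, n + 1):
        # put() waits while the queue is full
        await queue.put("bun #%d made by %s" % (i, name))


async def consumer(name, queue, eaten):
    while True:
        bun = await queue.get()
        eaten.append((name, bun))
        queue.task_done()   # lets queue.join() know this bun is handled


async def main():
    queue = asyncio.Queue(maxsize=3)
    eaten = []
    consumers = [
        asyncio.create_task(consumer(n, queue, eaten))
        for n in ("Xiao Hong", "Xiao Huang")
    ]
    # Run the producers to completion, then wait until every bun is consumed.
    await asyncio.gather(
        producer("Zhang San", queue, 4),
        producer("Li Si", queue, 4),
    )
    await queue.join()
    # The consumers are still blocked in queue.get(); cancel them to finish.
    for task in consumers:
        task.cancel()
    await asyncio.gather(*consumers, return_exceptions=True)
    return eaten


eaten = asyncio.run(main())
print(len(eaten), "buns eaten")
```

Which customer gets which bun depends on scheduling, but all 8 buns end up eaten exactly once.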
