[python3] asyncio: writing asynchronous code in a synchronous style

Original

当白

2020-06-24 12:04

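Basic concepts:

Asynchronous I/O. When code needs to perform a slow I/O operation, it only issues the I/O request and does not wait for the result. It goes on to run other code in the meantime, which improves efficiency.
event loop. Basic I/O operations are turned into events to be handled, and the event loop monitors those events and triggers their handlers.
coroutines. Threads are switched by the operating system. With coroutines the program takes back control of switching, so asynchronous code can be written in a synchronous style. The event handlers registered with the event loop are coroutine objects, and the event loop calls them. When the program would block waiting to read or write data, switching context to other work maximizes efficiency.
tasks. The asyncio module makes it easy to run concurrent tasks and to manage them, for example by creating and cancelling them.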
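To make the task concept concrete, here is a minimal sketch of creating and cancelling tasks. The names slow_io and demo_tasks are made up for illustration, and asyncio.sleep stands in for a real I/O call; asyncio.run() assumes Python 3.7 or later.

```python
import asyncio

async def slow_io(name, delay):
    # asyncio.sleep stands in for a slow I/O call; awaiting it yields control to the event loop
    await asyncio.sleep(delay)
    return f"{name} done"

async def demo_tasks():
    # create_task schedules each coroutine on the running event loop right away
    fast = asyncio.create_task(slow_io("fast", 0.01))
    slow = asyncio.create_task(slow_io("slow", 10))

    fast_result = await fast   # the other task keeps running while we wait
    slow.cancel()              # request cancellation of the task we no longer need
    try:
        await slow
    except asyncio.CancelledError:
        pass                   # the cancelled task raises CancelledError when awaited
    return fast_result, slow.cancelled()

task_demo_result = asyncio.run(demo_tasks())
print(task_demo_result)
```

The slow task never gets to finish its 10-second wait, so the script returns almost immediately.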

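What we use most often are coroutines, which let asynchronous code be written in a synchronous style:

The asyncio event loop offers several ways to start a coroutine. The simplest is run_until_complete():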

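import asyncio

async def coroutine():  # `async def` defines a coroutine function
    print('in coroutine')
    return 'result'

if __name__ == '__main__':
    # Get the default event loop. Newer Python versions deprecate calling this
    # when no loop exists yet; asyncio.new_event_loop() or asyncio.run() is preferred there.
    event_loop = asyncio.get_event_loop()
    try:
        print('starting coroutine')
        coro = coroutine()  # calling the function only creates the coroutine object
        print('entering event loop')
        # run_until_complete() starts the coroutine and blocks until it finishes
        result = event_loop.run_until_complete(coro)
        print(f'it returned: {result}')
    finally:
        print('closing event loop')
        event_loop.close()  # close the event loop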

# Output:
starting coroutine
entering event loop
in coroutine
it returned: result
closing event loop

To summarize the key points:

event_loop = asyncio.get_event_loop()
result = event_loop.run_until_complete(coro)  # coro is the object returned by calling an async def function
event_loop.close()  # close the event loop
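For comparison, on Python 3.7 and later the same three steps can be written with asyncio.run(). This is a sketch of the equivalent call, not a change to the approach above:

```python
import asyncio

async def coroutine():
    print('in coroutine')
    return 'result'

# asyncio.run() creates a new event loop, runs the coroutine to completion,
# and closes the loop afterwards
run_result = asyncio.run(coroutine())
print(f'it returned: {run_result}')
```

The explicit get_event_loop() / run_until_complete() / close() form is still useful when the loop object itself is needed, for example to call run_in_executor() on it.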

Building on the example above, here is a case closer to real-world use, where one coroutine awaits others in sequence:

import asyncio

async def main():
    print('waiting for chain1')
    result1 = await chain1()
    print('waiting for chain2')
    result2 = await chain2(result1)  # chain2 depends on the result of chain1
    return (result1, result2)

async def chain1():
    print('chain1')
    return 'result1'

async def chain2(arg):
    print('chain2')
    return f'Derived from {arg}'

if __name__ == '__main__':
    event_loop = asyncio.get_event_loop()
    try:
        return_value = event_loop.run_until_complete(main())
        print(f'return value: {return_value}')
    finally:
        event_loop.close()

# Output:
waiting for chain1
chain1
waiting for chain2
chain2
return value: ('result1', 'Derived from result1')
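The chain above runs strictly in order because chain2 needs the result of chain1. When coroutines do not depend on each other, they can be awaited together instead. The sketch below uses a hypothetical fetch coroutine, with asyncio.sleep standing in for I/O, and records the order of events to show the interleaving:

```python
import asyncio

events = []  # records the order in which things happen

async def fetch(name, delay):
    events.append(f"start {name}")
    await asyncio.sleep(delay)   # stand-in for a slow I/O call
    events.append(f"end {name}")
    return name.upper()

async def run_both():
    # gather starts both coroutines and returns their results in argument order
    return await asyncio.gather(fetch("a", 0.05), fetch("b", 0.01))

gathered = asyncio.run(run_both())
print(gathered)
print(events)
```

Both coroutines start before either one finishes, and the results come back in the order the coroutines were passed in, regardless of which finished first.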

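Now let's extend this one step further.

First, a note on the parameters of run_in_executor:

awaitable loop.run_in_executor(executor, func, *args)

Parameters:

executor: a ThreadPoolExecutor or a ProcessPoolExecutor; if None, the loop's default executor (a thread pool) is used

func: the callable to run

*args: the positional arguments passed to func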
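As a minimal sketch of these parameters in use, the following runs a blocking function in the default thread pool by passing None as the executor. The function blocking_io is made up for illustration, with time.sleep standing in for a blocking call such as a synchronous HTTP request:

```python
import asyncio
import time

def blocking_io(seconds):
    # An ordinary blocking function; it would stall the event loop if called directly
    time.sleep(seconds)
    return f"slept {seconds}s"

async def run_blocking():
    loop = asyncio.get_running_loop()
    # executor=None selects the loop's default thread pool; 0.05 is passed through *args
    first = loop.run_in_executor(None, blocking_io, 0.05)
    second = loop.run_in_executor(None, blocking_io, 0.05)
    # Both calls run in worker threads while the event loop stays free
    return await asyncio.gather(first, second)

executor_results = asyncio.run(run_blocking())
print(executor_results)
```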

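The boss asked for a web crawler. Xiao Li is a veteran code monkey, and a crawler is familiar ground: send requests with requests, parse with BeautifulSoup, save the data, and the job is done.

Since he had recently been reading about asyncio, he thought it over and decided to send the network requests with multiple threads, parse the DOM with multiple processes, and save the results with multiple threads again.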

import asyncio
import concurrent.futures

import requests
from bs4 import BeautifulSoup

class XiaoLiSpider(object):

    def __init__(self) -> None:
        # The commas matter: without them the three strings are joined into one URL
        self._urls = [
            "http://www.fover.cn/da/",
            "http://www.fover.cn/sha/",
            "http://www.fover.cn/bi/",
        ]
        self._loop = asyncio.get_event_loop()
        self._thread_pool = concurrent.futures.ThreadPoolExecutor(max_workers=10)
        self._process_pool = concurrent.futures.ProcessPoolExecutor()

    # The method we ultimately call: crawl
    def crawl(self):
        a = []
        for url in self._urls:
            a.append(self._little_spider(url))
        self._loop.run_until_complete(asyncio.gather(*a))
        self._loop.close()
        self._thread_pool.shutdown()
        self._process_pool.shutdown()

    # The spider for a single URL
    async def _little_spider(self, url):
        # Network request: run in the thread pool
        response = await self._loop.run_in_executor(
            self._thread_pool, self._request, url)
        # DOM parsing: run in the process pool
        urls = await self._loop.run_in_executor(self._process_pool,
                                                self._biu_soup,
                                                response.text)
        self._save_data(urls)
        print(urls)

    def _request(self, url):
        return requests.get(url=url, timeout=10)

    @classmethod
    def _biu_soup(cls, response_str):
        soup = BeautifulSoup(response_str, 'lxml')
        # href=True skips <a> tags without an href, which would raise KeyError below
        a_tags = soup.find_all("a", href=True)
        a = []
        for a_tag in a_tags:
            a.append(a_tag['href'])
        return a

    def _save_data(self, urls):
        # Saving follows the same idea as above; left as a stub here, fill it in as needed
        pass

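To summarize the key points again:

1. asyncio.gather(*a) bundles the coroutines into a single awaitable so that they all run together.

2. run_in_executor() is a handy method that makes it easy to use thread pools and process pools.

3. Note that the method run in the process pool is decorated with @classmethod. Processes do not share memory, so the callable and its arguments are pickled and sent to the worker process. A bound instance method would drag self along with it, and self holds the event loop and the executors, which cannot be pickled. More detailed explanations are available online; put simply, a method called in another process should not depend on, or have side effects on, shared in-memory state.

4. Remember to clean up after use. Keep close() and shutdown() firmly in mind. Otherwise, once a scheduled job has run many times, you will find a pile of processes sitting around eating CPU and memory.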
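One way to make the cleanup in point 4 harder to forget is to let context managers do it. The sketch below is an assumption-laden variation, not the crawler above: fake_download is a made-up stand-in for the real request, and only a thread pool is used. The with block calls shutdown() on the pool when it exits, and asyncio.run() closes the event loop.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def fake_download(url):
    # Hypothetical stand-in for requests.get(url).text
    return f"<html>{url}</html>"

async def crawl_all(urls):
    loop = asyncio.get_running_loop()
    # Leaving the with block shuts the pool down, even if an exception is raised
    with ThreadPoolExecutor(max_workers=4) as pool:
        jobs = [loop.run_in_executor(pool, fake_download, u) for u in urls]
        # gather collects the results in the same order as the jobs
        return await asyncio.gather(*jobs)

pages = asyncio.run(crawl_all(["url-1", "url-2", "url-3"]))
print(pages)
```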

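Speeding up asyncio

uvloop is a library that can effectively speed up asyncio. It is built on libuv, the same library that Node.js uses.

When learning network I/O you will inevitably run into one asynchronous I/O library or another, such as libevent, libev, and libuv.

libevent, libev, and libuv are all asynchronous event libraries implemented in C.

GitHub address:

github.com/MagicStack/uvloop

It is also very easy to use: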

import asyncio
import uvloop

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

That's right: two lines of code are enough to speed up asyncio. [Figure: performance chart from the original post, not available]
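If the same script has to run on machines where uvloop may not be installed, one possible pattern is to fall back to the standard event loop. This is a sketch under that assumption; which_loop is a made-up helper that only reports which loop implementation ended up being used.

```python
import asyncio

try:
    import uvloop  # third-party package; assumed to be installed separately
except ImportError:
    uvloop = None  # fall back to the standard asyncio event loop

async def which_loop():
    # Report which module the running event loop's class comes from
    return type(asyncio.get_running_loop()).__module__

if uvloop is not None:
    # Same call as in the article; recent Python versions may warn that
    # event loop policies are deprecated
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

loop_module = asyncio.run(which_loop())
print(loop_module)
```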

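After reading this, Xiao Li ran a hand through his hair and thought: if one day my head is as shiny and smooth as Lao Wang's, my programming skills will probably be about on par with his.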

POST：https://www.jianshu.com/p/71b90a578668

POST： https://zhuanlan.zhihu.com/p/34578049
