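Python: Threads, Processes, and Coroutines (4): the multiprocessing module (1)

multiprocessing is the package Python provides for multi-process development. It supports both local and remote concurrency, and by using subprocesses instead of threads it effectively sidesteps the global interpreter lock (GIL).

(I) Creating processes: the Process class

Process is the class used to create processes. Its source lives in process.py of the multiprocessing package; if you are interested, read along with the source as you learn. It is used in much the same way as threading.Thread, which is apparent from its class definition (quoted here from the Python 2 source):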

class Process(object):
    '''
    Process objects represent activity that is run in a separate process

    The class is analagous to `threading.Thread`
    '''
    _Popen = None

    def __init__(self, group=None, target=None, name=None, args=(), kwargs={}):
        assert group is None, 'group argument must be None for now'
        count = _current_process._counter.next()
        self._identity = _current_process._identity + (count,)
        self._authkey = _current_process._authkey
        self._daemonic = _current_process._daemonic
        self._tempdir = _current_process._tempdir
        self._parent_pid = os.getpid()
        self._popen = None
        self._target = target
        self._args = tuple(args)
        self._kwargs = dict(kwargs)
        self._name = name or type(self).__name__ + '-' + \
                     ':'.join(str(i) for i in self._identity)

Process([group [, target [, name [, args [, kwargs]]]]])

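group is not actually used; it is reserved for future extension and must be None.

target is the callable object to be invoked.

args is the tuple of positional arguments for target.

kwargs is the dictionary of keyword arguments for target.

name is an alias, i.e. the name of the process.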
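
To make these parameters concrete, here is a minimal sketch. It is an illustration only, not code from the multiprocessing package. It is written for Python 3 (unlike the Python 2 listings in this post) and it pins the "fork" start method through get_context, an assumption that restricts it to POSIX systems. The names greet and run_greeter are made up for the example.

```python
import multiprocessing


def greet(greeting, name="world"):
    # Runs in the child; current_process() reports the name given to Process.
    who = multiprocessing.current_process().name
    print("%s, %s (printed by process %r)" % (greeting, name, who))


def run_greeter():
    # Assumption: the "fork" start method is available (POSIX only).
    ctx = multiprocessing.get_context("fork")
    p = ctx.Process(
        target=greet,              # the callable to invoke
        name="greeter",            # the process name
        args=("hello",),           # positional arguments for target
        kwargs={"name": "child"},  # keyword arguments for target
    )
    p.start()
    p.join()
    return p.name, p.exitcode


if __name__ == "__main__":
    print(run_greeter())
```

group is left out because the only accepted value is None.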

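Its methods and attributes also have a lot in common with threading.Thread. The main ones are:

start(): starts the process's activity.

run(): the method representing the process's activity; it can be overridden in a subclass.

join([timeout]): blocks the calling context until the process finishes or the optional timeout elapses. A process can be joined many times. timeout is in seconds.

terminate(): terminates the process, using SIGTERM on Unix and TerminateProcess on Windows.

is_alive(): returns whether the process is still alive.

name: a string holding the process's name. Assigning to it renames the process.

ident: the process ID, or None if the process has not been started.

pid: the same as ident. Take a look at how ident and pid are implemented; they rely on getpid() from the os module.

authkey: gets or sets the process's authentication key. When multiprocessing is initialized, the main process is assigned a random string using os.urandom(). When a Process object is created it inherits the authentication key of its parent process, although this can be changed by setting authkey to another byte string. Why can authkey be both assigned and read? Because it is defined with the property decorator. The source is: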

@property
def authkey(self):
    return self._authkey

@authkey.setter
def authkey(self, authkey):
    '''
    Set authorization key of process
    '''
    self._authkey = AuthenticationString(authkey)

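This is a slightly more advanced use of property: a getter paired with a setter. It is quite simple once understood; see other material if you want the details.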
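
The getter/setter pairing can be tried in isolation. The sketch below uses a hypothetical Account class with a key attribute. It only mirrors the shape of the authkey code and is not taken from multiprocessing; bytes() stands in for AuthenticationString, and the code is written for Python 3.

```python
class Account(object):
    """Hypothetical class used only to illustrate a property with a setter."""

    def __init__(self, key):
        self._key = None
        self.key = key  # assignment is routed through the setter below

    @property
    def key(self):
        # Called when the attribute is read: account.key
        return self._key

    @key.setter
    def key(self, value):
        # Called when the attribute is assigned: account.key = value
        # bytes() stands in for AuthenticationString in this sketch.
        self._key = bytes(value)


if __name__ == "__main__":
    account = Account(bytearray(b"abc"))
    print(account.key)
    account.key = b"xyz"
    print(account.key)
```

Reading account.key calls the first function and assigning to it calls the second, which is why the same name works for both getting and setting.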

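daemon: a Boolean flag indicating whether the process is a daemon (True) or not (False). It must be set before start() is called. In the multiprocessing source the setter asserts that the process has not been started yet, so setting it afterwards fails with an AssertionError (threading.Thread raises RuntimeError in the same situation). The initial value is inherited from the creating process. The main process is not a daemon, so every process created from it defaults to daemon = False.

exitcode: the exit code of the process. It is None while the process has not yet terminated; a negative value -N means the process was terminated by signal N.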
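
The attributes above can be watched changing over the life of one child process. This is a minimal sketch under a few assumptions: Python 3, a POSIX system (it pins the "fork" start method and expects terminate() to deliver SIGTERM), and the made-up names sleeper and lifecycle.

```python
import multiprocessing
import signal
import time


def sleeper(seconds):
    time.sleep(seconds)


def lifecycle():
    # Assumption: the "fork" start method is available (POSIX only).
    ctx = multiprocessing.get_context("fork")
    p = ctx.Process(target=sleeper, args=(60,))
    p.daemon = True  # has to happen before start()

    seen = {"pid_before": p.pid, "alive_before": p.is_alive()}

    p.start()
    seen["pid_is_int"] = isinstance(p.pid, int)
    seen["same_ident"] = p.pid == p.ident
    seen["alive_running"] = p.is_alive()
    seen["exitcode_running"] = p.exitcode  # None while the child is running

    p.terminate()  # SIGTERM on Unix
    p.join(5)      # wait at most 5 seconds for the child to go away
    seen["alive_after"] = p.is_alive()
    seen["exitcode_after"] = p.exitcode    # -N for "killed by signal N"
    return seen


if __name__ == "__main__":
    for key, value in sorted(lifecycle().items()):
        print(key, value)
```

On Unix the final exitcode should equal -signal.SIGTERM, matching the -N rule described for exitcode.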

(1) A simple single-process example

#coding=utf-8
import multiprocessing
import datetime
import time

def worker(interval):
    n = 5
    while n > 0:
        print "The now is %s"% datetime.datetime.now()
        time.sleep(interval)
        n -= 1

if __name__ == "__main__":

    p = multiprocessing.Process(target = worker, args = (3,))
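    p.start()  # start the process
    #p.terminate()  # terminate the process
    #p.join(9)  # block the calling context
    print "p.authkey",p.authkey  # get the process's authentication key
    p.authkey = u"123"  # set the process's authentication key
    print "p.authkey", p.authkey  # get the process's authentication key
    print "p.pid:", p.pid,p.ident  # process ID
    p.name = 'helloworld'  # rename the process
    print "p.name:", p.name  # process name
    print "p.is_alive:", p.is_alive()  # whether the process is alive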

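The output was shown as a screenshot, which is not reproduced here.

Two lines in the code above are commented out. Uncomment them to get a feel for what terminate() and join() do; I won't paste my output for that here.

(2) A custom process class, starting several processes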

import multiprocessing
import datetime
import time

class MyProcess(multiprocessing.Process):
    """
    A custom process class.
    """
    def __init__(self,interval,group=None,target=None,name=None,args=(),kwargs={}):
        multiprocessing.Process.__init__(self,group,target,name,args,kwargs=kwargs)
        self.interval = interval

    def run(self):
        # Overriding run() replaces the default behaviour of calling target,
        # so the worker_N functions passed as target below are never invoked.
        n = 5
        while n > 0:
            print("the time is %s"%datetime.datetime.now())
            time.sleep(self.interval)
            n -= 1


def worker_1(interval):
    print "worker_1"
    time.sleep(interval)
    print "end worker_1"

def worker_2(interval):
    print "worker_2"
    time.sleep(interval)
    print "end worker_2"

def worker_3(interval):
    print "worker_3"
    time.sleep(interval)
    print "end worker_3"


if __name__ == "__main__":
    p1 = MyProcess(interval=2,target = worker_1, args = (2,))
    p2 = MyProcess(interval=2,target = worker_2, args = (3,))
    p3 = MyProcess(interval=2,target = worker_3, args = (4,))

    p1.start()
    p2.start()
    p3.start()
    print "current process",multiprocessing.current_process(),multiprocessing.active_children()
    print("The number of CPU is:" + str(multiprocessing.cpu_count()))
    for p in multiprocessing.active_children():
        print("child   p.name:" + p.name + "\tp.id" + str(p.pid))
    print "END!!!!!!!!!!!!!!!!!"

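The output was shown as a screenshot, which is not reproduced here.

Judging by the printed times, the three processes should be running in parallel.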
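
The parallelism can also be checked by timing rather than by reading timestamps. The following is a rough sketch, assuming Python 3 and a POSIX system (it pins the "fork" start method). Sleeper and timed_run are hypothetical names, and the subclass overrides run() in the same way as MyProcess.

```python
import multiprocessing
import time

# Assumption: the "fork" start method is available (POSIX only).
_ctx = multiprocessing.get_context("fork")


class Sleeper(_ctx.Process):
    """Hypothetical Process subclass whose run() just sleeps."""

    def __init__(self, interval):
        super().__init__()
        self.interval = interval

    def run(self):
        time.sleep(self.interval)


def timed_run(interval=0.5, count=3):
    procs = [Sleeper(interval) for _ in range(count)]
    started = time.monotonic()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    elapsed = time.monotonic() - started
    return elapsed, [p.exitcode for p in procs]


if __name__ == "__main__":
    elapsed, codes = timed_run()
    print("elapsed: %.2fs, exit codes: %s" % (elapsed, codes))
```

If the children ran one after another, the elapsed time would be about count * interval; when they overlap it stays close to a single interval.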

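(II) Inter-process communication

The multiprocessing module supports two kinds of communication channel between processes: Queue and Pipe.

(1) Queue

The Queue class in multiprocessing is defined in queues.py. It is much like Queue.Queue and implements most of its methods (see the previous post, Python: Threads, Processes, and Coroutines (3): the Queue module and its source code), but task_done() and join() are not implemented. The main methods are:

qsize(): returns the approximate size of the queue.

empty(): returns a Boolean indicating whether the queue is empty.

full(): returns a Boolean indicating whether the queue is full.

put(item[, block[, timeout]]): adds item to the queue. With block set to False, a Full exception is raised immediately if the queue is full. With block set to True and timeout set to None, the call waits until a free slot is available; otherwise it waits at most timeout seconds and then raises Full.

put_nowait(item): equivalent to put(item, False).

get([block[, timeout]]): removes an item from the queue and returns it. If timeout is a positive number, the call blocks for at most timeout seconds and raises an Empty exception if no item became available in that time.

get_nowait(): equivalent to get(False).

close(): indicates that the current process will put no more data on this queue.

join_thread(): joins the background thread. It can only be used after close() has been called. It blocks until the background thread exits, ensuring that all data in the buffer has been flushed to the pipe. By default, if a process is not the creator of the queue, then on exit it will attempt to join the queue's background thread. The process can call cancel_join_thread() to make join_thread() do nothing.

cancel_join_thread(): prevents join_thread() from blocking. In particular, it stops the background thread from being joined automatically when the process exits, which may cause data to be lost.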
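
A short sketch of these methods in use, under some assumptions: Python 3 (where the Full and Empty exceptions live in the queue module rather than Queue), a POSIX system for the "fork" start method, and made-up function names. bounded_put shows Full on a bounded queue; squares_via_queue passes values from a child to the parent and uses None as an end marker, which is a convention of this example and not part of the Queue API.

```python
import multiprocessing
import queue  # Python 3 home of the Full and Empty exceptions


def bounded_put():
    q = multiprocessing.Queue(maxsize=1)
    q.put("first")
    try:
        q.put_nowait("second")  # the queue is full, so this does not wait
        overflowed = False
    except queue.Full:
        overflowed = True
    first = q.get(timeout=5)
    q.close()
    q.join_thread()
    return first, overflowed


def producer(q, count):
    for i in range(count):
        q.put(i * i)  # blocks while the bounded queue is full
    q.put(None)       # end marker (a convention of this sketch)


def squares_via_queue(count=5):
    # Assumption: the "fork" start method is available (POSIX only).
    ctx = multiprocessing.get_context("fork")
    q = ctx.Queue(maxsize=2)
    p = ctx.Process(target=producer, args=(q, count))
    p.start()
    results = []
    while True:
        item = q.get(timeout=5)
        if item is None:
            break
        results.append(item)
    p.join()
    try:
        q.get_nowait()  # nothing is left, so Empty is expected
        saw_empty = False
    except queue.Empty:
        saw_empty = True
    q.close()
    q.join_thread()
    return results, saw_empty


if __name__ == "__main__":
    print(bounded_put())
    print(squares_via_queue())
```

The parent drains the queue before calling p.join(), so the child is never left waiting to flush items that nobody will read.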

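(2) Pipe

Pipe is a function, not a class. It is defined in connection.py of the multiprocessing package, with the prototype Pipe(duplex=True).

It returns a pair of connection objects, conn1 and conn2, joined by a pipe.

If duplex is True (the default), the pipe is bidirectional.

If duplex is False, the pipe is unidirectional: conn1 can only be used to receive messages and conn2 can only be used to send them.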
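
The difference between the two modes can be seen without starting any child process, since both ends of a pipe can be used from one process for small messages. This is a minimal Python 3 sketch; pipe_directions and the variable names are made up for the illustration.

```python
import multiprocessing


def pipe_directions():
    # duplex=True (the default): either end can send and receive.
    left, right = multiprocessing.Pipe()
    left.send("ping")
    got_on_right = right.recv()
    right.send("pong")
    got_on_left = left.recv()

    # duplex=False: the first end only receives, the second only sends.
    receiver, sender = multiprocessing.Pipe(duplex=False)
    sender.send([1, 2, 3])
    one_way = receiver.recv()
    try:
        receiver.send("wrong direction")
        receiver_can_send = True
    except OSError:
        # In Python 3 a read-only connection refuses to send.
        receiver_can_send = False

    for conn in (left, right, receiver, sender):
        conn.close()
    return {
        "duplex": (got_on_right, got_on_left),
        "one_way": one_way,
        "receiver_can_send": receiver_can_send,
    }


if __name__ == "__main__":
    print(pipe_directions())
```

send() pickles the object and recv() unpickles it, which is why a list arrives as an equal list on the other end.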

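The two connection objects returned by Pipe() represent the two ends of the pipe. Each has send() and recv() methods (among others) for sending and receiving messages. Here is a simple example in which one process sends data and the other receives it: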

#coding=utf-8
import multiprocessing
import time

def proc1(pipe):
    """
    Send data.
    """
    while True:
        for i in xrange(100):
            print "send: %s" %(i)
            pipe.send(i)  # send data
            time.sleep(1)

def proc2(pipe):
    """
    Receive data.
    """
    while True:
        print "proc2 rev:", pipe.recv()  # receive data
        time.sleep(1)

if __name__ == "__main__":
    pipe1,pipe2 = multiprocessing.Pipe()  # returns two connection objects
    p1 = multiprocessing.Process(target=proc1, args=(pipe1,))
    p2 = multiprocessing.Process(target=proc2, args=(pipe2,))

    p1.start()
    p2.start()

    p1.join()
    p2.join()

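The output was shown as a screenshot, which is not reproduced here.

(III) Synchronization between processes

multiprocessing contains equivalents of all the synchronization primitives in threading: Lock, RLock, Event, Condition, Semaphore, and BoundedSemaphore. They are used in much the same way and are defined in synchronize.py of the multiprocessing package, so I will not go through them in detail here; if you are interested, see the concepts explained in Python: Threads, Processes, and Coroutines (2): the threading module. Once those concepts are clear, using them in multiprocessing is the same. Consider this simple example: two processes need to write to the same file, and a lock is used to avoid conflicting access.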

#coding=utf-8
import multiprocessing
def worker_with(lock, f):
    with lock:  # Lock and similar objects support the context manager protocol.
        fs = open(f, 'a+')
        n = 10
        while n > 1:
            fs.write("Lock acquired via with\n")
            n -= 1
        fs.close()

def worker_no_with(lock, f):
    lock.acquire()
    try:
        fs = open(f, 'a+')
        n = 10
        while n > 1:
            fs.write("Lock acquired directly\n")
            n -= 1
        fs.close()
    finally:
        lock.release()

if __name__ == "__main__":
    lock = multiprocessing.Lock()  # create the lock
    f = "/home/liulonghua/files.txt"
    w = multiprocessing.Process(target = worker_with, args=(lock, f))
    nw = multiprocessing.Process(target = worker_no_with, args=(lock, f))
    w.start()
    nw.start()
    print "end"
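
The basic behaviour of a few of these primitives can be poked at from a single process using non-blocking acquires. This is a minimal sketch, assuming Python 3; the function name primitives and the dictionary keys are made up for the illustration.

```python
import multiprocessing


def primitives():
    result = {}

    lock = multiprocessing.Lock()
    with lock:
        # A plain Lock is not re-entrant: a second, non-blocking acquire
        # fails while the lock is held.
        result["lock_reacquire_while_held"] = lock.acquire(False)
    # Leaving the with block released the lock, so it can be taken again.
    result["lock_after_with"] = lock.acquire(False)
    lock.release()

    rlock = multiprocessing.RLock()
    with rlock:
        # An RLock may be acquired again by its current owner.
        result["rlock_nested"] = rlock.acquire(False)
        rlock.release()

    sem = multiprocessing.Semaphore(2)
    # Two slots are available; the third non-blocking attempt fails.
    result["semaphore_grabs"] = [sem.acquire(False) for _ in range(3)]
    sem.release()
    sem.release()

    event = multiprocessing.Event()
    before = event.is_set()
    event.set()
    result["event"] = (before, event.wait(0.1))
    return result


if __name__ == "__main__":
    for key, value in sorted(primitives().items()):
        print(key, value)
```

The same calls behave identically when the objects are passed to child processes, as lock is in the file-writing example.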

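multiprocessing provides IPC mechanisms that the threading package lacks (such as Pipe and Queue), and they are more efficient. Prefer Pipe and Queue, and avoid synchronization primitives such as Lock, Event, Semaphore, and Condition where possible (because the resources they occupy do not belong to the user process).

Multiple processes should avoid sharing resources. With threads, resources are fairly easy to share, for example through global variables or by passing arguments. With processes, each one has its own independent memory space, so those approaches do not work. Instead, resources can be shared through shared memory or a Manager. Doing so, however, makes the program more complex and reduces its efficiency because of the synchronization required. The next post continues with sharing state between processes, process pools, and more.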
