Gunicorn (gevent) + Django: A Full Analysis of HTTP Request Handling, Gunicorn Part [1]

Background

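A while ago I debugged a memory-leak incident and spent a few days reading through Gunicorn + Django from start to finish. While troubleshooting, I found that most articles online are fragmentary analyses; I had to piece things together and verify them from several sources before I could get even a rough view of the whole picture. That prompted me to write a blog post organised around the actual flow, as a way of closing out the investigation.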

Because a lot is involved (Gunicorn, WSGI, Django, metaclasses and so on could each fill an article of their own), I will record this as a series of posts.

The framework and dependency versions are as follows.

Django 2.1.15
Gunicorn 20.0.4
Python 3.x

Starting from the start command

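Most articles dive straight into the code, which I find hard to follow. Starting the analysis from the start command seems more organised to me.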

Official documentation
Gunicorn GitHub

Following the official documentation, we start the service with the gunicorn command, so let's begin with gunicorn as the entry point.

Where does the gunicorn executable come from

How does the gunicorn command come about? What exactly is it?

Opening the installed file shows that the gunicorn command is simply a Python script with the .py suffix removed.

This leads to another topic, building Python packages (look it up if you are interested; I won't expand on it here): gunicorn's setup.py.

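The gunicorn executable is generated from the console_scripts entry point declared in that file, which points at the run function in gunicorn/app/wsgiapp.py.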
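For orientation, a launcher generated from a console_scripts entry of the form module:function does little more than import the named function and call it. The following is a minimal sketch of that resolution step, not the installer's actual code. resolve_entry_point is a hypothetical helper, and the demonstration uses a standard-library target so that it runs without Gunicorn installed.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'package.module:attribute' string to the object it names.

    Hypothetical helper for illustration only; generated launcher scripts
    contain an equivalent import statement instead.
    """
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    if attr_path:
        # The attribute part may be dotted, e.g. "Class.method".
        for name in attr_path.split("."):
            obj = getattr(obj, name)
    return obj


if __name__ == "__main__":
    # With Gunicorn installed, "gunicorn.app.wsgiapp:run" would resolve to
    # the run() function in gunicorn/app/wsgiapp.py.
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"workers": 4}))
```

The launcher then passes the function's return value to sys.exit, which is why run() is the true starting point of the process.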

Following the code into gunicorn/app/wsgiapp.py:

class WSGIApplication(Application):
    .......
    # Import the app (Django, i.e. our business code) from the path given in the configuration
    def load_wsgiapp(self):
        return util.import_app(self.app_uri)

    def load_pasteapp(self):
        from .pasterapp import get_wsgi_app
        return get_wsgi_app(self.app_uri, defaults=self.cfg.paste_global_conf)
	
    # Load the WSGI callable (i.e. the WSGI object produced by the Django framework)
    def load(self):
        if self.cfg.paste is not None:
            return self.load_pasteapp()
        else:
            return self.load_wsgiapp()
    
def run():
    """\
    The ``gunicorn`` command line runner for launching Gunicorn with
    generic WSGI applications.
    """
    from gunicorn.app.wsgiapp import WSGIApplication
    WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()

Looking up Application takes us to app/base.py:

class Application(BaseApplication):
    .......
    def run(self):
        ........
        if self.cfg.daemon:
            util.daemonize(self.cfg.enable_stdio_inheritance)
        .....
        super().run()

class BaseApplication(object):
    .......
    def wsgi(self):
        if self.callable is None:
            self.callable = self.load()
        return self.callable
    ........
    def run(self):
        try:
            Arbiter(self).run()
        except RuntimeError as e:
            ......

At last we arrive at the Arbiter, which other articles keep mentioning.

Arbiter

Before discussing the Arbiter, a brief word on the pre-fork model.

Put simply, the Master process creates the listening socket and then forks the Workers, which inherit the same listen fd and call accept on it themselves.

The Master keeps the number of Workers at the configured value, monitors their working state, and restarts processes that stop responding.
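The pre-fork idea can be shown in a few lines. This is a minimal sketch, assuming a Unix system (it relies on os.fork) and a loopback TCP socket; prefork_demo is a hypothetical function written for this post, not Gunicorn code. The parent opens the listening socket once, every forked child accepts on the inherited fd, and each child answers one connection with its own PID.

```python
import os
import socket


def prefork_demo(num_workers=3):
    """Fork workers that all accept() on one inherited listening socket.

    Returns (worker_pids, replies); each reply is the PID of the worker
    that served one client connection. Unix only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(num_workers)
    address = listener.getsockname()

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: serve exactly one connection on the shared socket.
            status = 1
            try:
                conn, _addr = listener.accept()
                with conn:
                    conn.sendall(str(os.getpid()).encode())
                status = 0
            finally:
                os._exit(status)      # never fall back into the parent's code
        worker_pids.append(pid)

    # Parent ("master"): it never accepts; here it also plays the client.
    replies = []
    for _ in range(num_workers):
        with socket.create_connection(address, timeout=5) as client:
            chunks = []
            while True:
                data = client.recv(64)
                if not data:
                    break
                chunks.append(data)
            replies.append(int(b"".join(chunks)))

    for pid in worker_pids:
        os.waitpid(pid, 0)            # reap the exited workers
    listener.close()
    return worker_pids, replies
```

Because each child serves one connection and then exits, the replies are the worker PIDs in whatever order the kernel handed out the connections. A real server keeps the workers in an accept loop and leaves supervision to the master.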

class Arbiter(object):
    ......
    def start(self):
        """\
        Initialize the arbiter. Start listening and set pidfile if needed.
        """
        
        .....
        
        if not self.LISTENERS:
            fds = None
            listen_fds = systemd.listen_fds()
            if listen_fds:
                self.systemd = True
                fds = range(systemd.SD_LISTEN_FDS_START,
                            systemd.SD_LISTEN_FDS_START + listen_fds)

            elif self.master_pid:
                fds = []
                for fd in os.environ.pop('GUNICORN_FD').split(','):
                    fds.append(int(fd))

            # Create the listen fds that all child processes will share
            self.LISTENERS = sock.create_sockets(self.cfg, self.log, fds)
        ........
        
    def run(self):
        "Main master loop."
        self.start()
        util._setproctitle("master [%s]" % self.proc_name)

        try:
            # Maintain the worker count. Right after startup it is 0, so this
            # call blocks here spawning child processes until the configured
            # number is reached.
            self.manage_workers()

            while True:
                self.maybe_promote_master()

                # Read a queued signal (e.g. HUP for hot reload)
                sig = self.SIG_QUEUE.pop(0) if self.SIG_QUEUE else None
                # Nothing queued: sleep, kill hung workers, keep the worker count steady
                if sig is None:
                    self.sleep()
                    self.murder_workers()
                    self.manage_workers()
                    continue

                if sig not in self.SIG_NAMES:
                    self.log.info("Ignoring unknown signal: %s", sig)
                    continue

                signame = self.SIG_NAMES.get(sig)
                handler = getattr(self, "handle_%s" % signame, None)
                if not handler:
                    self.log.error("Unhandled signal: %s", signame)
                    continue
                self.log.info("Handling signal: %s", signame)
                handler()
                self.wakeup()
        except (StopIteration, KeyboardInterrupt):
            self.halt()
        except HaltServer as inst:
            self.halt(reason=inst.reason, exit_status=inst.exit_status)
        except SystemExit:
            raise
        except Exception:
            self.log.info("Unhandled exception in main loop",
                          exc_info=True)
            self.stop(False)
            if self.pidfile is not None:
                self.pidfile.unlink()
            sys.exit(-1)
    ........
    def manage_workers(self):
        """\
        Maintain the number of workers by spawning or killing
        as required.
        """
        if len(self.WORKERS) < self.num_workers:
            self.spawn_workers()

        workers = self.WORKERS.items()
        workers = sorted(workers, key=lambda w: w[1].age)
        while len(workers) > self.num_workers:
            (pid, _) = workers.pop(0)
            self.kill_worker(pid, signal.SIGTERM)

        active_worker_count = len(workers)
        if self._last_logged_active_worker_count != active_worker_count:
            self._last_logged_active_worker_count = active_worker_count
            self.log.debug("{0} workers".format(active_worker_count),
                           extra={"metric": "gunicorn.workers",
                                  "value": active_worker_count,
                                  "mtype": "gauge"})

The Master process's job is actually simple: monitor the state of the child processes and provide the shared data (the listen fd).
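The dispatch pattern in the main loop above can be isolated into a small sketch: a queued signal number is mapped to a name, and the name selects a handle_<name> method. MiniArbiter and its two handlers are hypothetical stand-ins rather than Gunicorn's classes, and signals are enqueued by hand here instead of being delivered by the operating system.

```python
import signal


class MiniArbiter:
    """Toy model of a queue-based signal loop (not Gunicorn's Arbiter)."""

    # Map signal numbers to lowercase names: SIGHUP -> "hup", and so on.
    SIG_NAMES = {
        getattr(signal, "SIG" + name): name.lower()
        for name in ("HUP", "TERM", "USR1")
        if hasattr(signal, "SIG" + name)
    }

    def __init__(self):
        self.sig_queue = []
        self.log = []

    def enqueue(self, sig):
        # A handler registered with signal.signal() would do only this,
        # keeping the work done inside the signal handler minimal.
        self.sig_queue.append(sig)

    def handle_hup(self):
        self.log.append("reload")

    def handle_term(self):
        self.log.append("shutdown")

    def step(self):
        """Process one queued signal; return False when the queue is empty."""
        if not self.sig_queue:
            return False
        sig = self.sig_queue.pop(0)
        signame = self.SIG_NAMES.get(sig)
        handler = getattr(self, "handle_%s" % signame, None)
        if handler is None:
            self.log.append("unhandled:%s" % signame)
        else:
            handler()
        return True
```

Looking handlers up by name means supporting a new signal only requires adding one method, and the loop itself stays unchanged.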

Next, let's see how the master brings up the child processes.

    def spawn_workers(self):
        """\
        Spawn new workers as needed.

        This is where a worker process leaves the main loop
        of the master process.
        """

        for _ in range(self.num_workers - len(self.WORKERS)):
            self.spawn_worker()
            time.sleep(0.1 * random.random())

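The first thing manage_workers calls is spawn_workers. As shown above, it simply calls spawn_worker in a loop and, after each spawn, backs off for a random 0 to 100 ms. This keeps the child processes from all starting at the same moment and putting too much load on the system: if every child competes for CPU at once, none of them may finish initialising in time, and they get killed.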
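The bookkeeping behind these two functions can be written as pure functions, which makes the rules easy to check. This is a minimal sketch under the assumption that each worker is represented only by its pid and age; plan_workers and spawn_delays are hypothetical names, and no processes are actually created.

```python
import random


def plan_workers(workers, num_workers):
    """Decide what a manage_workers-style pass would do.

    workers maps pid -> age, where a smaller age means an older worker
    (age is a counter that grows with every spawn).
    Returns (number_to_spawn, pids_to_terminate).
    """
    to_spawn = max(0, num_workers - len(workers))
    # Oldest first, so surplus workers are retired in order of age.
    by_age = sorted(workers.items(), key=lambda item: item[1])
    surplus = max(0, len(by_age) - num_workers)
    to_terminate = [pid for pid, _age in by_age[:surplus]]
    return to_spawn, to_terminate


def spawn_delays(count, max_delay=0.1, rng=random):
    """Random pause after each spawn, in seconds, each below max_delay."""
    return [max_delay * rng.random() for _ in range(count)]


if __name__ == "__main__":
    # Two workers running, four configured: spawn two, terminate none.
    print(plan_workers({101: 1, 102: 2}, 4))
    # Three workers running, one configured: terminate the two oldest.
    print(plan_workers({101: 3, 102: 1, 103: 2}, 1))
```

Retiring the oldest workers first means that lowering the worker count at runtime keeps the most recently started processes.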

Now let's look at spawn_worker.


    def spawn_worker(self):
        self.worker_age += 1
        # self.app here is the WSGIApplication from earlier.
        worker = self.worker_class(self.worker_age, self.pid, self.LISTENERS,
                                   self.app, self.timeout / 2.0,
                                   self.cfg, self.log)
        self.cfg.pre_fork(self, worker)
        pid = os.fork()
        if pid != 0:
            worker.pid = pid
            self.WORKERS[pid] = worker
            return pid

        # Do not inherit the temporary files of other workers
        for sibling in self.WORKERS.values():
            sibling.tmp.close()

        # Process Child
        worker.pid = os.getpid()
        try:
            util._setproctitle("worker [%s]" % self.proc_name)
            self.log.info("Booting worker with pid: %s", worker.pid)
            self.cfg.post_fork(self, worker)
            # The implementation depends on the worker class you choose, but
            # every worker blocks here; the gevent worker is used to explain
            # this later.
            worker.init_process()
            sys.exit(0)
        except SystemExit:
            raise
        except AppImportError as e:
            self.log.debug("Exception while loading the application",
                           exc_info=True)
            print("%s" % e, file=sys.stderr)
            sys.stderr.flush()
            sys.exit(self.APP_LOAD_ERROR)
        except:
            self.log.exception("Exception in worker process")
            if not worker.booted:
                sys.exit(self.WORKER_BOOT_ERROR)
            sys.exit(-1)
        finally:
            self.log.info("Worker exiting (pid: %s)", worker.pid)
            try:
                worker.tmp.close()
                self.cfg.worker_exit(self, worker)
            except:
                self.log.warning("Exception during worker exit:\n%s",
                                  traceback.format_exc())

Once worker.init_process() is called, the child process starts working. What init_process actually does depends on the worker type; the next post will use gevent as the example.

Gunicorn is actually quite simple, and there isn't much code, which makes it a good codebase to practise reading.
