How db.session in flask_sqlalchemy stays independent across requests: source-reading notes

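This post sets out to verify two things:

  1. How Flask's request handling differs when it spawns new threads, processes, or coroutines (covered in passing)
  2. How flask_sqlalchemy uses db.session so that SQL operations in different requests that modify the same table do not interfere with each other; the technical term is session scope (the main topic)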

A simple example

# -*- coding:utf-8 -*-
from sqlalchemy.orm.session import Session # not thread-safe
from sqlalchemy.orm import scoped_session  # thread-safe

import time
from flask_sqlalchemy import SQLAlchemy
from flask import Flask

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://root:***@172.16.4.120:3306/mytest?charset=utf8'
db = SQLAlchemy(app)    # db.init_app(app) is the other way to bind app and db; see the docs for the differences, not analyzed here
db_session = db.session


class role(db.Model):
    id = db.Column(db.INT, primary_key=True,autoincrement=True)
    name = db.Column(db.String(99), unique=False)
    name_cn = db.Column(db.String(99), unique=False)

    def __init__(self, name, name_cn):
        self.name = name
        self.name_cn = name_cn

    def __repr__(self):
        return '<Role %r>' % self.name

# db.create_all()

@app.route('/add1')
def add1():
    print("db.session:", vars(db_session))
    print("id(db_session)",db_session)
    test_role1 = role('supervisol', '11')
    # test_role2 = role('your try', '11')
    db_session.add(test_role1)
    #db_session.add(test_role2)
    #db.session.commit() # deliberately not committed here
    time.sleep(60)
    return "add1"

@app.route('/add2')
def add2():
    print("db.session:",vars(db.session))
    print("id(db_session)", db_session)
    test_role1 = role('supervisol', '22')
    #test_role2 = role('your try', '22')
    db_session.add(test_role1)
    #db_session.add(test_role2)
    db_session.commit()
    time.sleep(60)
    return "add2"

if __name__ == '__main__':
    app.run(threaded=True)

Three ways to run

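# Without threaded or processes mode, all requests go to the same socket and are handled one after another, so they delay each other. (Flask checks whether the greenlet library is installed, but even with it the requests still block each other here, because time.sleep is not cooperative and does not yield to other greenlets.)
# threaded mode creates a new thread for each incoming request, so requests do not block each other, as the test below shows.
# processes mode creates processes instead.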

"""
root@(none):# date ;curl "http://127.0.0.1:5000/add2";date
Tue Aug 14 08:38:14 CST 2018
add2Tue Aug 14 08:39:14 CST 2018

root@(none):~# date ;curl "http://127.0.0.1:5000/add1";date
Tue Aug 14 08:38:16 CST 2018
add1Tue Aug 14 08:39:16 CST 2018

root@(none):~# ps -T -p 8657
  PID  SPID TTY          TIME CMD
 8657  8657 pts/7    00:00:00 python
 8657  8662 pts/7    00:00:00 python
 8657  8666 pts/7    00:00:00 python

"""
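
The timing above (two 60-second requests that both finish about 60 seconds after they start) can be imitated without Flask. The following is only a minimal sketch: handle_request, run_serial and run_threaded are hypothetical names, the 0.2-second delay stands in for time.sleep(60), and plain threads stand in for what threaded=True does per request.

```python
import threading
import time


def handle_request(name, results, delay=0.2):
    # Stand-in for a blocking view such as add1/add2 (assumed delay, not 60 s).
    time.sleep(delay)
    results.append(name)


def run_serial(names):
    # One request at a time, as when neither threaded nor processes is set.
    results = []
    start = time.monotonic()
    for name in names:
        handle_request(name, results)
    return time.monotonic() - start, results


def run_threaded(names):
    # One thread per request, roughly what threaded=True arranges.
    results = []
    start = time.monotonic()
    threads = [
        threading.Thread(target=handle_request, args=(name, results))
        for name in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start, results


serial_elapsed, serial_results = run_serial(["add1", "add2"])
threaded_elapsed, threaded_results = run_threaded(["add1", "add2"])
print("serial: %.2fs, threaded: %.2fs" % (serial_elapsed, threaded_elapsed))
```

In the serial run the second request waits for the first, so the total is about twice the delay; in the threaded run the two sleeps overlap.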

Exploring db.session

# db_session does not interfere between the two routes, even though it is the same object in both
# The flask_sqlalchemy.SQLAlchemy class sets self.session = self.create_scoped_session(session_options), which ends with
# return orm.scoped_session(self.create_session(options), scopefunc=scopefunc), and that leads to
# the relationship between sqlalchemy.orm.session and sqlalchemy.orm.scoped_session
# See http://www.cnblogs.com/ctztake/p/8277372.html : a separate session is created for each request, keyed by the thread id or
# by _app_ctx_stack.__ident_func__
# This one is also a useful reference: https://stackoverflow.com/questions/39480914/why-db-session-remove-must-be-called
# Following in others' footsteps is the easiest route, of course; this post explains nearly everything from start to finish: https://blog.csdn.net/yueguanghaidao/article/details/40016235
"""
# Bind the app, then initialize the SQL configuration
if app is not None:
    self.init_app(app)
    
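# Teardown hook: when the request ends, commit if SQLALCHEMY_COMMIT_ON_TEARDOWN is set and no exception occurred, then always remove this session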
@app.teardown_appcontext
def shutdown_session(response_or_exc):
    if app.config['SQLALCHEMY_COMMIT_ON_TEARDOWN']:
        if response_or_exc is None:
            self.session.commit()

    self.session.remove()
    return response_or_exc  
    
# sqlalchemy.orm.scoping.scoped_session
# sqlalchemy.util._collections.ScopedRegistry 定義
def clear(self):
    #Clear the current scope, if any.
    try:
        del self.registry[self.scopefunc()]
    except KeyError:
        pass
"""
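
The teardown hook quoted above can be exercised without Flask or a database. This is a minimal sketch under stated assumptions: FakeScopedSession and make_shutdown_session are hypothetical helpers that only record calls, and the commit_on_teardown argument stands in for app.config['SQLALCHEMY_COMMIT_ON_TEARDOWN'].

```python
class FakeScopedSession(object):
    """Hypothetical stand-in for db.session that only records calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def remove(self):
        self.calls.append("remove")


def make_shutdown_session(session, commit_on_teardown):
    # Same branching as the quoted shutdown_session, with the config
    # lookup replaced by a plain argument (an assumption of this sketch).
    def shutdown_session(response_or_exc):
        if commit_on_teardown:
            if response_or_exc is None:
                session.commit()
        session.remove()
        return response_or_exc

    return shutdown_session


ok = FakeScopedSession()
make_shutdown_session(ok, commit_on_teardown=True)(None)

failed = FakeScopedSession()
make_shutdown_session(failed, commit_on_teardown=True)(ValueError("boom"))

manual = FakeScopedSession()
make_shutdown_session(manual, commit_on_teardown=False)(None)

print(ok.calls, failed.calls, manual.calls)
```

Whatever the configuration and whether or not the request failed, remove() is always called; commit() happens only when the option is on and no exception was passed in.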

The crude way: print

# With prints added to the methods of sqlalchemy.util._collections.ScopedRegistry, you can see that every incoming request has a different id and is handled by a different session
('db.session:', {'session_factory': 127.0.0.1 - - [15/Aug/2018 15:48:19] "GET /add1 HTTP/1.1" 200 -
sessionmaker(class_='SignallingSession', autocommit=False, query_cls=<class 'flask_sqlalchemy.BaseQuery'>, expire_on_commit=True, bind=None, db=<SQLAlchemy engine=mysql://root:***@172.16.4.120:3306/mytest?charset=utf8>, autoflush=True), 'registry': <sqlalchemy.util._collections.ScopedRegistry object at 0x000000000379D748>})
('id(db_session)', <sqlalchemy.orm.scoping.scoped_session object at 0x000000000379D710>)
('1 __call__:', <greenlet.greenlet object at 0x00000000038605A0>)
('2 __call__:', {})
('3 has:', {<greenlet.greenlet object at 0x00000000038605A0>: <sqlalchemy.orm.session.SignallingSession object at 0x00000000038756A0>})
('1 __call__:', <greenlet.greenlet object at 0x00000000038605A0>)
('2 __call__:', {<greenlet.greenlet object at 0x00000000038605A0>: <sqlalchemy.orm.session.SignallingSession object at 0x00000000038756A0>})
('4 clear start:', {<greenlet.greenlet object at 0x00000000038605A0>: <sqlalchemy.orm.session.SignallingSession object at 0x00000000038756A0>})
('5 clear end:', {})


('db.session:', {'session_factory': sessionmaker(class_='SignallingSession', autocommit=False, query_cls=<class 'flask_sqlalchemy.BaseQuery'>, expire_on_commit=True, bind=None, db=<SQLAlchemy engine=mysql://root:***@172.16.4.120:3306/mytest?charset=utf8>, autoflush=True), 'registry': <sqlalchemy.util._collections.ScopedRegistry object at 0x000000000379D748>})
('id(db_session)', <sqlalchemy.orm.scoping.scoped_session object at 0x000000000379D710>)
('1 __call__:', <greenlet.greenlet object at 0x00000000039843D8>)
('2 __call__:', {})
('1 __call__:', <greenlet.greenlet object at 0x00000000039843D8>)
('2 __call__:', {<greenlet.greenlet object at 0x00000000039843D8>: <sqlalchemy.orm.session.SignallingSession object at 0x000000000398DE48>})
127.0.0.1 - - [15/Aug/2018 15:49:29] "GET /add2 HTTP/1.1" 200 -
('3 has:', {<greenlet.greenlet object at 0x00000000039843D8>: <sqlalchemy.orm.session.SignallingSession object at 0x000000000398DE48>})
('1 __call__:', <greenlet.greenlet object at 0x00000000039843D8>)
('2 __call__:', {<greenlet.greenlet object at 0x00000000039843D8>: <sqlalchemy.orm.session.SignallingSession object at 0x000000000398DE48>})
('4 clear start:', {<greenlet.greenlet object at 0x00000000039843D8>: <sqlalchemy.orm.session.SignallingSession object at 0x000000000398DE48>})
('5 clear end:', {})
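
The trace above can be reproduced in miniature with only the standard library. This is a sketch, not SQLAlchemy's code: MiniScopedRegistry, FakeSession and fake_request are hypothetical names, the key is threading.get_ident (Python 3) rather than a greenlet object, and a barrier keeps both threads alive at once so their ids cannot be reused.

```python
import threading


class MiniScopedRegistry(object):
    """Hypothetical, simplified registry: one object per scope key."""

    def __init__(self, createfunc, scopefunc):
        self.createfunc = createfunc
        self.scopefunc = scopefunc
        self.registry = {}

    def __call__(self):
        key = self.scopefunc()
        try:
            return self.registry[key]
        except KeyError:
            return self.registry.setdefault(key, self.createfunc())

    def has(self):
        return self.scopefunc() in self.registry

    def clear(self):
        self.registry.pop(self.scopefunc(), None)


class FakeSession(object):
    pass


registry = MiniScopedRegistry(FakeSession, threading.get_ident)
barrier = threading.Barrier(2, timeout=5)
seen = {}


def fake_request(name):
    first = registry()   # creates the session for this thread
    second = registry()  # returns the same session again
    barrier.wait()       # both threads hold a session at this point
    seen[name] = (first, second, registry.has())
    registry.clear()     # what session.remove() triggers at teardown


threads = [
    threading.Thread(target=fake_request, args=(name,))
    for name in ("add1", "add2")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen["add1"][0] is seen["add2"][0], registry.registry)
```

Both "requests" go through the same registry object, yet each one gets its own session, and the dict is empty again once both have cleared their entry.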

Summary of the flow above

Web Server          Web Framework        SQLAlchemy ORM Code
--------------      --------------       ------------------------------
startup        ->   Web framework        # Session registry is established
                    initializes          Session = scoped_session(sessionmaker())

incoming
web request    ->   web request     ->   # The registry is *optionally*
                    starts               # called upon explicitly to create
                                         # a Session local to the thread and/or request
                                         Session()

                                         # the Session registry can otherwise
                                         # be used at any time, creating the
                                         # request-local Session() if not present,
                                         # or returning the existing one
                                         Session.query(MyClass) # ...

                                         Session.add(some_object) # ...

                                         # if data was modified, commit the
                                         # transaction
                                         Session.commit()

                    web request ends  -> # the registry is instructed to
                                         # remove the Session
                                         Session.remove()

                    sends output      <-
outgoing web    <-
response

Here is the key part

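SQLAlchemy is arguably the most powerful ORM framework in Python, and it is undoubtedly much more complex to use than Django's built-in ORM.
The flask_sqlalchemy extension closes much of that ease-of-use gap.
First, two fairly important configuration options:

app.config['SQLALCHEMY_ECHO'] = True => log the SQL statements that are executed
app.config['SQLALCHEMY_COMMIT_ON_TEARDOWN'] = True => call db.session.commit() automatically at the end of every request.
If one day you find a view written by someone else that calls db.session.add but never db.session.commit, don't be puzzled: this option is almost certainly set.
It is implemented with a function registered through app.teardown_appcontext: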
        @teardown
        def shutdown_session(response_or_exc):
            if app.config['SQLALCHEMY_COMMIT_ON_TEARDOWN']:
                if response_or_exc is None:
                    self.session.commit()
            self.session.remove()
            return response_or_exc
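response_or_exc is the exception value, which defaults to sys.exc_info()[1].
The self.session.remove() call above means the current session is discarded after every request. Why do that?
To answer, we need to look at SQLAlchemy's session object.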
from sqlalchemy.orm import sessionmaker
session = sessionmaker()
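Normally we create sessions with the factory that sessionmaker() returns, but such a session cannot be shared across threads. To support
multithreading, SQLAlchemy provides scoped_session; as the name suggests, it works by tying each session to some scope.
So in multithreaded code the session is usually set up like this: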
from sqlalchemy.orm import scoped_session, sessionmaker
session = scoped_session(sessionmaker())

Let's see how scoped_session supports a multithreaded environment:
class scoped_session(object):
    def __init__(self, session_factory, scopefunc=None):
        
        self.session_factory = session_factory
        if scopefunc:
            self.registry = ScopedRegistry(session_factory, scopefunc)
        else:
            self.registry = ThreadLocalRegistry(session_factory)
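In __init__, session_factory is the factory that creates sessions, and sessionmaker is such a factory (it is actually a class that defines
__call__). scopefunc is a function that returns the key identifying the current scope; if it is not provided, ThreadLocalRegistry is used.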
class ThreadLocalRegistry(ScopedRegistry):
    def __init__(self, createfunc):
        self.createfunc = createfunc
        self.registry = threading.local()
 
    def __call__(self):
        try:
            return self.registry.value
        except AttributeError:
            val = self.registry.value = self.createfunc()
            return val
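
The thread-local pattern used by this class can be tried on its own. The following is a stdlib-only sketch under stated assumptions: MiniThreadLocalRegistry and make_session are hypothetical names, and a plain object() stands in for a session.

```python
import threading


class MiniThreadLocalRegistry(object):
    """Hypothetical sketch: one lazily created value per thread."""

    def __init__(self, createfunc):
        self.createfunc = createfunc
        self.registry = threading.local()

    def __call__(self):
        try:
            # Already created in this thread: reuse it.
            return self.registry.value
        except AttributeError:
            # First call in this thread: create and remember it.
            val = self.registry.value = self.createfunc()
            return val


created = []


def make_session():
    session = object()  # stand-in for a real session
    created.append(session)
    return session


registry = MiniThreadLocalRegistry(make_session)
main_first = registry()
main_second = registry()

worker_result = []
worker = threading.Thread(target=lambda: worker_result.append(registry()))
worker.start()
worker.join()

print(main_first is main_second, worker_result[0] is main_first, len(created))
```

Repeated calls in the main thread return the same object, while the worker thread triggers a second creation and gets its own.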
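As __call__ above shows, a new session is created the first time it is called in a thread and stored in a thread-local variable; later calls in the same thread return that same session. You may wonder where __call__ gets called: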
def instrument(name):
    def do(self, *args, **kwargs):
        return getattr(self.registry(), name)(*args, **kwargs)
    return do
 
for meth in Session.public_methods:
    setattr(scoped_session, meth, instrument(meth))
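
The proxying loop above can be tried standalone. This is only a sketch: FakeSession and MiniScopedSession are hypothetical classes, and registry() is simplified to return a single session instead of one per scope, so that only the method forwarding is on display.

```python
class FakeSession(object):
    """Hypothetical session with two public methods to forward."""

    public_methods = ("add", "query")

    def __init__(self):
        self.added = []

    def add(self, obj):
        self.added.append(obj)

    def query(self, name):
        return [obj for obj in self.added if obj == name]


class MiniScopedSession(object):
    def __init__(self, session_factory):
        self.session_factory = session_factory
        self._current = None

    def registry(self):
        # Simplified: a single session instead of one per thread or request.
        if self._current is None:
            self._current = self.session_factory()
        return self._current


def instrument(name):
    # Build a method that looks up the current session and forwards the call.
    def do(self, *args, **kwargs):
        return getattr(self.registry(), name)(*args, **kwargs)
    return do


for meth in FakeSession.public_methods:
    setattr(MiniScopedSession, meth, instrument(meth))

proxy = MiniScopedSession(FakeSession)
proxy.add("supervisor")
found = proxy.query("supervisor")
print(found)
```

MiniScopedSession never defines add or query itself; both are attached by the loop and resolve the current session on every call.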
As we can see, calling session.query ends up calling getattr(self.registry(), 'query'), and self.registry() is
the moment __call__ is invoked. flask_sqlalchemy, however, does not use ThreadLocalRegistry. It creates its scoped_session as follows:
# Which stack should we use?  _app_ctx_stack is new in 0.9
connection_stack = _app_ctx_stack or _request_ctx_stack
 
    def __init__(self, app=None,
                 use_native_unicode=True,
                 session_options=None):
        session_options.setdefault(
            'scopefunc', connection_stack.__ident_func__
        )
        self.session = self.create_scoped_session(session_options)
 
    def create_scoped_session(self, options=None):
        """Helper factory method that creates a scoped session."""
        if options is None:
            options = {}
        scopefunc=options.pop('scopefunc', None)
        return orm.scoped_session(
            partial(_SignallingSession, self, **options), scopefunc=scopefunc
        )
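We can see that scopefunc is set to connection_stack.__ident_func__, where connection_stack is Flask's app context stack.
If you have read the previous article, you know that in a multithreaded setup __ident_func__ is simply the thread module's get_ident, i.e. the thread id.
Let's look at how ScopedRegistry uses it: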
class ScopedRegistry(object):
    def __init__(self, createfunc, scopefunc):
        self.createfunc = createfunc
        self.scopefunc = scopefunc
        self.registry = {}
 
 
    def __call__(self):
        key = self.scopefunc()
        try:
            return self.registry[key]
        except KeyError:
            return self.registry.setdefault(key, self.createfunc())
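The code is simple: it creates one session object per thread id. By now we basically understand the magic behind
flask_sqlalchemy, which works on the same principle as Flask's cookie and g. Two small questions remain: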
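1. Could flask_sqlalchemy use ThreadLocalRegistry?
    In most cases yes, but not when the WSGI server handles concurrency with greenlets, because many greenlets share one thread.
2. What is partial doing in create_scoped_session above?
    As mentioned, scoped_session's session_factory must be a callable, and the registry calls it with no arguments. _SignallingSession needs the SQLAlchemy instance and the options, so partial binds them in advance.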
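
On the second question, the effect of partial can be seen with the standard library alone. This is a sketch under stated assumptions: FakeSignallingSession and FakeDB are hypothetical stand-ins, and the keyword options are made up for illustration.

```python
from functools import partial


class FakeSignallingSession(object):
    """Hypothetical session class that needs arguments to be constructed."""

    def __init__(self, db, autocommit=False, autoflush=True):
        self.db = db
        self.autocommit = autocommit
        self.autoflush = autoflush


class FakeDB(object):
    """Hypothetical stand-in for the SQLAlchemy extension object."""


db = FakeDB()
options = {"autoflush": False}

# Bind the arguments now so the factory can later be called with none.
session_factory = partial(FakeSignallingSession, db, **options)

session = session_factory()
print(session.db is db, session.autoflush)
```

Each call to session_factory() builds a fresh session with the same bound arguments, which is the zero-argument shape a registry's createfunc needs.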

Now you know why self.session.remove() must be called at the end of every request: otherwise the dict that stores the sessions would keep growing.

Here is my understanding of the lazy option of db.relationship. Consider the following code:
class Role(db.Model):
    __tablename__ = 'roles'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64), unique=True)
    users = db.relationship('User', backref='role', lazy='dynamic')
 
 
class User(db.Model):
    __tablename__ = 'users'
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), unique=True, index=True)
    role_id = db.Column(db.Integer, db.ForeignKey('roles.id'))
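Assume role is an already-loaded instance of Role.
lazy='dynamic' => role.users does not return a list of User objects; it returns a sqlalchemy.orm.dynamic.AppenderBaseQuery object.
                The SQL is only executed when you call role.users.all(), which has the advantage that you can keep filtering first.

lazy='select' => role.users returns the list of User instances directly, i.e. the SQL is executed right away.

Note: db.session.commit only actually issues an UPDATE when an object has changed.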

References

https://stackoverflow.com/questions/39480914/why-db-session-remove-must-be-called where the question came from
http://www.cnblogs.com/ctztake/p/8277372.html a surface-level overview
https://blog.csdn.net/yueguanghaidao/article/details/40016235 an expert's walkthrough
http://docs.sqlalchemy.org/en/latest/orm/contextual.html#using-thread-local-scope-with-web-applications the documentation
