SQLAlchemy tutorial series, part 5: Session and scoped_session

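What is a session in SQLAlchemy, and what is scoped_session?

The difference between Session and scoped_session in SQLAlchemy

Abstract: this article explains what a Session is and what scoped_session is.

  1. What is the difference between Session and scoped_session in SQLAlchemy?

  2. How should sessions be managed?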

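A Session is simply a conversation with the database: the object through which your code talks to it.

The official documentation puts it this way: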

The orm.mapper() function and declarative extensions are the primary configurational interface for the ORM. Once mappings are configured, the primary usage interface for persistence operations is the Session.

In other words, the session is what you use to operate on the database: inserts, deletes, updates and queries all go through it.
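To make "create, read, update, delete through a session" concrete, here is a minimal sketch. It assumes an in-memory SQLite database and a hypothetical `Note` model that is not part of this article; the import fallback is only there so the sketch runs on both older and newer SQLAlchemy releases.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # SQLAlchemy 1.3

Base = declarative_base()


class Note(Base):  # hypothetical model, used only in this sketch
    __tablename__ = "note"
    id = Column(Integer, primary_key=True)
    text = Column(String(64))


engine = create_engine("sqlite://")  # in-memory SQLite: an assumption for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Create: the row is written when the session commits.
session.add(Note(text="hello"))
session.commit()

# Read
note = session.query(Note).filter_by(text="hello").one()

# Update: changing an attribute on a loaded object is enough.
note.text = "hello again"
session.commit()
updated_text = session.query(Note).one().text

# Delete
session.delete(note)
session.commit()
remaining = session.query(Note).count()

session.close()
```

Every step goes through the same `session` object; nothing reaches the database until `commit()` (or a flush) runs.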

To make the examples concrete, first define a base class.

# base.py

from datetime import datetime

from secure import XINYONGFEI_BI_URL
from sqlalchemy import Column, DateTime, create_engine
from sqlalchemy.ext.declarative import declarative_base

# Written against SQLAlchemy 1.3 (the version the linked docs describe);
# the `encoding` argument to create_engine() was removed in 2.0.
engine = create_engine(XINYONGFEI_BI_URL, pool_size=10, pool_recycle=7200,
                       pool_pre_ping=True, encoding='utf-8')

_Base = declarative_base(bind=engine, name='base')


class Base(_Base):
    # Abstract parent: no table of its own, only shared options and columns.
    __abstract__ = True
    __table_args__ = {
        'mysql_engine': 'InnoDB',
        'mysql_charset': 'utf8',
        # `useexisting` is the deprecated spelling of this option
        'extend_existing': True
    }

    create_time = Column(DateTime, nullable=False, default=datetime.now)

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
@Time    : 2019/5/5 18:21
@File    : person.py
@Author  : [email protected]
"""

from base import Base
from sqlalchemy import Column, Integer, String


class Person(Base):
    __tablename__ = 'Person'

    id = Column(Integer, primary_key=True, autoincrement=True)
    name = Column(String(length=64), comment='name')
    mobile = Column(String(length=13), comment='mobile number')
    id_card_number = Column(String(length=64), comment='ID card number')

    def __str__(self):
        return '%s(name=%r,mobile=%r,id_card_number=%r)' % (
            self.__class__.__name__,
            self.name,
            self.mobile,
            self.id_card_number
        )

    __repr__ = __str__



if __name__ == '__main__':
    person = Person(name='frank-' + 'job4', mobile='4444444444', id_card_number='123456789')
    person1 = Person(name='frank-' + 'job1', mobile='111111111', id_card_number='11111111111')

    print(person)

First create a session and try a few operations on the Session object.

>>> from sqlalchemy.orm import sessionmaker, scoped_session
>>> session_factory = sessionmaker(bind=engine)
>>> person = Person(name='frank-' + 'job3', mobile='111111', id_card_number='123456789')
>>> session = session_factory()
>>> s1 = session_factory()
>>> s1
<sqlalchemy.orm.session.Session object at 0x107ec8c18>
>>> s2 = session_factory() 
>>> s1,s2
(<sqlalchemy.orm.session.Session object at 0x107ec8c18>, <sqlalchemy.orm.session.Session object at 0x107ee3ac8>)
>>> s1 is s2 
False
>>> id(s1),id(s2)
(4427910168, 4428020424)

>>> s1.add(person)
>>> s2.add(person)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1835, in add
    self._save_or_update_state(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1848, in _save_or_update_state
    self._save_or_update_impl(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2163, in _save_or_update_impl
    self._save_impl(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2115, in _save_impl
    to_attach = self._before_attach(state, obj)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2238, in _before_attach
    state.session_id, self.hash_key))
sqlalchemy.exc.InvalidRequestError: Object '<Person at 0x107ec8128>' is already attached to session '11' (this is '12')

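Adding person through s2 raises an error: the object is already attached to another session, and an object cannot be attached to a second session while it still belongs to the first.

If s1 and s2 are given different Person objects, there is no problem.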
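The ownership rule can be checked directly. The sketch below is an illustration under assumptions, not code from this article: it uses a cut-down `Person` model, no database at all (nothing is flushed), `object_session()` to ask which session owns an object, and `Session.expunge()` as one way to hand a not-yet-committed object from one session to another without closing the first.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.exc import InvalidRequestError
from sqlalchemy.orm import object_session, sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # SQLAlchemy 1.3

Base = declarative_base()


class Person(Base):  # simplified stand-in for the article's Person model
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


session_factory = sessionmaker()  # no engine bound: nothing is flushed here
s1, s2 = session_factory(), session_factory()

p = Person(name="frank")
s1.add(p)
owner_before = object_session(p)  # the session that currently owns p

try:
    s2.add(p)  # p still belongs to s1, so this is rejected
    second_add_failed = False
except InvalidRequestError:
    second_add_failed = True

s1.expunge(p)                      # detach p from s1 without closing s1
owner_between = object_session(p)  # no owner any more
s2.add(p)                          # now s2 may take it
owner_after = object_session(p)
```

After the expunge, `p` has no owning session, which is why the second `s2.add(p)` succeeds.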


>>> person4 = Person(name='frank-' + 'job4', mobile='4444444444', id_card_number='123456789')
>>> person1 = Person(name='frank-' + 'job1', mobile='111111', id_card_number='123456789')
>>> 
>>> s1.rollback()
>>> s1.add(person1)
>>> s2.add(person4)
>>> s1.commit()  # commit the pending data
>>> s2.commit()  # commit; the row is written to the database

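Data is written to the database only once commit() has run. After the two commits above, the database contains two rows.

So an object that inherits from Base can be attached to only one session at a time: once one session holds it, other sessions cannot attach it. The session and the object are effectively bound together.

Adding and committing the same object several times still writes only one row to the database.

Committing several different objects writes several rows.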

For the demonstration, define a helper:

def get_session_no_scope():
    engine = create_engine(XINYONGFEI_BI_URL, pool_size=5, pool_recycle=7200,
                           pool_pre_ping=True, encoding='utf-8')

    session_factory = sessionmaker(bind=engine)

    session = session_factory()
    return session
    



>>> p= Person(name='frank', mobile='11111111',id_card_number= '123456789')
>>> 
>>> s1 = get_session_no_scope()
>>> s1.add(p)
>>> s1.add(p)
>>> s1.add(p)
>>> s1.commit()


After the commit the database contains one row, not three.
Until s1 is closed, no other session can add p. (A rollback releases an object only if it has not been committed yet.)


s2 = get_session_no_scope()
s2.add(p)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1835, in add
    self._save_or_update_state(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1848, in _save_or_update_state
    self._save_or_update_impl(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2165, in _save_or_update_impl
    self._update_impl(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2148, in _update_impl
    to_attach = self._before_attach(state, obj)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2238, in _before_attach
    state.session_id, self.hash_key))
sqlalchemy.exc.InvalidRequestError: Object '<Person at 0x10e677f28>' is already attached to session '5' (this is '6')
s1.close()
s2.add(p)
s2.commit()


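After s1.close(), s2 can add and commit p without error, yet no new row appears in the database. Can the same object be committed only once?

Testing confirms that the same object is inserted into the database only once. After the first commit the object already carries its primary key, so a later session treats it as an existing row rather than a new one, and commit() has nothing to INSERT.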
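One way to see why the second commit inserts nothing is to look at the object's state at each step. This is a sketch under assumptions (in-memory SQLite, a cut-down `Person` model); `inspect()` returns SQLAlchemy's per-object state, whose `transient`, `pending`, `persistent` and `detached` flags describe where the object stands.

```python
from sqlalchemy import Column, Integer, String, create_engine, inspect
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # SQLAlchemy 1.3

Base = declarative_base()


class Person(Base):  # simplified stand-in for the article's Person model
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


engine = create_engine("sqlite://")  # in-memory SQLite: an assumption for the demo
Base.metadata.create_all(engine)
session_factory = sessionmaker(bind=engine)

p = Person(name="frank")
was_transient = inspect(p).transient    # brand new, no session, no row

s1 = session_factory()
s1.add(p)
was_pending = inspect(p).pending        # in s1, waiting for an INSERT

s1.commit()
was_persistent = inspect(p).persistent  # in s1, backed by a row

s1.close()
was_detached = inspect(p).detached      # no session, but still has its identity

s2 = session_factory()
s2.add(p)       # re-attached as an existing row, not as a new one
s2.commit()     # nothing to INSERT
row_count = s2.query(Person).count()
s2.close()
```

To write a second row with the same values, create a second `Person` object instead of re-adding the first one.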

engine = create_engine(XINYONGFEI_BI_URL, pool_size=5, pool_recycle=7200,
                       pool_pre_ping=True, encoding='utf-8')

session_factory = sessionmaker(bind=engine)
Session = scoped_session(session_factory)

def get_session_no_scope():
    return session_factory()

>>> s1 = get_session_no_scope()
>>> person = Person(name='frank-' + 'job1', mobile='111111', id_card_number='123456789')
... 
>>> person
Person(name='frank-job1',mobile='111111',id_card_number='123456789')
>>> s1.add(person)
>>> s1.commit()
>>> s1.add(person)
>>> s1.commit()
>>> s1.close()


At this point the database has gained one row, not two. Once session.add(person) has run, other sessions cannot work with that object; only after session.close() can another session take it over.

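So far it is clear that a session is the main interface through which a user works with the database: it offers a fairly high-level API for create, read, update and delete operations. In addition, an object can belong to only one session at a time.

Summary:
Given two sessions s1 and s2, once s1.add(person) has run, s2 cannot work with person until s1.close() is called (or, if person has not been committed yet, until s1.rollback()).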

del session, session2
s1 =session_factory()
s2 =session_factory()
s1,s2
(<sqlalchemy.orm.session.Session object at 0x1115586d8>, <sqlalchemy.orm.session.Session object at 0x112030d30>)

p= Person(name='frank-' + 'job4', mobile='4444444444', id_card_number='123456789')

p.name='frank111'
p
Person(name='frank111',mobile='4444444444',id_card_number='123456789')
s1.add(p)
s1.rollback()
s2.add(p)
s2.rollback()
s1.add(p)
s1.commit()
s2.add(p)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1947, in add
    self._save_or_update_state(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1960, in _save_or_update_state
    self._save_or_update_impl(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2303, in _save_or_update_impl
    self._update_impl(state)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2286, in _update_impl
    to_attach = self._before_attach(state, obj)
  File "/Users/frank/.local/share/virtualenvs/mysqlalchemy-demo-0htClb7e/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 2374, in _before_attach
    % (state_str(state), state.session_id, self.hash_key)
sqlalchemy.exc.InvalidRequestError: Object '<Person at 0x1120ff208>' is already attached to session '6' (this is '7')
s1.close()
s2.add(p)
s2.commit()

2. What is scoped_session?

Sessions produced by scoped_session

engine = create_engine(XINYONGFEI_BI_URL, pool_size=5, pool_recycle=7200,
                       pool_pre_ping=True, encoding='utf-8')

session_factory = sessionmaker(bind=engine)
Session = scoped_session(session_factory)


def get_session():
    return Session()

>>> 
>>> s3= get_session()
>>> p2 = Person(name='frank-2', mobile='222222', id_card_number='123456789')
>>> s4= get_session()

>>> s3 is s4
True
>>> s3.add(p2)
>>> s4.add(p2)
>>> s4.commit()

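Here two session variables can work with the same object at once. Printing their ids shows why: both names refer to the same Session object.

This looks a bit like the singleton pattern: there is only one object, one session.

That is not quite what happens. scoped_session exists so that one global name can be used safely from many threads: each thread gets its own Session.

Sessions obtained from a scoped_session object do not run into the earlier problem, because the scoped_session object maintains a registry that hands back the same session object on every call.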
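The registry behaviour can be observed without a database. In this small sketch no engine is bound (an assumption that works because only object identity is inspected): repeated calls in one thread return the same Session, and `remove()` discards it so that the next call creates a fresh one.

```python
from sqlalchemy.orm import scoped_session, sessionmaker

# No engine bound: we only look at which Session object comes back.
Session = scoped_session(sessionmaker())

a = Session()
b = Session()
same_in_thread = a is b          # the registry returned the stored session

Session.remove()                 # close and forget this thread's session
c = Session()
fresh_after_remove = c is not a  # a new Session was created on demand
```

This is why `s3 is s4` was `True` above: both names came from the same registry in the same thread.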

The official documentation introduces scoped_session here: https://docs.sqlalchemy.org/en/13/orm/contextual.html#unitofwork-contextual


Thread-Local Scope
Users who are familiar with multithreaded programming will note that representing anything as a global variable is usually a bad idea, as it implies that the global object will be accessed by many threads concurrently. The Session object is entirely designed to be used in a non-concurrent fashion, which in terms of multithreading means “only in one thread at a time”. So our above example of scoped_session usage, where the same Session object is maintained across multiple calls, suggests that some process needs to be in place such that multiple calls across many threads don’t actually get a handle to the same session. We call this notion thread local storage, which means, a special object is used that will maintain a distinct object per each application thread. 

Python provides this via the threading.local() construct. The scoped_session object by default uses this object as storage, so that a single Session is maintained for all who call upon the scoped_session registry, but only within the scope of a single thread. Callers who call upon the registry in a different thread get a Session instance that is local to that other thread.

Using this technique, the scoped_session provides a quick and relatively simple (if one is familiar with thread-local storage) way of providing a single, global object in an application that is safe to be called upon from multiple threads.

The scoped_session.remove() method, as always, removes the current Session associated with the thread, if any. However, one advantage of the threading.local() object is that if the application thread itself ends, the “storage” for that thread is also garbage collected. So it is in fact “safe” to use thread local scope with an application that spawns and tears down threads, without the need to call scoped_session.remove(). However, the scope of transactions themselves, i.e. ending them via Session.commit() or Session.rollback(), will usually still be something that must be explicitly arranged for at the appropriate time, unless the application actually ties the lifespan of a thread to the lifespan of a transaction.

In short: a Session is designed to be used by only one thread at a time, so a global scoped_session needs a mechanism that keeps calls from different threads from receiving the same session. That mechanism is thread-local storage, which Python provides through threading.local() and which scoped_session uses as its default storage. (My understanding: different threads get different sessions.)

Within one thread, everyone who calls the registry shares a single Session. scoped_session.remove() discards the current thread's Session, and the storage is also cleaned up when the thread ends. Transactions are a separate matter: they still have to be ended explicitly with Session.commit() or Session.rollback() at the appropriate time, unless the application ties the lifetime of a thread to the lifetime of a transaction.
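The thread-local idea itself needs nothing but the standard library. The class below is a hypothetical, simplified registry written for illustration; it is not SQLAlchemy's implementation, and `object` merely stands in for a session factory.

```python
import threading


class ThreadLocalRegistrySketch:
    """Hypothetical, simplified registry: one object per thread."""

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()  # attributes are private to each thread

    def __call__(self):
        # Create the object on first use in this thread, then reuse it.
        if not hasattr(self._local, "value"):
            self._local.value = self._factory()
        return self._local.value

    def remove(self):
        # Forget the current thread's object, if any.
        if hasattr(self._local, "value"):
            del self._local.value


registry = ThreadLocalRegistrySketch(object)  # `object` stands in for a session factory
seen = {}  # thread name -> the object that thread received


def worker():
    first, second = registry(), registry()
    assert first is second  # stable within one thread
    seen[threading.current_thread().name] = first


threads = [threading.Thread(target=worker, name=f"t{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# `seen` keeps every object alive, so distinct ids mean distinct objects.
distinct = len({id(obj) for obj in seen.values()})
```

Each of the three threads ends up with its own object, while repeated calls inside one thread return the same one.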

Let's see how scoped_session is actually used with multiple threads.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
@Time    : 2019/5/5 18:33
@File    : test_session.py
@Author  : [email protected]

# person = Person(name='frank-' + 'job1', mobile='111111', id_card_number='123456789')
session = get_scoped_session()

"""

import threading
import time

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session
from secure import XINYONGFEI_BI_URL

from model.person import Person

engine = create_engine(XINYONGFEI_BI_URL, pool_size=5, pool_recycle=7200,
                       pool_pre_ping=True, encoding='utf-8')

session_factory = sessionmaker(bind=engine)
Session = scoped_session(session_factory)


def get_scoped_session():
    """
    scoped_session
    :return:
    """
    return Session()


def get_session_no_scope():
    """ noscoped_session """
    return session_factory()


session = get_scoped_session()


def job(name):
    global session
    # Print the id of the session this thread is using
    print(f"id session:{id(session)}")

    person = Person(name='frank-' + name, mobile='111111', id_card_number='123456789')
    print(f"{name} person is add..")
    session.add(person)

    time.sleep(2)
    if name == 'job3':
        # Only thread 3 commits; the other threads do not.
        session.commit()


if __name__ == '__main__':

    thread_list = []

    # Create 5 threads
    for i in range(5):
        name = 'job' + str(i)
        t = threading.Thread(target=job, name=name, args=(name,))

        thread_list.append(t)

    for t in thread_list:
        t.start()

    for t in thread_list:
        t.join()

Five threads are started here; the intent is for thread 3 to commit its person object while the other threads do not.

id session:4541434120
job0 person is add..
id session:4541434120
job1 person is add..
id session:4541434120
job2 person is add..
id session:4541434120
job3 person is add..
id session:4541434120
job4 person is add..

Process finished with exit code 0

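As the output above shows, every thread uses the same session: the ids are identical, because Session() was called once at module level in the main thread and the workers reuse that object. As soon as one thread commits, the other threads are affected too; the session is effectively shared.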


To keep threads from interfering with one another, let each job obtain its own session:
ask for the session inside the job, when the task starts.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
@Time    : 2019/5/5 18:33
@File    : test_session.py
@Author  : [email protected]

# person = Person(name='frank-' + 'job1', mobile='111111', id_card_number='123456789')
session = get_scoped_session()

"""

import threading
import time

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session
from secure import XINYONGFEI_BI_URL

from model.person import Person

engine = create_engine(XINYONGFEI_BI_URL, pool_size=5, pool_recycle=7200,
                       pool_pre_ping=True, encoding='utf-8')

session_factory = sessionmaker(bind=engine)
Session = scoped_session(session_factory)


def get_scoped_session():
    """
    scoped_session
    :return:
    """
    return Session()


def get_session_no_scope():
    """ noscoped_session """
    return session_factory()


def job(name):
    # Obtain the session inside the job, so each thread gets its own session
    session = get_scoped_session()
    print(f"id session:{id(session)}")

    person = Person(name='frank-' + name, mobile='111111', id_card_number='123456789')
    print(f"{name} person is add..")
    session.add(person)

    time.sleep(2)
    if name == 'job3':
        # Only thread 3 commits; the other threads do not.
        session.commit()
        session.close()


if __name__ == '__main__':

    thread_list = []

    # Create 5 threads
    for i in range(5):
        name = 'job' + str(i)
        t = threading.Thread(target=job, name=name, args=(name,))

        thread_list.append(t)

    for t in thread_list:
        t.start()

    for t in thread_list:
        t.join()

Output:

id session:4584472304
job0 person is add..
id session:4584866200
job1 person is add..
id session:4584866704
job2 person is add..
id session:4584867208
job3 person is add..
id session:4584867936
job4 person is add..

Now each thread prints a different session id.

This time the database contains only one row.


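By default scoped_session creates sessions per thread: within one thread every call returns the same session, and different threads get different sessions.

Under the default configuration, calling the registry invokes the __call__ method of a ThreadLocalRegistry object, which provides the thread isolation that gives different threads different sessions.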
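The difference between the two scripts above comes down to where `Session()` is called. The following sketch, which binds no engine (an assumption, since only identities are compared), contrasts a session captured once in the main thread with sessions looked up inside each worker.

```python
import threading

from sqlalchemy.orm import scoped_session, sessionmaker

# No engine bound: only the identity of the returned sessions matters here.
Session = scoped_session(sessionmaker())

captured_in_main = Session()  # what a module-level `session = Session()` does
per_thread = []               # sessions obtained inside the workers
lock = threading.Lock()


def job():
    s = Session()  # looked up in the calling thread's own storage
    with lock:
        per_thread.append(s)  # keep a reference so the objects stay comparable
    Session.remove()          # tidy up this thread's session when the job ends


threads = [threading.Thread(target=job) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

distinct_sessions = len({id(s) for s in per_thread})
main_reused = any(s is captured_in_main for s in per_thread)
```

Workers that call the registry themselves never receive the main thread's session; a worker that reuses a module-level variable always does.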


Now rerun the test above with a plain Session.

1. First use a single global session. As a result, every row is written to the database.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
@Time    : 2019/5/5 18:33
@File    : test_session.py
@Author  : [email protected]

# person = Person(name='frank-' + 'job1', mobile='111111', id_card_number='123456789')
session = get_session_no_scope()

"""

import threading
import time

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, scoped_session
from secure import XINYONGFEI_BI_URL

from model.person import Person

engine = create_engine(XINYONGFEI_BI_URL, pool_size=5, pool_recycle=7200,
                       pool_pre_ping=True, encoding='utf-8')

session_factory = sessionmaker(bind=engine)
Session = scoped_session(session_factory)


def get_session_no_scope():
    """ noscoped_session """
    return session_factory()


session = get_session_no_scope()


def job(name):
    global session

    print(f"id session:{id(session)}")

    person = Person(name='frank-' + name, mobile='111111', id_card_number='123456789')
    print(f"{name} person is add..")
    session.add(person)

    time.sleep(1)
    if name == 'job3':
        # Only thread 3 commits; the other threads do not.
        session.commit()
        session.close()


if __name__ == '__main__':

    thread_list = []

    # Create 5 threads
    for i in range(5):
        name = 'job' + str(i)
        t = threading.Thread(target=job, name=name, args=(name,))

        thread_list.append(t)

    for t in thread_list:
        t.start()

    for t in thread_list:
        t.join()

Output:

id session:4418463896
job0 person is add..
id session:4418463896
job1 person is add..
id session:4418463896
job2 person is add..
id session:4418463896
job3 person is add..
id session:4418463896
job4 person is add..

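Figure 4: all of the rows are written to the database.

Every row ended up in the database, because all five objects were added to the one shared session and a single commit() flushes everything pending in that session.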
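The same effect can be reproduced without threads, which avoids sharing a Session across threads at all. This is a single-threaded sketch under assumptions (in-memory SQLite, a cut-down `Person` model): five objects are added to one session, and one `commit()` writes all of them.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # SQLAlchemy 1.3

Base = declarative_base()


class Person(Base):  # simplified stand-in for the article's Person model
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


engine = create_engine("sqlite://")  # in-memory SQLite: an assumption for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Five "jobs" add their object to the same session...
for i in range(5):
    session.add(Person(name=f"frank-job{i}"))

pending = len(session.new)  # objects waiting for an INSERT

# ...and one commit, issued on behalf of a single job, flushes them all.
session.commit()
rows = session.query(Person).count()
session.close()
```

A session has no notion of which caller added which object; whatever is pending at commit time is written.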

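Summary

The difference between Session and scoped_session: scoped_session adds per-thread isolation. Different threads get different sessions, and within one thread every call returns the same session.
Both are fundamentally tools for working with the database; a plain Session is simply only suitable for use within a single thread.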
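For the session-management question raised at the start, one common pattern is to wrap each unit of work in a context manager that commits on success, rolls back on error and always closes. The helper name `session_scope` and the in-memory SQLite setup are assumptions for this sketch; it is one possible arrangement, not the only one.

```python
from contextlib import contextmanager

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import sessionmaker

try:
    from sqlalchemy.orm import declarative_base  # SQLAlchemy 1.4+
except ImportError:
    from sqlalchemy.ext.declarative import declarative_base  # SQLAlchemy 1.3

Base = declarative_base()


class Person(Base):  # simplified stand-in for the article's Person model
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


engine = create_engine("sqlite://")  # in-memory SQLite: an assumption for the demo
Base.metadata.create_all(engine)
session_factory = sessionmaker(bind=engine)


@contextmanager
def session_scope():
    """Hypothetical helper: one session per unit of work."""
    session = session_factory()
    try:
        yield session
        session.commit()      # success: make the work permanent
    except Exception:
        session.rollback()    # failure: discard everything pending
        raise
    finally:
        session.close()       # always release the session


with session_scope() as s:
    s.add(Person(name="ok"))

try:
    with session_scope() as s:
        s.add(Person(name="discarded"))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

with session_scope() as s:
    names = [p.name for p in s.query(Person).order_by(Person.id)]
```

In a threaded program the same helper can be called inside each job, so that no session outlives the task that created it.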

References

1 Thread-safety issues with the SQLAlchemy connection pool https://blog.csdn.net/daijiguo/article/details/79486294
2 sqlalchemy_session http://rhel.cc/2016/07/14/sqlalchemy-session/
3 Session, official documentation https://docs.sqlalchemy.org/en/13/orm/session.html
4 scoped_session https://docs.sqlalchemy.org/en/13/orm/contextual.html#unitofwork-contextual
5 SQLAlchemy basics: autoflush and autocommit https://zhuanlan.zhihu.com/p/48994990
6 SQLAlchemy frequently asked questions https://docs.sqlalchemy.org/en/13/orm/session_basics.html#session-faq-whentocreate

Share the joy, keep the memories. 2019-05-13 19:19:27 --frank