A Study of the Nested Transaction Commit Flow in PostgreSQL

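Code flow chart for the final commit of the parent transaction

Transaction commit call flow

The part of this flow most worth discussing is the TransactionIdSetTreeStatus function.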

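TransactionIdSetTreeStatus() explained

Code comment

Record the final state of transaction entries in the commit log for a transaction and its subtransaction tree. Take care to ensure this is efficient, and as atomic as possible.

xid is a single xid to set status for. This will typically be the top-level transaction ID for a top-level commit or abort. It can also be a subtransaction when we record transaction aborts.

subxids is an array of xids of length nsubxids, representing subtransactions in the tree of xid. In various cases nsubxids may be zero.

lsn must be the WAL location of the commit record when recording an async commit. For a synchronous commit it can be InvalidXLogRecPtr, since the caller guarantees the commit record is already flushed in that case. It should be InvalidXLogRecPtr for abort cases, too.

In the commit case, atomicity is limited by whether all the subxids are in the same CLOG page as xid. If they all are, then the lock will be grabbed only once, and the status will be set to committed directly. Otherwise we must

1. set sub-committed all subxids that are not on the same page as the main xid
2. atomically set committed the main xid and the subxids on the same page
3. go over the first bunch again and set them committed

Note that as far as concurrent checkers are concerned, main transaction commit as a whole is still atomic.

Example:

TransactionId t commits and has subxids t1, t2, t3, t4
t is on page p1, t1 is also on p1, t2 and t3 are on p2, t4 is on p3

1. update pages 2-3:
   page 2: set t2, t3 as sub-committed
   page 3: set t4 as sub-committed
2. update page 1:
   set t1 as sub-committed,
   then set t as committed,
   then set t1 as committed
3. update pages 2-3:
   page 2: set t2, t3 as committed
   page 3: set t4 as committed

NB: this is a low-level routine and is NOT the preferred entry point for most uses; functions in transam.c are the intended callers.

XXX Think about issuing FADVISE_WILLNEED on pages that we will need, but aren't yet in cache, as well as hinting pages not to fall out of cache yet.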
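The three-step ordering in the comment can be sketched in a few lines of Python. This is only a model under stated assumptions, not PostgreSQL's C code: XACTS_PER_PAGE = 4 is a made-up page size chosen so that the xids 4, 5, 8, 9 and 12 land on pages 1, 2 and 3 like t, t1, t2, t3 and t4 in the example; page_of, set_tree_status and the string statuses are hypothetical names; and subxids are assumed to be sorted in ascending order.

```python
from itertools import groupby

XACTS_PER_PAGE = 4  # hypothetical; real CLOG pages hold far more entries


def page_of(xid):
    """Map an xid to the (model) CLOG page that stores its status."""
    return xid // XACTS_PER_PAGE


def set_tree_status(xid, subxids, status):
    """Return the ordered list of (page, xid, status) writes for a tree.

    status is "committed" or "aborted"; subxids must be sorted ascending.
    """
    writes = []
    first_page = page_of(xid)
    on_first = [s for s in subxids if page_of(s) == first_page]
    elsewhere = [s for s in subxids if page_of(s) != first_page]

    def write_by_pages(xids, st):
        # Visit the remaining subxids one page at a time.
        for page, group in groupby(xids, key=page_of):
            writes.extend((page, x, st) for x in group)

    if status == "committed":
        # Step 1: subxids on other pages become sub-committed first.
        write_by_pages(elsewhere, "sub-committed")
        # Step 2, first part: subxids on the main xid's page, also sub-committed.
        writes.extend((first_page, s, "sub-committed") for s in on_first)

    # Step 2, rest: the main xid, then the subxids that share its page.
    writes.append((first_page, xid, status))
    writes.extend((first_page, s, status) for s in on_first)
    # Step 3: go over the other pages again with the final status.
    write_by_pages(elsewhere, status)
    return writes


if __name__ == "__main__":
    # t=4, t1=5 on page 1; t2=8, t3=9 on page 2; t4=12 on page 3.
    for write in set_tree_status(4, [5, 8, 9, 12], "committed"):
        print(write)
```

For an abort, the same function skips the sub-committed pass and simply writes "aborted" page by page, starting with the main xid's page.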

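Analysis

This involves the concept of a subtransaction. In PostgreSQL, a subtransaction arises when a transaction, somewhere between its start and its end, issues a SAVEPOINT and may later ROLLBACK TO that savepoint instead of to the start of the transaction. This is common in practice: a transaction may finally commit even though parts of it were rolled back along the way, or it may set several savepoints. The status of the parent transaction and of its subtransactions must therefore be written as atomically as possible, otherwise transaction visibility would be wrong.

The code comment on TransactionIdSetTreeStatus gives a fairly intuitive example of how these writes are ordered.

Take a transaction t with subtransactions t1, t2, t3 and t4, where t and t1 map to CLOG page p1, t2 and t3 to p2, and t4 to p3. The writes then happen in this order:

1. On p2, set t2 and t3 to sub-committed; then on p3, set t4 to sub-committed.
2. On p1, set t1 to sub-committed, then set t to committed, then set t1 to committed.
3. Set t2 and t3 to committed, then set t4 to committed.

Rollback also goes through TransactionIdSetTreeStatus. The only differences are that the calling function is TransactionIdAbortTree and that the status written is TRANSACTION_STATUS_ABORTED, which records the transaction as aborted. Semantically this is unproblematic: atomicity requires that the data of an aborted transaction is invisible anyway.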
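Why the intermediate sub-committed state helps can be checked with a small replay of the write order. The sketch assumes a reader that, on finding a sub-committed entry, looks up the parent and uses the parent's verdict instead; that is my understanding of how TransactionIdDidCommit in transam.c behaves, so treat it as an assumption. The names did_commit and first_inconsistent_step, the dict-based clog and parent_of, and the concrete xids are all hypothetical stand-ins.

```python
IN_PROGRESS, COMMITTED, ABORTED, SUB_COMMITTED = 0, 1, 2, 3
INVALID_XID = 0


def did_commit(xid, clog, parent_of):
    """Model reader: a sub-committed xid inherits its parent's verdict."""
    status = clog.get(xid, IN_PROGRESS)
    if status == SUB_COMMITTED:
        parent = parent_of.get(xid, INVALID_XID)
        return parent != INVALID_XID and did_commit(parent, clog, parent_of)
    return status == COMMITTED


def first_inconsistent_step(writes, tree, parent_of):
    """Replay writes one by one. Return the index of the first write after
    which readers disagree about members of the tree, or None."""
    clog = {}
    for step, (xid, status) in enumerate(writes):
        clog[xid] = status
        verdicts = {did_commit(x, clog, parent_of) for x in tree}
        if len(verdicts) > 1:
            return step
    return None


# t=4 with subtransactions t1=5, t2=8, t3=9, t4=12 (hypothetical xids).
TREE = [4, 5, 8, 9, 12]
PARENT_OF = {5: 4, 8: 4, 9: 4, 12: 4}

DOCUMENTED_ORDER = [
    (8, SUB_COMMITTED), (9, SUB_COMMITTED), (12, SUB_COMMITTED),  # step 1
    (5, SUB_COMMITTED), (4, COMMITTED), (5, COMMITTED),           # step 2
    (8, COMMITTED), (9, COMMITTED), (12, COMMITTED),              # step 3
]
NAIVE_ORDER = [(x, COMMITTED) for x in TREE]  # no sub-committed pass

if __name__ == "__main__":
    print(first_inconsistent_step(DOCUMENTED_ORDER, TREE, PARENT_OF))
    print(first_inconsistent_step(NAIVE_ORDER, TREE, PARENT_OF))
```

Under these assumptions, the documented order flips the whole tree from "not committed" to "committed" at the single write that marks t committed, while the naive order already leaves readers disagreeing after its first write.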

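What is a subtransaction (subtrans)?

Using a savepoint creates a subtransaction. Like its parent, a subtransaction may consume an XID. Once an XID has been assigned to a subtransaction, atomic CLOG updates come into play, because the CLOG entries of the parent transaction and of all its subtransactions must stay consistent.

When a subtransaction does not consume an XID, it is told apart by its SubTransactionId.

Reference: src/backend/access/transam/README, section "Transaction and Subtransaction Numbering"

Both transactions and subtransactions can have XIDs. Like a transaction, a subtransaction is assigned an XID only when it actually needs one, so a transaction that has subtransactions may consume several XIDs.

Also note that before a subtransaction can be assigned an XID, its parent must be assigned one first. This guarantees that a subtransaction's XID is allocated after its parent's.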
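The "parent first" rule can be illustrated with a small sketch. It is a model only: the Xact class, assign_xid and the counter starting at 100 are hypothetical and stand in for PostgreSQL's own XID allocation, which this example does not reproduce.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Xact:
    """A (sub)transaction that starts out without an XID."""
    parent: Optional["Xact"] = None
    xid: Optional[int] = None


def assign_xid(xact, allocator):
    """Assign an XID lazily, giving every unassigned ancestor one first."""
    chain = []
    node = xact
    # Collect this transaction and all ancestors that still lack an XID.
    while node is not None and node.xid is None:
        chain.append(node)
        node = node.parent
    # Hand out XIDs from the outermost ancestor inward, so that every
    # child ends up with a later XID than its parent.
    for node in reversed(chain):
        node.xid = next(allocator)
    return xact.xid


if __name__ == "__main__":
    allocator = count(100)          # hypothetical XID counter
    top = Xact()
    savepoint_a = Xact(parent=top)
    savepoint_b = Xact(parent=savepoint_a)
    assign_xid(savepoint_b, allocator)
    print(top.xid, savepoint_a.xid, savepoint_b.xid)
```

A transaction that never calls assign_xid keeps xid set to None, which mirrors the point that read-only work does not consume an XID.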

README original text

Transaction and Subtransaction Numbering
----------------------------------------

Transactions and subtransactions are assigned permanent XIDs only when/if they first do something that requires one --- typically, insert/update/delete a tuple, though there are a few other places that need an XID assigned. If a subtransaction requires an XID, we always first assign one to its parent. This maintains the invariant that child transactions have XIDs later than their parents, which is assumed in a number of places.

The subsidiary actions of obtaining a lock on the XID and entering it into pg_subtrans and PG_PROC are done at the time it is assigned.

A transaction that has no XID still needs to be identified for various purposes, notably holding locks. For this purpose we assign a "virtual transaction ID" or VXID to each top-level transaction. VXIDs are formed from two fields, the backendID and a backend-local counter; this arrangement allows assignment of a new VXID at transaction start without any contention for shared memory. To ensure that a VXID isn't re-used too soon after backend exit, we store the last local counter value into shared memory at backend exit, and initialize it from the previous value for the same backendID slot at backend start. All these counters go back to zero at shared memory re-initialization, but that's OK because VXIDs never appear anywhere on-disk.

Internally, a backend needs a way to identify subtransactions whether or not they have XIDs; but this need only lasts as long as the parent top transaction endures. Therefore, we have SubTransactionId, which is somewhat like CommandId in that it's generated from a counter that we reset at the start of each top transaction. The top-level transaction itself has SubTransactionId 1, and subtransactions have IDs 2 and up. (Zero is reserved for InvalidSubTransactionId.) Note that subtransactions do not have their own VXIDs; they use the parent top transaction's VXID.
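The two numbering schemes described in the README can be modelled as follows. This is a sketch under assumptions: the Backend class and its method names are hypothetical, and a VXID is represented simply as a (backend_id, local_counter) tuple.

```python
INVALID_SUBXACT_ID = 0   # reserved, never handed out
TOP_SUBXACT_ID = 1       # the top-level transaction itself


class Backend:
    """Toy backend that hands out VXIDs and SubTransactionIds."""

    def __init__(self, backend_id):
        self.backend_id = backend_id
        self.local_counter = 0  # backend-local, so no shared-memory contention
        self.last_subxact_id = INVALID_SUBXACT_ID

    def start_top_transaction(self):
        """Begin a top-level transaction and return its VXID."""
        self.local_counter += 1
        # The SubTransactionId counter restarts for every top transaction.
        self.last_subxact_id = TOP_SUBXACT_ID
        return (self.backend_id, self.local_counter)

    def start_subtransaction(self):
        """Begin a subtransaction (for example via SAVEPOINT) and return
        its SubTransactionId; it shares the parent's VXID."""
        self.last_subxact_id += 1
        return self.last_subxact_id


if __name__ == "__main__":
    backend = Backend(backend_id=7)
    print(backend.start_top_transaction())   # VXID of the first transaction
    print(backend.start_subtransaction())    # first savepoint
    print(backend.start_subtransaction())    # second savepoint
```

Neither identifier requires an XID, which is why both remain usable for transactions that never write.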

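The subtransaction log

Nested transactions form a transaction tree. To find the root of a given transaction, it is enough to walk upward through the subtransaction log, parent by parent, until reaching a transaction whose parent ID is the invalid transaction ID. That transaction is the root (top-level) transaction being sought.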
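The upward walk can be sketched as a simple loop. In this model the subtransaction log is a plain dict from xid to parent xid, which is a hypothetical stand-in for the on-disk pages; topmost_xid is likewise a made-up name, and 0 plays the role of the invalid transaction ID.

```python
INVALID_XID = 0


def topmost_xid(xid, parent_of):
    """Follow parent links until a transaction with no valid parent is found."""
    while True:
        parent = parent_of.get(xid, INVALID_XID)
        if parent == INVALID_XID:
            return xid       # no valid parent: this is the root
        xid = parent         # climb one level and try again


if __name__ == "__main__":
    # 100 is a top-level transaction; 101 and 103 are its children,
    # and 102 is a child of 101 (a savepoint inside a savepoint).
    parent_of = {101: 100, 102: 101, 103: 100}
    print(topmost_xid(102, parent_of))
```

The loop terminates as long as the parent links form a tree, which the rule that children always have later XIDs than their parents guarantees.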

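The robustness requirements of the subtrans log are quite different from those of the CLOG, because it only needs to record subtransaction information for transactions that are currently open. Its contents are therefore not preserved across a crash or restart. Since nothing has to survive a crash, the log does not interact with the xlog (WAL) and has no redo function. At database startup it is enough to make sure that the currently active subtrans pages are zeroed.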

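Analysis

The log that records the parent-child relationships of nested transactions is the pg_subtrans log.

When the CLOG is synchronized, the committed status of a parent transaction and its subtransactions is updated together, as atomically as possible.

As for the parent-child nesting itself, other nodes only need to know whether a transaction finally committed and can ignore the parent-child relationship.

A subtransaction's commit status is either "sub-committed" or finally "committed". Other nodes only need to check whether the xid in question (including a sub-xid) is finally "committed", and can ignore "sub-committed".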

References

PostgreSQL's clog, starting from the speed of transaction rollback
https://ssl.zzidc.com/chanpinzixun/2019/0801/654.html

Atomic operations on pg_clog and pg_subtrans (subtransactions)
https://blog.csdn.net/postgrechina/article/details/49130709
