Atomikos Transaction Recovery: A Source Code Walkthrough

Atomikos XA Transaction Recovery

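Before walking through the transaction recovery flow, let's first ask why transaction recovery is needed at all. Isn't the XA two-phase commit protocol strongly consistent? To answer this, we need to look at the problems with XA two-phase commit.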

Problem 1: Single point of failure

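Because the coordinator plays such a central role, the participants (RMs) block indefinitely once the coordinator (TM) fails. This is especially serious in phase two: if the coordinator fails there, all participants are still holding locks on transactional resources and cannot complete the transaction. (If the coordinator goes down, a new one can be elected, but that does not unblock the participants that are already stuck because of the crash.)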

Problem 2: Data inconsistency

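In phase two of two-phase commit, once the coordinator starts sending commit requests to the participants, a partial network failure may occur, or the coordinator itself may fail while sending the requests. As a result, only some participants receive the commit request. Those participants perform the commit, while the others, having never received the request, cannot commit. The distributed system as a whole ends up with inconsistent data.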
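To make the inconsistency concrete, here is a minimal Java sketch of a coordinator that fails after delivering the commit request to only some participants. The class `PartialCommitSketch`, the nested `Participant` type, and the `failAfter` parameter are hypothetical names invented for illustration. None of them belongs to Atomikos or to any XA API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical simulation of a coordinator crash during phase two of 2PC. */
class PartialCommitSketch {

    /** A toy participant that stays PREPARED until it receives a commit request. */
    static final class Participant {
        final String name;
        boolean committed = false;

        Participant(String name) {
            this.name = name;
        }

        void commit() {
            committed = true;
        }
    }

    /**
     * Sends commit to the participants in order, but "crashes" once failAfter
     * requests have been delivered. Returns the names of the participants that
     * are still prepared (and still holding locks) when the coordinator stops.
     */
    static List<String> commitWithCrash(List<Participant> participants, int failAfter) {
        int sent = 0;
        for (Participant p : participants) {
            if (sent == failAfter) {
                break; // simulated coordinator failure or network partition
            }
            p.commit();
            sent++;
        }
        List<String> stillPrepared = new ArrayList<>();
        for (Participant p : participants) {
            if (!p.committed) {
                stillPrepared.add(p.name);
            }
        }
        return stillPrepared;
    }

    public static void main(String[] args) {
        List<Participant> rms = Arrays.asList(
                new Participant("rm1"), new Participant("rm2"), new Participant("rm3"));
        System.out.println("Still prepared after crash: " + commitWithCrash(rms, 1));
    }
}
```

In this toy run one participant has committed while the other two remain prepared, which is exactly the state a recovery pass has to clean up.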

How is this solved?

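The solution is simple: at every step of a transaction, we explicitly record the transaction's state in a log. The log can be stored wherever we choose, either locally or in a centralized store. As analyzed earlier, the open source edition of Atomikos uses memory plus a file and stores the log locally. In a clustered system this means that if a node crashes, its transactions cannot be recovered promptly, because the log lives on that node. Recovery has to wait until the service is restarted.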
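The idea of logging every state change before acting on it can be sketched as follows. This is only an illustration under stated assumptions: the class `TxStateLogSketch`, the `txId=STATE` line format, and the state names are all hypothetical and do not reflect the on-disk format Atomikos actually uses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical append-only transaction state log stored in a local file. */
class TxStateLogSketch {
    private final Path file;

    TxStateLogSketch(Path file) {
        this.file = file;
    }

    /** Appends one "txId=STATE" line; call this before performing the action. */
    void record(String txId, String state) throws IOException {
        String line = txId + "=" + state + System.lineSeparator();
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Replays the file and returns the last recorded state of each transaction. */
    Map<String, String> lastKnownStates() throws IOException {
        Map<String, String> states = new LinkedHashMap<>();
        if (!Files.exists(file)) {
            return states;
        }
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int sep = line.indexOf('=');
            if (sep > 0) {
                // later lines overwrite earlier ones, so the last state wins
                states.put(line.substring(0, sep), line.substring(sep + 1));
            }
        }
        return states;
    }
}
```

After a restart, any transaction whose last recorded state is still a committing state is a candidate for recovery. The sketch does not force writes to disk, which a real log would need to do to survive a crash.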

Atomikos transaction recovery in different scenarios

Atomikos provides two mechanisms to handle failures in different scenarios.

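Scenario 1: the service node has not crashed, but recovery is needed for some other reason. In this case a scheduled task performs the recovery. Specifically, the com.atomikos.icatch.imp.TransactionServiceImp.init() method initializes a timer task that performs transaction recovery.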

public synchronized void init ( Properties properties ) throws SysException
    {
        shutdownInProgress_ = false;
        control_ = new com.atomikos.icatch.admin.imp.LogControlImp ( (AdminLog) this.recoveryLog );
		ConfigProperties configProperties = new ConfigProperties(properties);
		long recoveryDelay = configProperties.getRecoveryDelay();  
        recoveryTimer = new PooledAlarmTimer(recoveryDelay);  
        recoveryTimer.addAlarmTimerListener(new AlarmTimerListener() {
			@Override
			public void alarm(AlarmTimer timer) {
				// perform transaction recovery
				performRecovery();

			}
		});

        TaskManager.SINGLETON.executeTask(recoveryTimer);
        initialized_ = true;
    }
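The same "run recovery every recoveryDelay milliseconds" pattern can be sketched with the JDK's own scheduler. Note that this swaps Atomikos's PooledAlarmTimer for a `ScheduledExecutorService`. The class `PeriodicRecoverySketch` and its counter are hypothetical, and its `performRecovery()` is only a stand-in for the real method.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical periodic recovery loop built on the JDK scheduler. */
class PeriodicRecoverySketch {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger runs = new AtomicInteger();

    /** Stand-in for performRecovery(); here it only counts invocations. */
    void performRecovery() {
        runs.incrementAndGet();
    }

    /** Schedules recovery to run repeatedly, recoveryDelayMillis apart. */
    void start(long recoveryDelayMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                performRecovery();
            } catch (RuntimeException e) {
                // swallow the error so that later runs are not cancelled
            }
        }, recoveryDelayMillis, recoveryDelayMillis, TimeUnit.MILLISECONDS);
    }

    int completedRuns() {
        return runs.get();
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

Catching the exception inside the task matters with this scheduler: an exception that escapes a scheduled task suppresses all of its subsequent executions.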

Execution eventually reaches the com.atomikos.datasource.xa.XATransactionalResource.recover() method.

   public void recover() {
    	XaResourceRecoveryManager xaResourceRecoveryManager = XaResourceRecoveryManager.getInstance();
    	if (xaResourceRecoveryManager != null) { //null for LogCloud recovery
    		try {
				xaResourceRecoveryManager.recover(getXAResource());
			} catch (Exception e) {
				refreshXAResource(); //cf case 156968
			}

    	}
    }

Scenario 2: the service node crashed, and recovery is performed while it restarts. This is implemented in the com.atomikos.datasource.xa.XATransactionalResource.setRecoveryService() method.

 @Override
	public void setRecoveryService ( RecoveryService recoveryService )
            throws ResourceException
    {

        if ( recoveryService != null ) {
            if ( LOGGER.isTraceEnabled() ) LOGGER.logTrace ( "Installing recovery service on resource "
                    + getName () );
            this.branchIdentifier=recoveryService.getName();
            // perform transaction recovery
            recover();
        }

    }

The com.atomikos.datasource.xa.XATransactionalResource.recover() flow in detail

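Main code, i.e. the recover(XAResource) method that xaResourceRecoveryManager.recover(getXAResource()) invokes: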

	public void recover(XAResource xaResource) throws XAException {
		// fetch the prepared xids from the RM via the XA recovery protocol
		List<XID> xidsToRecover = retrievePreparedXidsFromXaResource(xaResource);
		Collection<XID> xidsToCommit;
		try {
			// load the committing xids from the transaction log to match against
			xidsToCommit = retrieveExpiredCommittingXidsFromLog();
			for (XID xid : xidsToRecover) {
				if (xidsToCommit.contains(xid)) {
					// issue XA COMMIT for this xid
					replayCommit(xid, xaResource);
				} else {
					attemptPresumedAbort(xid, xaResource);
				}
			}
		} catch (LogException couldNotRetrieveCommittingXids) {
			LOGGER.logWarning("Transient error while recovering - will retry later...", couldNotRetrieveCommittingXids);
		}
	}

Let's look at how the xids stored on the RM side are obtained via the XA recovery protocol. Step into retrievePreparedXidsFromXaResource(xaResource), which eventually calls com.atomikos.datasource.xa.RecoveryScan.recoverXids().

public static List<XID> recoverXids(XAResource xaResource, XidSelector selector) throws XAException {
		List<XID> ret = new ArrayList<XID>();

        boolean done = false;
        int flags = XAResource.TMSTARTRSCAN;
        Xid[] xidsFromLastScan = null;
        List<XID> allRecoveredXidsSoFar = new ArrayList<XID>();
        do {
        	xidsFromLastScan = xaResource.recover(flags);
            flags = XAResource.TMNOFLAGS;
            done = (xidsFromLastScan == null || xidsFromLastScan.length == 0);
            if (!done) {

                // TEMPTATIVELY SET done TO TRUE
                // TO TOLERATE ORACLE 8.1.7 INFINITE
                // LOOP (ALWAYS RETURNS SAME RECOVER
                // SET). IF A NEW SET OF XIDS IS RETURNED
                // THEN done WILL BE RESET TO FALSE

                done = true;
                for ( int i = 0; i < xidsFromLastScan.length; i++ ) {
                	XID xid = new XID ( xidsFromLastScan[i] );
                    // our own XID implements equals and hashCode properly
                    if (!allRecoveredXidsSoFar.contains(xid)) {
                        // a new xid is returned -> we can not be in a recovery loop -> go on
                        allRecoveredXidsSoFar.add(xid);
                        done = false;
                        if (selector.selects(xid)) {
                        	ret.add(xid);
                        }
                    }
                }
            }
        } while (!done);

		return ret;
	}
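The loop above has one subtle point: it stops as soon as a scan returns no xid it has not seen before, which protects against a resource that keeps returning the same set. The sketch below isolates that guard. To keep it self-contained it uses plain strings instead of XID objects and a hypothetical `IntFunction` in place of xaResource.recover(flags). The constants `START_SCAN` and `NO_FLAGS` are stand-ins for XAResource.TMSTARTRSCAN and XAResource.TMNOFLAGS.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntFunction;
import java.util.function.Predicate;

/** Hypothetical, simplified version of the recovery scan loop. */
class RecoveryScanSketch {
    static final int START_SCAN = 1; // stand-in for XAResource.TMSTARTRSCAN
    static final int NO_FLAGS = 0;   // stand-in for XAResource.TMNOFLAGS

    /**
     * Calls the resource repeatedly until a batch contains no unseen xid,
     * and returns the xids accepted by the selector in discovery order.
     */
    static List<String> scan(IntFunction<String[]> resource, Predicate<String> selector) {
        List<String> selected = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        int flags = START_SCAN;
        boolean done;
        do {
            String[] batch = resource.apply(flags);
            flags = NO_FLAGS;
            done = true; // assume finished unless this batch brings something new
            if (batch != null) {
                for (String xid : batch) {
                    if (seen.add(xid)) { // add() returns true only for unseen xids
                        done = false;
                        if (selector.test(xid)) {
                            selected.add(xid);
                        }
                    }
                }
            }
        } while (!done);
        return selected;
    }
}
```

A resource that returns the same three xids on every call is therefore queried exactly twice: the first batch is all new, the second brings nothing new, and the loop ends.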

The call to focus on is xidsFromLastScan = xaResource.recover(flags). With MySQL, this enters MysqlXAConnection.recover(), which executes the XA RECOVER statement to obtain the xids.

 protected static Xid[] recover(Connection c, int flag) throws XAException {
        /*
         * The XA RECOVER statement returns information for those XA transactions on the MySQL server that are in the PREPARED state. (See Section 13.4.7.2, "XA
         * Transaction States".) The output includes a row for each such XA transaction on the server, regardless of which client started it.
         *
         * XA RECOVER output rows look like this (for an example xid value consisting of the parts 'abc', 'def', and 7):
         *
         * mysql> XA RECOVER;
         * +----------+--------------+--------------+--------+
         * | formatID | gtrid_length | bqual_length | data |
         * +----------+--------------+--------------+--------+
         * | 7 | 3 | 3 | abcdef |
         * +----------+--------------+--------------+--------+
         *
         * The output columns have the following meanings:
         *
         * formatID is the formatID part of the transaction xid
         * gtrid_length is the length in bytes of the gtrid part of the xid
         * bqual_length is the length in bytes of the bqual part of the xid
         * data is the concatenation of the gtrid and bqual parts of the xid
         */

        boolean startRscan = ((flag & TMSTARTRSCAN) > 0);
        boolean endRscan = ((flag & TMENDRSCAN) > 0);

        if (!startRscan && !endRscan && flag != TMNOFLAGS) {
            throw new MysqlXAException(XAException.XAER_INVAL, Messages.getString("MysqlXAConnection.001"), null);
        }

        //
        // We return all recovered XIDs at once, so if not  TMSTARTRSCAN, return no new XIDs
        //
        // We don't attempt to maintain state to check for TMNOFLAGS "outside" of a scan
        //

        if (!startRscan) {
            return new Xid[0];
        }

        ResultSet rs = null;
        Statement stmt = null;

        List<MysqlXid> recoveredXidList = new ArrayList<MysqlXid>();

        try {
            // TODO: Cache this for lifetime of XAConnection
            stmt = c.createStatement();

            rs = stmt.executeQuery("XA RECOVER");

            while (rs.next()) {
                final int formatId = rs.getInt(1);
                int gtridLength = rs.getInt(2);
                int bqualLength = rs.getInt(3);
                byte[] gtridAndBqual = rs.getBytes(4);

                final byte[] gtrid = new byte[gtridLength];
                final byte[] bqual = new byte[bqualLength];

                if (gtridAndBqual.length != (gtridLength + bqualLength)) {
                    throw new MysqlXAException(XAException.XA_RBPROTO, Messages.getString("MysqlXAConnection.002"), null);
                }

                System.arraycopy(gtridAndBqual, 0, gtrid, 0, gtridLength);
                System.arraycopy(gtridAndBqual, gtridLength, bqual, 0, bqualLength);

                recoveredXidList.add(new MysqlXid(gtrid, bqual, formatId));
            }
        } catch (SQLException sqlEx) {
            throw mapXAExceptionFromSQLException(sqlEx);
        } finally {
            if (rs != null) {
                try {
                    rs.close();
                } catch (SQLException sqlEx) {
                    throw mapXAExceptionFromSQLException(sqlEx);
                }
            }

            if (stmt != null) {
                try {
                    stmt.close();
                } catch (SQLException sqlEx) {
                    throw mapXAExceptionFromSQLException(sqlEx);
                }
            }
        }

        int numXids = recoveredXidList.size();

        Xid[] asXids = new Xid[numXids];
        Object[] asObjects = recoveredXidList.toArray();

        for (int i = 0; i < numXids; i++) {
            asXids[i] = (Xid) asObjects[i];
        }

        return asXids;
    }
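The core of the result-set handling is splitting the `data` column into its gtrid and bqual parts using the two length columns. That step can be shown on its own, using the 'abc' / 'def' example from the driver comment. The class `XidSplitSketch` and its `split` method are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper that splits XA RECOVER's data column into gtrid and bqual. */
class XidSplitSketch {

    /** Returns a two-element array: index 0 is the gtrid, index 1 is the bqual. */
    static byte[][] split(byte[] data, int gtridLength, int bqualLength) {
        if (gtridLength < 0 || bqualLength < 0
                || data.length != gtridLength + bqualLength) {
            throw new IllegalArgumentException(
                    "data length does not match gtrid_length + bqual_length");
        }
        byte[] gtrid = Arrays.copyOfRange(data, 0, gtridLength);
        byte[] bqual = Arrays.copyOfRange(data, gtridLength, gtridLength + bqualLength);
        return new byte[][] {gtrid, bqual};
    }

    public static void main(String[] args) {
        byte[][] parts = split("abcdef".getBytes(StandardCharsets.US_ASCII), 3, 3);
        System.out.println(new String(parts[0], StandardCharsets.US_ASCII)
                + " / " + new String(parts[1], StandardCharsets.US_ASCII));
    }
}
```

The length check mirrors the driver's own consistency check, which rejects a row whose data length differs from the sum of the two length columns.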

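Note that with MySQL versions earlier than 5.7.7, no data is returned here once the client that prepared the transactions has disconnected. Later versions fixed this, so MySQL must be version 5.7.7 or later to serve as an RM. The reason is as follows: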

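MySQL 5.6 automatically rolls back prepared transactions when the client disconnects. Why? It comes down to MySQL's internal implementation. Before MySQL 5.7, prepared transactions were not written to the binlog (officially, to reduce fsync calls as an optimization). The preceding operations were written to the binlog only when the distributed transaction committed, so from the binlog's point of view a distributed transaction was no different from an ordinary one. Everything done before prepare was held in the connection's IO_CACHE. If the client disconnected at that point, that binlog information was lost, and allowing a commit after reconnecting would leave the binlog incomplete and cause the primary and its replicas to diverge. For this reason MySQL simply rolled back all prepared transactions when the client disconnected.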

Back on the main path, the next step is to obtain the XIDs from Atomikos's own transaction log:

  Collection<XID> xidsToCommit = retrieveExpiredCommittingXidsFromLog();

Let's look at retrieveExpiredCommittingXidsFromLog(), the method that reads the XIDs from the transaction log. It leads into com.atomikos.recovery.imp.RecoveryLogImp.getCommittingParticipants().

public Collection<ParticipantLogEntry> getCommittingParticipants()
			throws LogReadException {
		Collection<ParticipantLogEntry> committingParticipants = new HashSet<ParticipantLogEntry>();
		Collection<CoordinatorLogEntry> committingCoordinatorLogEntries = repository.findAllCommittingCoordinatorLogEntries();

		for (CoordinatorLogEntry coordinatorLogEntry : committingCoordinatorLogEntries) {
			for (ParticipantLogEntry participantLogEntry : coordinatorLogEntry.participants) {
				committingParticipants.add(participantLogEntry);
			}
		}
		return committingParticipants;
	}

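At this point, a brief introduction to how the transaction log is structured. First comes CoordinatorLogEntry, the entity class that holds all the information for one XA transaction.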

public class CoordinatorLogEntry implements Serializable {

	// global transaction id
 	public final String id;

	// whether the transaction was committed
	public final boolean wasCommitted;

	/**
	 * Only for subtransactions, null otherwise.
	 */
	public final String superiorCoordinatorId;

	// the participants of this transaction
	public final ParticipantLogEntry[] participants;
}

Next, the participant entity class, ParticipantLogEntry:

public class ParticipantLogEntry implements Serializable {

	private static final long serialVersionUID = 1728296701394899871L;

	/**
	 * The ID of the global transaction as known by the transaction core.
	 */

	public final String coordinatorId;

	/**
	 * Identifies the participant within the global transaction.
	 */

	public final String uri;

	/**
	 * When does this participant expire (expressed in millis since Jan 1, 1970)?
	 */

	public final long expires;

	/**
	 * Best-known state of the participant.
	 */
	public final TxState state;

	/**
	 * For diagnostic purposes, null if not relevant.
	 */
	public final String resourceName;
}

Back in com.atomikos.recovery.xa.DefaultXaRecoveryLog.getExpiredCommittingXids(), we can see that it collects the xids that were recorded in the transaction log during XA transactions.

public Set<XID> getExpiredCommittingXids() throws LogReadException {
		Set<XID> ret = new HashSet<XID>();
		Collection<ParticipantLogEntry> entries = log.getCommittingParticipants();
		for (ParticipantLogEntry entry : entries) {
			if (expired(entry) && !http(entry)) {
				XID xid = new XID(entry.coordinatorId, entry.uri);
				ret.add(xid);
			}
		}
		return ret;
	}
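The bodies of the expired(entry) and http(entry) helpers are not shown in this article. As a rough sketch, and assuming that "expired" simply means the entry's `expires` timestamp lies in the past, the filtering step could look like the following. The `Entry` class, the `coordinatorId:uri` string form of an xid, and the explicit `nowMillis` parameter are all hypothetical simplifications.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of selecting only expired committing participants. */
class ExpiredFilterSketch {

    /** Minimal stand-in for ParticipantLogEntry with just the fields used here. */
    static final class Entry {
        final String coordinatorId;
        final String uri;
        final long expires; // millis since Jan 1, 1970

        Entry(String coordinatorId, String uri, long expires) {
            this.coordinatorId = coordinatorId;
            this.uri = uri;
            this.expires = expires;
        }
    }

    /** ASSUMPTION: an entry counts as expired when expires is before nowMillis. */
    static Set<String> expiredXids(List<Entry> entries, long nowMillis) {
        Set<String> ret = new LinkedHashSet<>();
        for (Entry e : entries) {
            if (e.expires < nowMillis) {
                ret.add(e.coordinatorId + ":" + e.uri);
            }
        }
        return ret;
    }
}
```

Restricting recovery to expired entries presumably keeps the recovery pass from interfering with commits that are still in flight on the normal path.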

If an XID obtained from the RM via XA RECOVER is also among the XIDs taken from the transaction log, it is committed. Otherwise it is rolled back.

List<XID> xidsToRecover = retrievePreparedXidsFromXaResource(xaResource);
		Collection<XID> xidsToCommit;
		try {
			xidsToCommit = retrieveExpiredCommittingXidsFromLog();
			for (XID xid : xidsToRecover) {
				if (xidsToCommit.contains(xid)) {
					replayCommit(xid, xaResource);
				} else {
					attemptPresumedAbort(xid, xaResource);
				}
			}
		} catch (LogException couldNotRetrieveCommittingXids) {
			LOGGER.logWarning("Transient error while recovering - will retry later...", couldNotRetrieveCommittingXids);
		}
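Stripped of logging and error handling, the rule is a set-membership test for each prepared xid. The sketch below expresses it as a pure function over strings. `RecoveryDecisionSketch`, the `Action` enum, and `decide` are hypothetical names, not Atomikos types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the commit-or-presumed-abort decision. */
class RecoveryDecisionSketch {
    enum Action { COMMIT, ROLLBACK }

    /**
     * For every xid the resource reports as prepared, choose COMMIT if the
     * transaction log lists it as committing, otherwise ROLLBACK.
     */
    static Map<String, Action> decide(List<String> preparedInResource,
                                      Set<String> committingInLog) {
        Map<String, Action> plan = new LinkedHashMap<>();
        for (String xid : preparedInResource) {
            plan.put(xid, committingInLog.contains(xid) ? Action.COMMIT : Action.ROLLBACK);
        }
        return plan;
    }
}
```

Only xids reported by the resource appear in the plan. An xid that is in the log but is not reported by this resource is simply not acted on by this loop.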

The replayCommit method is as follows:

private void replayCommit(XID xid, XAResource xaResource) {
		if (LOGGER.isDebugEnabled()) LOGGER.logDebug("Replaying commit of xid: " + xid);
		try {
			// commit the transaction
			xaResource.commit(xid, false);
			// update the transaction log
			log.terminated(xid);
		} catch (XAException e) {
			if (alreadyHeuristicallyTerminatedByResource(e)) {
				handleHeuristicTerminationByResource(xid, xaResource, e, true);
			} else if (xidTerminatedInResourceByConcurrentCommit(e)) {
				log.terminated(xid);
			} else {
				LOGGER.logWarning("Transient error while replaying commit - will retry later...", e);
			}
		}
	}

The attemptPresumedAbort(xid, xaResource) method is as follows:

private void attemptPresumedAbort(XID xid, XAResource xaResource) {
		try {
			log.presumedAborting(xid);
			if (LOGGER.isDebugEnabled()) LOGGER.logDebug("Presumed abort of xid: " + xid);
			try {
				// roll back the transaction
				xaResource.rollback(xid);
				// update the state in the log
				log.terminated(xid);
			} catch (XAException e) {
				if (alreadyHeuristicallyTerminatedByResource(e)) {
					handleHeuristicTerminationByResource(xid, xaResource, e, false);
				} else if (xidTerminatedInResourceByConcurrentRollback(e)) {
					log.terminated(xid);
				} else {
					LOGGER.logWarning("Unexpected exception during recovery - ignoring to retry later...", e);
				}
			}
		} catch (IllegalStateException presumedAbortNotAllowedInCurrentLogState) {
			// ignore to retry later if necessary
		} catch (LogException logWriteException) {
			LOGGER.logWarning("log write failed for Xid: "+xid+", ignoring to retry later", logWriteException);
		}
	}
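Both methods sort an XAException into three cases: heuristically terminated by the resource, already terminated by a concurrent commit or rollback, or a transient error to retry later. The helper methods that make this distinction are not shown in the article. The sketch below is one plausible way to classify by the standard `errorCode` values of javax.transaction.xa.XAException. The mapping, the class `XaErrorClassifierSketch`, and the `Outcome` enum are assumptions and should not be read as Atomikos's actual logic.

```java
import javax.transaction.xa.XAException;

/** Hypothetical classification of XAException error codes during recovery. */
class XaErrorClassifierSketch {
    enum Outcome { HEURISTIC, ALREADY_TERMINATED, RETRY_LATER }

    static Outcome classify(XAException e) {
        switch (e.errorCode) {
            case XAException.XA_HEURCOM:
            case XAException.XA_HEURRB:
            case XAException.XA_HEURMIX:
            case XAException.XA_HEURHAZ:
                // the resource already decided the branch outcome on its own
                return Outcome.HEURISTIC;
            case XAException.XAER_NOTA:
                // ASSUMPTION: an unknown xid means the branch is already finished
                return Outcome.ALREADY_TERMINATED;
            default:
                // anything else is treated as transient and left for the next pass
                return Outcome.RETRY_LATER;
        }
    }
}
```

Treating unknown codes as retryable matches the approach in the two methods above, which log a warning and rely on the next recovery run rather than failing hard.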

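This article has already grown quite long. We have covered the SPI-based solution that ShardingSphere provides for XA and its integration with Atomikos, and we have walked through Atomikos's initialization, transaction begin, connection acquisition, commit, rollback, and recovery flows. I hope it helps you understand how XA works.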

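About the author: Xiao Yu (肖宇) is an Apache ShardingSphere committer, the author of the open source distributed transaction framework hmily and of the open source gateway soul, an open source enthusiast who strives to write elegant code. Xiao Yu currently works at JD Digits (京東數科), contributing to ShardingSphere's open source development and to distributed database R&D.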
