Druid's eviction strategy for stale connections (source code analysis): the MySQL 8-hour connection timeout problem

Background (the Communications link failure exception)

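After adding database monitoring recently, I found that roughly one query in several hundred thousand was failing. The logs showed the following exception: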
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
Monitoring showed the following:
[Figure: monitoring chart of the failed queries]
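The connection pool in use was DBCP 1.4. Colleagues pointed out that MySQL closes a connection that has been idle for 8 hours straight.
Running show variables like '%timeout%'; in the mysql client confirmed this (MySQL's wait_timeout defaults to 28800 seconds, i.e. 8 hours):
[Figure: output of show variables like '%timeout%']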
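Following the common advice, I replaced DBCP with the Druid connection pool (1.1.2), and the problem went away. Changing the MySQL setting would apparently also work, but the change process for the group's elastic database is cumbersome, so I did not pursue that route.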

Key Druid connection pool settings

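initialSize: number of connections created when the pool is initialized
maxActive: maximum number of connections in the pool
minIdle: minimum number of connections kept in the pool
validationQuery: the SQL used to check whether a connection is still valid; it must be a query statement. If validationQuery is null, testOnBorrow, testOnReturn and testWhileIdle have no effect.
testOnBorrow: false means validationQuery is not run when a connection is borrowed
testOnReturn: false means validationQuery is not run when a connection is returned
testWhileIdle: true means the check happens when a connection is borrowed: if the connection has been idle for at least timeBetweenEvictionRunsMillis, validationQuery is run
timeBetweenEvictionRunsMillis: has two meanings:
1. The threshold used by testWhileIdle (see the testWhileIdle entry); this corresponds to the first approach below
2. The interval at which the Destroy thread checks connections; this corresponds to the second approach below
minEvictableIdleTimeMillis: when the Destroy worker runs, it checks whether a connection's idle time has reached minEvictableIdleTimeMillis; if so, and the number of idle connections in the pool exceeds minIdle, the connection is evicted
maxEvictableIdleTimeMillis: when the Destroy worker runs, it checks whether a connection's idle time exceeds maxEvictableIdleTimeMillis; if so, the connection is evicted regardless of minIdle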
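
To make the settings above concrete, here is a minimal sketch that collects them in a java.util.Properties object. The keys simply mirror the setting names listed above; the values are illustrative assumptions, not recommendations from the original post, and how the properties are handed to Druid (setters, Spring XML, a properties file) is deliberately left out. The class name DruidPoolSettingsSketch and the 8-hour constant are hypothetical helpers for this example only.

```java
import java.util.Properties;

// Hypothetical helper: gathers the settings discussed above in one place.
final class DruidPoolSettingsSketch {
    // Assumed server-side wait_timeout of 8 hours, expressed in milliseconds.
    static final long MYSQL_WAIT_TIMEOUT_MILLIS = 8L * 60 * 60 * 1000;

    static Properties exampleSettings() {
        Properties p = new Properties();
        p.setProperty("initialSize", "5");
        p.setProperty("maxActive", "20");
        p.setProperty("minIdle", "5");
        p.setProperty("validationQuery", "SELECT 1");
        p.setProperty("testOnBorrow", "false");
        p.setProperty("testOnReturn", "false");
        p.setProperty("testWhileIdle", "true");
        // Used both as the testWhileIdle threshold and as the Destroy task interval.
        p.setProperty("timeBetweenEvictionRunsMillis", "60000");     // 1 minute
        p.setProperty("minEvictableIdleTimeMillis", "1800000");      // 30 minutes
        p.setProperty("maxEvictableIdleTimeMillis", "25200000");     // 7 hours
        return p;
    }

    // Reads a millisecond setting back as a long.
    static long millis(Properties p, String key) {
        return Long.parseLong(p.getProperty(key));
    }
}
```

With these illustrative values, the testWhileIdle threshold (1 minute) and the hard idle limit (7 hours) both stay well below the assumed 8-hour server timeout.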

Source code analysis: how Druid evicts timed-out connections

Approach 1: validate when a connection is borrowed

    public DruidPooledConnection getConnection() throws SQLException { // entry point for borrowing a connection
        return this.getConnection(this.maxWait); 
    }

    public DruidPooledConnection getConnection(long maxWaitMillis) throws SQLException {
        this.init();
        if(this.filters.size() > 0) {
            FilterChainImpl filterChain = new FilterChainImpl(this);
            return filterChain.dataSource_connect(this, maxWaitMillis);
        } else {
            return this.getConnectionDirect(maxWaitMillis);
        }
    }
    public DruidPooledConnection getConnectionDirect(long maxWaitMillis) throws SQLException {
        int notFullTimeoutRetryCnt = 0;

        DruidPooledConnection poolableConnection;
        while(true) { // note this loop: it exits only when a usable connection is obtained (break) or an exception is thrown
            while(true) {
                try {
                    poolableConnection = this.getConnectionInternal(maxWaitMillis); // take a connection from the pool
                    break;
                } catch (GetConnectionTimeoutException var23) {
                    if(notFullTimeoutRetryCnt > this.notFullTimeoutRetryCount || this.isFull()) {
                        throw var23;
                    }

                    ++notFullTimeoutRetryCnt;
                    if(LOG.isWarnEnabled()) {
                        LOG.warn("get connection timeout retry : " + notFullTimeoutRetryCnt);
                    }
                }
            }

            if(this.testOnBorrow) { 
                // with testOnBorrow=true every borrow validates the connection, which is inefficient
                boolean validate = this.testConnectionInternal(poolableConnection.holder, poolableConnection.conn);
                if(validate) {
                    break;
                }

                if(LOG.isDebugEnabled()) {
                    LOG.debug("skip not validate connection.");
                }
                // validation failed: discard the connection
                this.discardConnection(poolableConnection.holder);
            } else if(poolableConnection.conn.isClosed()) {
                // the connection is already closed: discard it
                this.discardConnection(poolableConnection.holder);
            } else {
                // with testWhileIdle=false, break: the checks below are skipped and the connection is returned as is
                if(!this.testWhileIdle) {
                    break;
                }

                DruidConnectionHolder holder = poolableConnection.holder;
                long currentTimeMillis = System.currentTimeMillis();
                long lastActiveTimeMillis = holder.lastActiveTimeMillis;
                long lastExecTimeMillis = holder.lastExecTimeMillis;
                long lastKeepTimeMillis = holder.lastKeepTimeMillis;
                if(this.checkExecuteTime && lastExecTimeMillis != lastActiveTimeMillis) {
                    lastActiveTimeMillis = lastExecTimeMillis;
                }

                if(lastKeepTimeMillis > lastActiveTimeMillis) {
                    lastActiveTimeMillis = lastKeepTimeMillis;
                }

                long idleMillis = currentTimeMillis - lastActiveTimeMillis;
                long timeBetweenEvictionRunsMillis = this.timeBetweenEvictionRunsMillis;
                if(timeBetweenEvictionRunsMillis <= 0L) {
                    // falls back to 60000 ms, i.e. 1 minute
                    timeBetweenEvictionRunsMillis = 60000L;
                }
				
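                /*
                 * Validation is skipped when both conditions hold:
                 *   1. idle time < timeBetweenEvictionRunsMillis
                 *   2. idle time >= 0
                 * So when relying on testWhileIdle to validate connections,
                 * timeBetweenEvictionRunsMillis must not exceed 8 hours;
                 * otherwise a stale connection can still be handed out.
                 */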
                if(idleMillis < timeBetweenEvictionRunsMillis && idleMillis >= 0L) {
                    break;
                }
				
                // run the validation (i.e. the configured validationQuery, such as 'select 1 from dual')
                boolean validate = this.testConnectionInternal(poolableConnection.holder, poolableConnection.conn);
                if(validate) {
                    break;
                }

                if(LOG.isDebugEnabled()) {
                    LOG.debug("skip not validate connection.");
                }

                this.discardConnection(poolableConnection.holder);
            }
        }

        if(this.removeAbandoned) {
            StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();
            poolableConnection.connectStackTrace = stackTrace;
            poolableConnection.setConnectedTimeNano();
            poolableConnection.traceEnable = true;
            this.activeConnectionLock.lock();

            try {
                this.activeConnections.put(poolableConnection, PRESENT);
            } finally {
                this.activeConnectionLock.unlock();
            }
        }

        if(!this.defaultAutoCommit) {
            poolableConnection.setAutoCommit(false);
        }

        return poolableConnection;
    }
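
The branching in getConnectionDirect can be condensed into a single decision: should the borrowed connection be validated before it is returned? The following is a simplified sketch of that decision only, not Druid's code. It assumes the isClosed() branch, the checkExecuteTime and keepAlive timestamp adjustments, and the retry loop are left out; BorrowCheckSketch and needsValidation are hypothetical names.

```java
// Hypothetical, simplified model of the validation decision made on borrow.
final class BorrowCheckSketch {
    // Same fallback the listing above uses when the interval is not positive.
    static final long DEFAULT_INTERVAL_MILLIS = 60_000L;

    /** Returns true when validationQuery would be run before handing out the connection. */
    static boolean needsValidation(boolean testOnBorrow,
                                   boolean testWhileIdle,
                                   long idleMillis,
                                   long timeBetweenEvictionRunsMillis) {
        if (testOnBorrow) {
            return true;                  // every borrow is validated
        }
        if (!testWhileIdle) {
            return false;                 // no idle check configured
        }
        long interval = timeBetweenEvictionRunsMillis <= 0L
                ? DEFAULT_INTERVAL_MILLIS
                : timeBetweenEvictionRunsMillis;
        // Validation is skipped only when 0 <= idleMillis < interval.
        boolean recentlyUsed = idleMillis >= 0L && idleMillis < interval;
        return !recentlyUsed;
    }
}
```

For example, with testWhileIdle=true and a 60000 ms interval, a connection idle for 30 seconds is returned without a check, while one idle for exactly 60 seconds is validated first. This is why the interval has to stay below the server's idle timeout for the check to catch connections MySQL has already closed.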

Approach 2: the Destroy scheduled task checks for connections to evict

	// init() is the method that initializes the connection pool
	public void init() throws SQLException {
		...
		this.createAndLogThread();
	    this.createAndStartCreatorThread();
	    // call the method that creates the destroy thread
	    this.createAndStartDestroyThread();
	    this.initedLatch.await();
		...
	}
	// create the destroy thread
	protected void createAndStartDestroyThread() {
        this.destroyTask = new DruidDataSource.DestroyTask();
        if(this.destroyScheduler != null) {
            long period = this.timeBetweenEvictionRunsMillis;
            if(period <= 0L) {
                period = 1000L;
            }
            // schedule the destroy task to run once every period
            this.destroySchedulerFuture = this.destroyScheduler.scheduleAtFixedRate(this.destroyTask, period, period, TimeUnit.MILLISECONDS);
            this.initedLatch.countDown();
        } else {
            String threadName = "Druid-ConnectionPool-Destroy-" + System.identityHashCode(this);
            this.destroyConnectionThread = new DruidDataSource.DestroyConnectionThread(threadName);
            this.destroyConnectionThread.start();
        }
    }
    // the destroy task
    public class DestroyTask implements Runnable {
       public DestroyTask() {
       }

       public void run() {
           // checkTime is passed as true
           DruidDataSource.this.shrink(true, DruidDataSource.this.keepAlive);
           if(DruidDataSource.this.isRemoveAbandoned()) {
               DruidDataSource.this.removeAbandoned();
           }

       }
   }
   // the actual eviction logic
   public void shrink(boolean checkTime, boolean keepAlive) {
        try {
            this.lock.lockInterruptibly();
        } catch (InterruptedException var49) {
            return;
        }

        boolean needFill = false;
        int evictCount = 0;
        int keepAliveCount = 0;
        int fatalErrorIncrement = this.fatalErrorCount - this.fatalErrorCountLastShrink;
        this.fatalErrorCountLastShrink = this.fatalErrorCount;

        int checkCount;
        label956: {
            try {
                if(this.inited) {
                    // number that may be evicted = idle connections currently in the pool - configured minIdle
                    checkCount = this.poolingCount - this.minIdle;
                    long currentTimeMillis = System.currentTimeMillis();

                    int i;
                    for(i = 0; i < this.poolingCount; ++i) {
                        DruidConnectionHolder connection = this.connections[i];
                        if((this.onFatalError || fatalErrorIncrement > 0) && this.lastFatalErrorTimeMillis > connection.connectTimeMillis) {
                            this.keepAliveConnections[keepAliveCount++] = connection;
                        } else if(checkTime) {
                            long idleMillis;
                            if(this.phyTimeoutMillis > 0L) {
                                idleMillis = currentTimeMillis - connection.connectTimeMillis;
                                if(idleMillis > this.phyTimeoutMillis) {
                                    this.evictConnections[evictCount++] = connection;
                                    continue;
                                }
                            }
                            // idle time of the connection handled in this iteration = current time - last active time
                            idleMillis = currentTimeMillis - connection.lastActiveTimeMillis;
                            if(idleMillis < this.minEvictableIdleTimeMillis && idleMillis < this.keepAliveBetweenTimeMillis) {
                                break;
                            }
                            // idle time >= the configured minimum evictable idle time (minEvictableIdleTimeMillis)
                            if(idleMillis >= this.minEvictableIdleTimeMillis) {
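                                /*
                                 * checkTime is the method argument (true here).
                                 * i is the current loop index. Connections are borrowed from the highest
                                 * index of the array and new connections are also placed at the highest
                                 * index, so index 0 is always the least recently used connection.
                                 * checkCount = poolingCount - minIdle is the number that may be evicted.
                                 * Key point: minEvictableIdleTimeMillis only evicts the idle connections
                                 * in excess of minIdle.
                                 */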
                                if(checkTime && i < checkCount) {
                                    this.evictConnections[evictCount++] = connection;
                                    continue;
                                }
								
                                // idle time > the configured maximum idle time (maxEvictableIdleTimeMillis)
                                // key point: maxEvictableIdleTimeMillis ignores the configured minIdle
                                if(idleMillis > this.maxEvictableIdleTimeMillis) {
                                    this.evictConnections[evictCount++] = connection;
                                    continue;
                                }
                            }

                            if(keepAlive && idleMillis >= this.keepAliveBetweenTimeMillis) {
                                this.keepAliveConnections[keepAliveCount++] = connection;
                            }
                        } else {
                            if(i >= checkCount) {
                                break;
                            }

                            this.evictConnections[evictCount++] = connection;
                        }
                    }

                    i = evictCount + keepAliveCount;
                    if(i > 0) { // shift the remaining connections to the front of the pool array
                        System.arraycopy(this.connections, i, this.connections, 0, this.poolingCount - i);
                        Arrays.fill(this.connections, this.poolingCount - i, this.poolingCount, (Object)null);
                        this.poolingCount -= i;
                    }

                    this.keepAliveCheckCount += keepAliveCount;
                    if(keepAlive && this.poolingCount + this.activeCount < this.minIdle) {
                        needFill = true;
                    }
                    break label956;
                }
            } finally {
                this.lock.unlock();
            }

            return;
        }

        Connection connection;
        DruidConnectionHolder holer;
        if(evictCount > 0) { // physically close the connections collected above
            for(checkCount = 0; checkCount < evictCount; ++checkCount) {
                holer = this.evictConnections[checkCount];
                connection = holer.getConnection();
                JdbcUtils.close(connection);
                destroyCountUpdater.incrementAndGet(this);
            }

            Arrays.fill(this.evictConnections, (Object)null);
        }
        ...
    }
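
To see how minIdle, minEvictableIdleTimeMillis and maxEvictableIdleTimeMillis interact, the selection loop inside shrink can be modelled on a plain array of idle times. This is a simplified sketch under stated assumptions, not Druid's implementation: it ignores the keepAlive, phyTimeoutMillis and fatal-error branches, takes checkTime as true, and assumes the array is ordered with the longest-idle connection at index 0, as the listing's comments describe. ShrinkSketch and selectEvictions are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of which pooled connections a shrink pass evicts.
final class ShrinkSketch {
    /**
     * @param idleMillis idle time of each pooled connection, longest-idle first
     * @return indexes of the connections that would be evicted
     */
    static List<Integer> selectEvictions(long[] idleMillis,
                                         int minIdle,
                                         long minEvictableIdleTimeMillis,
                                         long maxEvictableIdleTimeMillis) {
        List<Integer> evicted = new ArrayList<>();
        // Only this many connections may be evicted by the minEvictable rule.
        int checkCount = idleMillis.length - minIdle;
        for (int i = 0; i < idleMillis.length; i++) {
            long idle = idleMillis[i];
            if (idle < minEvictableIdleTimeMillis) {
                break; // every later entry has been idle for even less time
            }
            if (i < checkCount) {
                evicted.add(i);           // surplus beyond minIdle
            } else if (idle > maxEvictableIdleTimeMillis) {
                evicted.add(i);           // too old: evicted regardless of minIdle
            }
        }
        return evicted;
    }
}
```

With idle times of 9 h, 40 min, 35 min, 10 min and 1 min, a 30-minute minimum and a 7-hour maximum, minIdle=3 evicts indexes 0 and 1. Raising minIdle to 5 (the whole pool) leaves only index 0 evicted, and only because it exceeds the 7-hour maximum.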

Summary (ways to configure connection eviction)

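1: Set testOnBorrow=true so every connection is validated when it is taken from the pool. This is the least efficient option.
2: Set testWhileIdle=true so a connection is validated when it is taken from the pool only if it has been idle for at least timeBetweenEvictionRunsMillis. This is more efficient, but timeBetweenEvictionRunsMillis must not exceed 8 hours (the time after which MySQL closes an idle connection).
3: Configure timeBetweenEvictionRunsMillis and minEvictableIdleTimeMillis so the scheduled task scans idle connections and evicts those idle for at least minEvictableIdleTimeMillis. Drawback: only the connections in excess of minIdle can be evicted. In particular, if minIdle equals maxActive, this approach does nothing, as if it were not configured at all.
4: Configure timeBetweenEvictionRunsMillis and maxEvictableIdleTimeMillis as a fallback for approach 3, but make sure timeBetweenEvictionRunsMillis + maxEvictableIdleTimeMillis does not exceed 8 hours.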
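
The constraint in approach 4 follows from a worst case: a connection that is just under maxEvictableIdleTimeMillis at one scan is only evicted at the next scan, so it can sit idle for up to maxEvictableIdleTimeMillis + timeBetweenEvictionRunsMillis. The sketch below works that arithmetic through; EvictionBudgetSketch and its methods are hypothetical names, and the 8-hour figure is the assumed MySQL wait_timeout.

```java
// Hypothetical helper that checks the timing constraint from approach 4.
final class EvictionBudgetSketch {
    // Assumed MySQL wait_timeout: 8 hours in milliseconds.
    static final long EIGHT_HOURS_MILLIS = 8L * 60 * 60 * 1000;

    /** Longest time a pooled connection can stay idle before the Destroy task evicts it. */
    static long worstCaseIdleMillis(long timeBetweenEvictionRunsMillis,
                                    long maxEvictableIdleTimeMillis) {
        return timeBetweenEvictionRunsMillis + maxEvictableIdleTimeMillis;
    }

    /** True when the worst case stays within the server's idle timeout. */
    static boolean fitsServerTimeout(long timeBetweenEvictionRunsMillis,
                                     long maxEvictableIdleTimeMillis,
                                     long serverTimeoutMillis) {
        return worstCaseIdleMillis(timeBetweenEvictionRunsMillis, maxEvictableIdleTimeMillis)
                <= serverTimeoutMillis;
    }
}
```

For instance, a 1-minute scan interval with a 7-hour maximum gives a worst case of 25,260,000 ms, which fits inside the 28,800,000 ms budget. Setting the maximum to a full 8 hours with the same interval does not.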

