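Netty Source Code Analysis 7: Tricky Points in NioEventLoop's run Method

This article covers the following topics:

select() and the workaround for the empty-polling bug
How EventLoop reworks selectedKeys
Analysis of wakeup

select() and the workaround for the empty-polling bug

An empty poll is a call to selector.select(timeoutMillis) that returns without having selected anything and without having waited for timeoutMillis. When the number of consecutive empty polls reaches SELECTOR_AUTO_REBUILD_THRESHOLD (512 by default), Netty creates a new selector and re-registers every channel on it together with the events it is interested in.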
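To make the counting rule concrete, here is a minimal sketch on a simulated clock. It is not Netty's code: FakeSelect, needsRebuild and the millisecond clock are hypothetical names introduced only for illustration, and the threshold simply mirrors the default of 512 quoted above.

```java
class EmptyPollDetectorSketch {
    // Mirrors the default threshold quoted in the text (assumption: no override configured).
    static final int REBUILD_THRESHOLD = 512;

    /** Hypothetical stand-in for Selector.select(timeout): returns how long the call blocked, in ms. */
    interface FakeSelect {
        long blockedMillis(long timeoutMillis);
    }

    /** Runs one select loop on a simulated clock and reports whether a rebuild would be triggered. */
    static boolean needsRebuild(FakeSelect select, long deadlineMillis) {
        long now = 0;       // simulated clock, in ms
        int selectCnt = 0;  // consecutive returns without any event
        for (;;) {
            long timeoutMillis = deadlineMillis - now;
            if (timeoutMillis <= 0) {
                return false; // the deadline passed normally
            }
            now += select.blockedMillis(timeoutMillis);
            selectCnt++;
            if (selectCnt >= REBUILD_THRESHOLD) {
                return true;  // too many premature returns in a row
            }
        }
    }

    public static void main(String[] args) {
        // A healthy selector blocks for the whole timeout; a broken one returns at once.
        System.out.println(needsRebuild(timeout -> timeout, 1000));
        System.out.println(needsRebuild(timeout -> 0, 1000));
    }
}
```

The healthy case leaves the loop through the deadline check after a single call, while the spinning case never advances the clock and hits the threshold instead.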

private void select() throws IOException {
    Selector selector = this.selector;
    try {
        int selectCnt = 0;
        long currentTimeNanos = System.nanoTime();
        // delayNanos() returns the number of nanoseconds until the next scheduled task is due;
        // if there is no scheduled task it returns the default of 1000 ms.
        long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
        for (;;) {
            // The EventLoop has to both select I/O events and run tasks, so it must not block
            // forever: once the deadline has passed it leaves select() and runs tasks.
            long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
            // Suppose timeoutMillis starts at 1000 ms: once one or more iterations have used up
            // those 1000 ms, the select loop exits here. (Note A)
            if (timeoutMillis <= 0) {
                if (selectCnt == 0) {
                    selector.selectNow();
                    selectCnt = 1;
                }
                break; // code B
            }

            // Normally the loop ends either because an I/O event was selected, or because
            // select() blocked for timeoutMillis and the next pass leaves through code B.
            // Anything else is an empty poll.
            int selectedKeys = selector.select(timeoutMillis);
            selectCnt ++;

            // Leave the loop on an I/O event, a wakeup, or a pending task.
            if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks()) {
                // Selected something,
                // waken up by user, or
                // the task queue has a pending task.
                break;
            }

            // Workaround for the NIO selector empty-polling bug. Given the deadline check at
            // Note A, a very large selectCnt means that selector.select(timeoutMillis) has
            // stopped blocking, i.e. it is polling empty. To keep the spin from driving the
            // CPU to 100%, the selector is rebuilt once the count gets too high.
            if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                    selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                // The selector returned prematurely many times in a row.
                // Rebuild the selector to work around the problem.
                logger.warn(
                        "Selector.select() returned prematurely {} times in a row; rebuilding selector.",
                        selectCnt);

                // Create a new selector and re-register all channels with their interest ops.
                rebuildSelector();
                selector = this.selector;

                // Select again to populate selectedKeys.
                selector.selectNow();
                selectCnt = 1;
                break;
            }

            currentTimeNanos = System.nanoTime();
        }

        if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
            if (logger.isDebugEnabled()) {
                logger.debug("Selector.select() returned prematurely {} times in a row.", selectCnt - 1);
            }
        }
    } catch (CancelledKeyException e) {
        if (logger.isDebugEnabled()) {
            logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector - JDK bug?", e);
        }
        // Harmless exception - log anyway
    }
}
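The timeoutMillis expression in the listing adds 500000 ns before dividing, which rounds the remaining time to the nearest millisecond instead of truncating it. The sketch below isolates that arithmetic; the class and method names are mine, only the formula comes from the listing.

```java
class SelectTimeoutRounding {
    /** Converts the remaining nanoseconds to milliseconds, rounding to the nearest millisecond. */
    static long timeoutMillis(long deadlineNanos, long nowNanos) {
        return (deadlineNanos - nowNanos + 500000L) / 1000000L;
    }

    public static void main(String[] args) {
        System.out.println(timeoutMillis(1_000_000_000L, 0L)); // 1000
        System.out.println(timeoutMillis(1_499_999L, 0L));     // 1
        System.out.println(timeoutMillis(500_000L, 0L));       // 1
        System.out.println(timeoutMillis(499_999L, 0L));       // 0
    }
}
```

With less than half a millisecond left the result is 0, so the loop leaves through code B instead of calling select(0), which per the Selector Javadoc would block indefinitely.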

Analysis of rebuildSelector

public void rebuildSelector() {
    if (!inEventLoop()) {
        execute(new Runnable() {
            @Override
            public void run() {
                rebuildSelector();
            }
        });
        return;
    }

    final Selector oldSelector = selector;
    final Selector newSelector;

    if (oldSelector == null) {
        return;
    }

    try {
        newSelector = openSelector();
    } catch (Exception e) {
        logger.warn("Failed to create a new Selector.", e);
        return;
    }

    // Register all channels to the new Selector.
    int nChannels = 0;
    for (;;) {
        try {
            for (SelectionKey key: oldSelector.keys()) {
                Object a = key.attachment();
                try {
                    if (key.channel().keyFor(newSelector) != null) {
                        continue;
                    }

                    int interestOps = key.interestOps();
                    key.cancel();
                    key.channel().register(newSelector, interestOps, a);
                    nChannels ++;
                } catch (Exception e) {
                    logger.warn("Failed to re-register a Channel to the new Selector.", e);
                    if (a instanceof AbstractNioChannel) {
                        AbstractNioChannel ch = (AbstractNioChannel) a;
                        ch.unsafe().close(ch.unsafe().voidPromise());
                    } else {
                        @SuppressWarnings("unchecked")
                        NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
                        invokeChannelUnregistered(task, key, e);
                    }
                }
            }
        } catch (ConcurrentModificationException e) {
            // Probably due to concurrent modification of the key set.
            continue;
        }

        break;
    }

    selector = newSelector;

    try {
        // time to close the old selector as everything else is registered to the new one
        oldSelector.close();
    } catch (Throwable t) {
        if (logger.isWarnEnabled()) {
            logger.warn("Failed to close the old Selector.", t);
        }
    }

    logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
}

The overall logic of rebuildSelector is straightforward.

It first opens a new Selector, re-registers each channel on it with its original interestOps and attachment, and then closes the old Selector.
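The same three steps can be tried outside Netty with plain java.nio. The following is a minimal sketch, not Netty's implementation: it uses a Pipe as a convenient selectable channel, snapshots the key set instead of retrying on ConcurrentModificationException, and the names migrate and demo are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

class SelectorMigrationSketch {
    /** Moves every valid registration from oldSelector to a fresh Selector and closes the old one. */
    static Selector migrate(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        // Snapshot the key set so that cancelling keys cannot disturb the iteration.
        List<SelectionKey> keys = new ArrayList<>(oldSelector.keys());
        for (SelectionKey key : keys) {
            if (!key.isValid()) {
                continue;
            }
            SelectableChannel channel = key.channel();
            int interestOps = key.interestOps();
            Object attachment = key.attachment();
            key.cancel();
            channel.register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    /** Registers one pipe end, migrates it, and reports "<keys on new selector>:<old selector open?>". */
    static String demo() {
        try {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            Selector oldSelector = Selector.open();
            pipe.source().register(oldSelector, SelectionKey.OP_READ, "attachment");

            Selector newSelector = migrate(oldSelector);
            String result = newSelector.keys().size() + ":" + oldSelector.isOpen();

            newSelector.close();
            pipe.source().close();
            pipe.sink().close();
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Cancelling the old key before registering with the new selector mirrors the order used in the listing above; the channel itself stays open throughout.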

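How EventLoop reworks selectedKeys

selectedKeys is an instance of the SelectedSelectionKeySet class.

Each time an NIO event is polled, Netty stores the SelectionKey with a single array write, which is O(1). The HashSet that the JDK uses by default is also O(1) on average, but every add has to hash the key and allocate a node, and it slows down when several keys land in the same bucket.
The optimized SelectedSelectionKeySet also helps when the keys are processed: iteration walks a plain array, which is somewhat more efficient than iterating over the JDK's HashSet.

SelectedSelectionKeySet

Whenever an I/O event occurs, the ready key is stored through the add method, which costs only O(1).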

public boolean add(SelectionKey o) {
    if (o == null) {
        return false;
    }

    if (isA) {
        int size = keysASize;
        keysA[size ++] = o;
        keysASize = size;
        if (size == keysA.length) {
            doubleCapacityA();
        }
    } else {
        int size = keysBSize;
        keysB[size ++] = o;
        keysBSize = size;
        if (size == keysB.length) {
            doubleCapacityB();
        }
    }

    return true;
}

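add uses isA to decide which array to write to; the two arrays keysA and keysB are in fact used in turn.

SelectionKey[] flip() {
    if (isA) {
        isA = false;
        // The array is reused, so following the logic of add, the slot at keysASize
        // may still hold a stale entry and must be invalidated.
        keysA[keysASize] = null;
        // Before flipping, reset the write position of the other array to 0.
        keysBSize = 0;
        return keysA;
    } else {
        isA = true;
        keysB[keysBSize] = null;
        keysASize = 0;
        return keysB;
    }
}

Both isA and flip() exist to support the use of two arrays.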
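To see the alternation and the stale-slot handling in action, here is a small stand-alone sketch of the two-array idea. It stores Strings instead of SelectionKeys, and TwoArrayFlipSketch and count are hypothetical names; only the add/flip logic follows the listings above.

```java
import java.util.Arrays;

class TwoArrayFlipSketch {
    private String[] a = new String[4];
    private String[] b = new String[4];
    private int aSize;
    private int bSize;
    private boolean isA = true;

    /** Appends to whichever array is currently active, doubling it when it becomes full. */
    void add(String s) {
        if (isA) {
            a[aSize++] = s;
            if (aSize == a.length) {
                a = Arrays.copyOf(a, a.length * 2);
            }
        } else {
            b[bSize++] = s;
            if (bSize == b.length) {
                b = Arrays.copyOf(b, b.length * 2);
            }
        }
    }

    /** Hands out the filled array, null-terminated, and switches writes to the other one. */
    String[] flip() {
        if (isA) {
            isA = false;
            a[aSize] = null; // cut off stale entries left over from an earlier round
            bSize = 0;
            return a;
        } else {
            isA = true;
            b[bSize] = null;
            aSize = 0;
            return b;
        }
    }

    /** Counts the entries in front of the null terminator. */
    static int count(String[] keys) {
        int n = 0;
        while (keys[n] != null) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        TwoArrayFlipSketch set = new TwoArrayFlipSketch();
        set.add("x");
        set.add("y");
        System.out.println(count(set.flip())); // entries collected in the first round
        set.add("p");
        System.out.println(count(set.flip())); // entries collected in the second round
    }
}
```

Because the array is doubled as soon as it fills up, there is always a free slot at the current size for the null terminator.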

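flip was originally designed this way with high concurrency and consistency in mind. With only one array holding the SelectionKeys, the array would keep growing under heavy load; even if the array had no concurrency problem, the thread would keep handling I/O events and I/O tasks would never get their turn. In addition, modifying the array is not free of concurrency problems: a newly added SelectionKey might be skipped instead of being handled promptly. With two arrays, one receives new SelectionKeys while the other is being dispatched. It is a clever approach, but newer versions have switched to a single array. According to the author, a single array does have a consistency problem, but careful use during dispatch solves it, for example by passing a fixed size.

Upstream discussion of this issue: https://github.com/netty/netty/issues/6058#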
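As an illustration of "passing a fixed size", here is how a single-array variant might look. This is my own sketch under that assumption, not the code from the linked issue; SingleArrayKeySetSketch, dispatch and reset are hypothetical names.

```java
import java.util.Arrays;
import java.util.function.Consumer;

class SingleArrayKeySetSketch {
    String[] keys = new String[4];
    int size;

    /** Appends a key, doubling the array first if it is full. */
    void add(String key) {
        if (size == keys.length) {
            keys = Arrays.copyOf(keys, keys.length * 2);
        }
        keys[size++] = key;
    }

    /** Clears the processed entries so the array can be reused for the next round. */
    void reset() {
        Arrays.fill(keys, 0, size, null);
        size = 0;
    }

    /** Dispatches exactly the entries present when the call started, then resets the set. */
    static int dispatch(SingleArrayKeySetSketch set, Consumer<String> handler) {
        int n = set.size; // fixed bound captured up front
        for (int i = 0; i < n; i++) {
            handler.accept(set.keys[i]);
        }
        set.reset();
        return n;
    }

    public static void main(String[] args) {
        SingleArrayKeySetSketch set = new SingleArrayKeySetSketch();
        for (int i = 0; i < 5; i++) {
            set.add("key-" + i);
        }
        System.out.println(dispatch(set, System.out::println));
    }
}
```

Capturing the bound before the loop means the dispatcher never reads past the entries it set out to handle, which is the kind of careful use the quoted remark seems to have in mind.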

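Analysis of wakeup

The run method of NioEventLoop polls for I/O events and runs I/O tasks; below it is simply called the I/O polling method.

The I/O polling method contains wakeup handling accompanied by a long block of comments. It took me more than three hours to work out what is going on.

From the comments in the source and from experiments, selector.wakeup() behaves as follows:

If a select operation is currently blocked, it returns immediately.

If no select operation is in progress, the next blocking select() or select(long timeout) returns immediately.

selectNow(), select() and select(long timeout) all clear the wakeup state, so a wakeup that has been consumed does not affect the blocking of later select() or select(long timeout) calls.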
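The second and third effects can be checked with a small experiment on a bare java.nio Selector. This is a sketch of such an experiment rather than the one I originally ran; the class and method names are made up for illustration, and the timings depend on the machine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

class WakeupSemanticsSketch {
    /** Returns how many milliseconds select(timeoutMillis) took. */
    static long timedSelect(Selector selector, long timeoutMillis) throws IOException {
        long start = System.nanoTime();
        selector.select(timeoutMillis);
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Duration of a select(timeout) issued right after wakeup(): expected to return at once. */
    static long selectAfterWakeup(long timeoutMillis) {
        try (Selector selector = Selector.open()) {
            selector.wakeup();
            return timedSelect(selector, timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Duration of a select(timeout) after selectNow() consumed the wakeup: expected to block. */
    static long selectAfterConsumedWakeup(long timeoutMillis) {
        try (Selector selector = Selector.open()) {
            selector.wakeup();
            selector.selectNow(); // clears the pending wakeup
            return timedSelect(selector, timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("after wakeup: " + selectAfterWakeup(2000) + " ms");
        System.out.println("after consumed wakeup: " + selectAfterConsumedWakeup(300) + " ms");
    }
}
```

On a healthy JDK the first number should be close to zero and the second close to the 300 ms timeout.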

The wakeup method exposed by NioEventLoop

protected void wakeup(boolean inEventLoop) {
    if (!inEventLoop && wakenUp.compareAndSet(false, true)) {
        selector.wakeup();
    }
}

The method checks inEventLoop, so only a call made at first start-up or from a thread other than the EventLoop thread can flip wakenUp and go on to call selector.wakeup().
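The effect of the compareAndSet guard is easy to observe in isolation. In the sketch below a counter stands in for the real selector.wakeup() call; GuardedWakeupSketch, expensiveWakeups and startRound are hypothetical names, and startRound imitates the wakenUp.getAndSet(false) that the I/O polling method performs at the start of every round.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class GuardedWakeupSketch {
    final AtomicBoolean wakenUp = new AtomicBoolean();
    // Stands in for selector.wakeup(); counts how often the costly call would be made.
    final AtomicInteger expensiveWakeups = new AtomicInteger();

    void wakeup(boolean inEventLoop) {
        if (!inEventLoop && wakenUp.compareAndSet(false, true)) {
            expensiveWakeups.incrementAndGet();
        }
    }

    /** What the event loop does at the top of each round. */
    void startRound() {
        wakenUp.getAndSet(false);
    }

    public static void main(String[] args) {
        GuardedWakeupSketch loop = new GuardedWakeupSketch();
        loop.wakeup(false);
        loop.wakeup(false);
        loop.wakeup(false);
        System.out.println(loop.expensiveWakeups.get()); // 1: only the first call passes the guard
        loop.startRound();
        loop.wakeup(true);
        System.out.println(loop.expensiveWakeups.get()); // 1: calls from the event loop are ignored
        loop.wakeup(false);
        System.out.println(loop.expensiveWakeups.get()); // 2: the flag was reset, so this call passes
    }
}
```

However many threads request a wake-up within one round, the costly call is made at most once.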

Call sites

SingleThreadEventExecutor-execute()

public void execute(Runnable task) {
    //...
    boolean inEventLoop = inEventLoop();
    if (inEventLoop) {
        addTask(task);
    } else {
        startThread();
        addTask(task);
        if (isShutdown() && removeTask(task)) {
            reject();
        }
    }

    if (!addTaskWakesUp) { // addTaskWakesUp defaults to false
        wakeup(inEventLoop);
    }
}

Look familiar? wakeup() is called when the EventLoop is started or when an I/O task is submitted. Why it is done this way is left open for now as Question 1.

Combining this with the I/O polling method gives the following:

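protected void run() {
    for (;;) {
        oldWakenUp = wakenUp.getAndSet(false);
        try {
            if (hasTasks()) {
                selectNow();
            } else {
                select();
                // The source has a long comment here that is hard to follow; call it Question 2.
                if (wakenUp.get()) {
                    selector.wakeup();
                }
            }
            //.... (rest of the loop body, including the catch block, omitted)
        }
    }
}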

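Note: every selector wake-up goes through NioEventLoop.wakeup().

Question 1: The I/O polling method handles I/O events and I/O tasks one after the other in a loop. If a user thread submits an I/O task while the polling thread, having no I/O events, is blocked in select(long timeout), the task is delayed. selector.wakeup() is therefore called to end the blocking so that the task submitted by the user thread can run.

Question 2: Since selector.wakeup() has to be called anyway, it is most useful when it arrives while the polling thread is actually blocked. Looking at the I/O polling method, there are two less ideal timings:

Case 1: selector.wakeup() runs between wakenUp.getAndSet(false) and select(long timeout)
Case 2: selector.wakeup() runs between select(long timeout) and if (wakenUp.get()) { ... }

In case 2, the next select(long timeout) does not block, which still serves the goal of keeping blocking time short.

In case 1, select(long timeout) returns immediately and thereby resets the selector's wakeup state, while wakenUp stays true. Until wakenUp is reset, later calls to NioEventLoop.wakeup() fail the compareAndSet and never reach selector.wakeup(), so the goal of reducing blocking is missed. The following check makes up for that as far as possible:

if (wakenUp.get()) {
    selector.wakeup();
}

This mitigation has a cost: the check cannot tell the two cases apart, so in case 2, where no further wake-up is needed, selector.wakeup() is called one extra time. My guess is that the Netty authors weighed this trade-off before settling on it.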
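Case 1 can be replayed on a single thread with a bare Selector and an AtomicBoolean. The sketch below is a simplification of the real loop: it does not reset wakenUp between the two select calls, and StaleWakenUpReplay, guardedWakeup and secondSelectMillis are hypothetical names. It only shows that a stale wakenUp flag suppresses later wake-ups and that re-issuing selector.wakeup() repairs this.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.concurrent.atomic.AtomicBoolean;

class StaleWakenUpReplay {
    /** Same guard as NioEventLoop.wakeup(), without the inEventLoop check. */
    static void guardedWakeup(Selector selector, AtomicBoolean wakenUp) {
        if (wakenUp.compareAndSet(false, true)) {
            selector.wakeup();
        }
    }

    /** Replays case 1 and returns how long the second select blocked, in milliseconds. */
    static long secondSelectMillis(boolean applyFix, long timeoutMillis) {
        AtomicBoolean wakenUp = new AtomicBoolean();
        try (Selector selector = Selector.open()) {
            wakenUp.getAndSet(false);          // start of the round
            guardedWakeup(selector, wakenUp);  // a wake-up arrives before select
            selector.select(timeoutMillis);    // returns at once and consumes the wake-up
            if (applyFix && wakenUp.get()) {
                selector.wakeup();             // the compensation step discussed above
            }
            guardedWakeup(selector, wakenUp);  // a later request: the CAS fails, nothing happens
            long start = System.nanoTime();
            selector.select(timeoutMillis);
            return (System.nanoTime() - start) / 1_000_000L;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("without fix: " + secondSelectMillis(false, 300) + " ms");
        System.out.println("with fix: " + secondSelectMillis(true, 300) + " ms");
    }
}
```

Without the compensation step the second select should sit out the full timeout even though a wake-up was requested; with it, the call should return almost immediately.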

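The I/O polling method divides its time between I/O events and I/O tasks according to a configured ratio, 50:50 by default. The use of selector.wakeup is an optimization on top of that. There is little to gain from digging deeper; this level of understanding is enough.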
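As a worked example of such a ratio, the sketch below derives a task time budget from the time just spent on I/O. The formula is my reading of how a percentage-based ratio translates into a budget and is not quoted from the listings in this article; IoRatioBudget and taskBudgetNanos are hypothetical names.

```java
class IoRatioBudget {
    /**
     * Time allowed for tasks, given the time just spent on I/O and the I/O share in percent.
     * Assumes 0 < ioRatio < 100.
     */
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        long ioTime = 10_000_000L; // 10 ms spent handling I/O events
        System.out.println(taskBudgetNanos(ioTime, 50)); // 10000000: equal shares
        System.out.println(taskBudgetNanos(ioTime, 80)); // 2500000: tasks get a quarter of the I/O time
        System.out.println(taskBudgetNanos(ioTime, 20)); // 40000000: tasks get four times the I/O time
    }
}
```

With the default of 50 the two kinds of work get the same amount of time, which matches the 50:50 split mentioned above.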
