Netty 5 Notes - Threading Model 2 - EventLoopGroup

Before reading this article you should have some familiarity with Java thread pools, because they are not covered in detail here.

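Our main task today is to look at a very important class in Netty, EventLoop. After this series of articles you should understand which scenarios EventLoop is suited for, so that you do not misuse it and slow your application down. Netty uses a typical Reactor architecture, and one of its key roles is the EventLoop, which processes I/O and other events in a loop.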

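The figure above shows the interface hierarchy of EventLoop. Executor, ExecutorService and ScheduledExecutorService are the thread-pool management interfaces provided by Java:

ScheduledExecutorService: provides the interface for running scheduled tasks;

EventExecutorGroup: manages a set of EventExecutors. It assigns an executor to a task through next(), and it also exposes shutdownGracefully for graceful shutdown;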

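Method	Description
shutdownGracefully	Shuts down gracefully (what "graceful" means is covered later)
isShuttingDown	Whether all the EventExecutors it manages are shutting down or already shut down
terminationFuture	Returns the Future that is notified once the pool has completely terminated
children	Returns all the EventExecutors it manages
next	Selects an EventExecutor for a task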

EventExecutor: the component that actually executes events

Method	Description
parent	The EventExecutorGroup that manages it
inEventLoop	Returns true if the current thread is the thread executing this EventExecutor
newPromise	Creates a Promise whose listeners are executed by this EventExecutor

EventLoopGroup: the relationship between EventLoopGroup and EventLoop is analogous to that between EventExecutorGroup and EventExecutor

Method	Description
register	Registers a channel with this EventLoopGroup; after registration an EventLoop handles that channel's I/O events

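EventLoop: handles all I/O operations. EventLoop extends the EventLoopGroup interface, so it can be viewed as a single-threaded pool (the pattern is similar, but in practice it differs a lot from Java's single-thread executor).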

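Let us take the most commonly used EventLoop implementation as our example: NioEventLoop. Before that we need to introduce NioEventLoopGroup, which assigns each connection to a NioEventLoop that then handles all subsequent operations. Start with the NioEventLoopGroup constructors, which all end up calling the following one: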

    private MultithreadEventExecutorGroup(int nEventExecutors,
                                          Executor executor,
                                          boolean shutdownExecutor,
                                          Object... args) {
        if (nEventExecutors <= 0) {
            throw new IllegalArgumentException(
                    String.format("nEventExecutors: %d (expected: > 0)", nEventExecutors));
        }

        if (executor == null) {
            executor = newDefaultExecutorService(nEventExecutors);
            shutdownExecutor = true;
        }

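        // The number of EventExecutors is determined by nEventExecutors
        children = new EventExecutor[nEventExecutors];
        // Two different strategies are used to pick an EventExecutor for a task.
        // Both give the same result, but the first uses a bit mask and is slightly faster.
        // Selection starts at the first child, walks from 0 to size-1, then wraps back to the first child, so the array is used as a ring.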
        if (isPowerOfTwo(children.length)) {
            chooser = new PowerOfTwoEventExecutorChooser();
        } else {
            chooser = new GenericEventExecutorChooser();
        }

        // Initialize each EventExecutor
        for (int i = 0; i < nEventExecutors; i ++) {
            boolean success = false;
            try {
                // The actual creation is left to the subclass; NioEventLoopGroup, for example, does:
                // return new NioEventLoop(this, executor, (SelectorProvider) args[0]);
                children[i] = newChild(executor, args);
                success = true;
            } catch (Exception e) {
                // TODO: Think about if this is a good exception type
                throw new IllegalStateException("failed to create a child event loop", e);
            } finally {
                // If creation fails, shut down every EventExecutor created so far
                if (!success) {
                    for (int j = 0; j < i; j ++) {
                        children[j].shutdownGracefully();
                    }

                    for (int j = 0; j < i; j ++) {
                        EventExecutor e = children[j];
                        // Wait for the shutdown to complete
                        try {
                            while (!e.isTerminated()) {
                                e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
                            }
                        } catch (InterruptedException interrupted) {
                            // Let the caller handle the interruption.
                            Thread.currentThread().interrupt();
                            break;
                        }
                    }
                }
            }
        }

        final boolean shutdownExecutor0 = shutdownExecutor;
        final Executor executor0 = executor;
        final FutureListener<Object> terminationListener = new FutureListener<Object>() {
            @Override
            public void operationComplete(Future<Object> future) throws Exception {
                // When the last child has terminated, mark the termination future as complete
                if (terminatedChildren.incrementAndGet() == children.length) {
                    terminationFuture.setSuccess(null);
                    if (shutdownExecutor0) {
                        // This cast is correct because shutdownExecutor0 is only true if
                        // executor0 is of type ExecutorService.
                        ((ExecutorService) executor0).shutdown();
                    }
                }
            }
        };

        // The remaining code is straightforward and is omitted here
        // ...
    }

    // Compare the two implementations: this is squeezing out every last bit of performance!
    private final class PowerOfTwoEventExecutorChooser implements EventExecutorChooser {
        @Override
        public EventExecutor next() {
            return children[childIndex.getAndIncrement() & children.length - 1];
        }
    }

    private final class GenericEventExecutorChooser implements EventExecutorChooser {
        @Override
        public EventExecutor next() {
            return children[Math.abs(childIndex.getAndIncrement() % children.length)];
        }
    }
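To make the equivalence of the two choosers concrete, here is a minimal JDK-only sketch. The class and method names (ChooserSketch, maskIndex, moduloIndex) are my own and are not Netty types; it only reproduces the two index expressions from the code above and checks that they agree when the length is a power of two.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the two chooser strategies; not a Netty class.
class ChooserSketch {
    private final AtomicInteger childIndex = new AtomicInteger();
    private final int length;

    ChooserSketch(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length: " + length + " (expected: > 0)");
        }
        this.length = length;
    }

    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    // Bit-mask variant: only valid when length is a power of two.
    static int maskIndex(int counter, int length) {
        return counter & (length - 1);
    }

    // Generic variant: works for any positive length.
    static int moduloIndex(int counter, int length) {
        return Math.abs(counter % length);
    }

    // Returns the next slot, walking the array as a ring.
    int next() {
        int counter = childIndex.getAndIncrement();
        return isPowerOfTwo(length) ? maskIndex(counter, length) : moduloIndex(counter, length);
    }

    public static void main(String[] args) {
        ChooserSketch four = new ChooserSketch(4);
        ChooserSketch three = new ChooserSketch(3);
        StringBuilder a = new StringBuilder();
        StringBuilder b = new StringBuilder();
        for (int i = 0; i < 6; i++) {
            a.append(four.next());
            b.append(three.next());
        }
        System.out.println(a); // 012301
        System.out.println(b); // 012012
    }
}
```

One difference worth noting: the mask keeps returning a valid index after the int counter overflows into negative values, which is why the modulo variant needs Math.abs.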

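When using NioEventLoopGroup we usually call the default constructor, in which case the first argument nEventExecutors is the number of CPU cores x 2. Since a NioEventLoopGroup spends much of its time on I/O, this default is a sensible one and users normally do not need to change it.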

    private static final int DEFAULT_EVENT_LOOP_THREADS;

    static {
        DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt(
                "io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));

        if (logger.isDebugEnabled()) {
            logger.debug("-Dio.netty.eventLoopThreads: {}", DEFAULT_EVENT_LOOP_THREADS);
        }
    }
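The same resolution logic can be tried out with JDK calls only. This is a sketch under assumptions: DefaultThreadsSketch and resolve are hypothetical names, and Integer.getInteger stands in for Netty's SystemPropertyUtil.getInt, whose parsing rules may differ in detail.

```java
// Hypothetical helper that mirrors the shape of the static initializer above.
class DefaultThreadsSketch {
    // Uses the system property when it is set, otherwise cores * 2, and never less than 1.
    static int resolve(String propertyName, int availableProcessors) {
        int fallback = availableProcessors * 2;
        return Math.max(1, Integer.getInteger(propertyName, fallback));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Run with -Dio.netty.eventLoopThreads=N to see the override take effect.
        System.out.println(resolve("io.netty.eventLoopThreads", cores));
    }
}
```

The Math.max(1, ...) guard means that a property value of 0 or a negative number still yields one event loop.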

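The second constructor argument, executor, is the thread pool that actually runs the tasks of the EventExecutors. The default is Netty's own ForkJoinPool implementation (it is fairly complex; I will come back to analyze it when time permits). As you can see, NioEventLoop itself is not responsible for creating or destroying threads: it wraps its execution logic in a Runnable and hands it to the executor. This model differs from Netty 4, where each EventLoop corresponds to one fixed thread; in Netty 5 an EventLoop is not pinned to a thread. This is also what puzzles me: the executor has as many threads as there are EventLoops, which guarantees that every EventLoop has a thread to run on, but an EventLoop no longer maps to a fixed Thread, so caches based on ThreadLocal may stop working. I do not know why it was designed this way, so I will leave the question open and revisit it once a release version is out.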
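The ThreadLocal concern can be reproduced without Netty. The sketch below is not Netty's implementation: SerialExecutor (modeled on the pattern shown in the java.util.concurrent.Executor documentation) and ThreadLocalSketch are hypothetical names. It runs tasks strictly one after another on whatever thread a backing executor supplies, and uses a thread-local counter as a stand-in for a thread-local cache.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Runs tasks one at a time, in order, on whatever thread the backing executor provides.
class SerialExecutor implements Executor {
    private final Queue<Runnable> tasks = new ArrayDeque<>();
    private final Executor backing;
    private Runnable active;

    SerialExecutor(Executor backing) { this.backing = backing; }

    @Override
    public synchronized void execute(Runnable r) {
        tasks.add(() -> {
            try { r.run(); } finally { scheduleNext(); }
        });
        if (active == null) scheduleNext();
    }

    private synchronized void scheduleNext() {
        if ((active = tasks.poll()) != null) backing.execute(active);
    }
}

class ThreadLocalSketch {
    // Per-thread counter standing in for a thread-local cache.
    private static final ThreadLocal<int[]> CACHE = ThreadLocal.withInitial(() -> new int[1]);

    // Returns the highest per-thread hit count observed across n serialized tasks.
    static int maxHits(Executor backing, int n) {
        SerialExecutor loop = new SerialExecutor(backing);
        AtomicInteger max = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            loop.execute(() -> {
                int hits = ++CACHE.get()[0];
                max.accumulateAndGet(hits, Math::max);
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return max.get();
    }

    public static void main(String[] args) {
        ExecutorService pinned = Executors.newSingleThreadExecutor();
        try {
            System.out.println(maxHits(pinned, 5)); // 5: one thread, the cache is reused
        } finally {
            pinned.shutdown();
        }
        // A new thread per task is the extreme case of an unpinned loop.
        System.out.println(maxHits(r -> new Thread(r).start(), 5)); // 1: every task sees a cold cache
    }
}
```

In both runs the tasks never overlap, so thread safety is preserved; only the thread-local state is lost when the thread changes.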

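The last constructor argument is args[0] = SelectorProvider.provider(). SelectorProvider creates the provider appropriate for the operating system; on Linux, for example, it is sun.nio.ch.EPollSelectorProvider. The argument is passed in when a NioEventLoop is initialized and is used to create its Selector (there is a separate introduction to Selector).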

Submitting a task to an EventLoopGroup simply hands it to one of its children (that is, an EventLoop):

    public <T> Future<T> submit(Callable<T> task) {
        return next().submit(task);
    }
    public EventExecutor next() {
        return chooser.next();
    }

        // One implementation of next(); it keeps the number of connections handled by each child roughly equal
        public EventExecutor next() {
            return children[childIndex.getAndIncrement() & children.length - 1];
        }

EventLoopGroup provides a method for registering a Channel (which represents a connection):

    public ChannelFuture register(Channel channel) {
        return next().register(channel);
    }

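It is a single line of code, but what it implies is very important: a connection is bound to one fixed EventLoop at registration time, by registering the channel with that EventLoop's Selector, and reads, writes and every other related operation on the connection (such as encoding, decoding and timeout management) are handed to this EventLoop. Under normal circumstances, therefore, all the methods involved in a connection (read/write/codec/timeout management) run in one EventLoop, which means all these operations are thread-safe. Remember Netty 3? There, timeouts were managed by HashedWheelTimer, and because timeout tasks and read/write tasks ran on different threads, a timeout firing just as data was being read or written could produce unexpected results. With the threading-model change in Netty 5, however, thread safety is still guaranteed but these operations are no longer guaranteed to run on the same thread, so features that rely on ThreadLocal may stop working.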
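The binding idea can be sketched with plain JDK executors. Everything here is hypothetical (BoundChannel and LoopGroupSketch are not Netty classes, and a single-thread executor stands in for an EventLoop): a channel is handed one loop when it is registered, and every event for that channel is funneled through that loop, so the per-channel state needs no locks even when events are fired from several threads.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a channel; not a Netty class.
class BoundChannel {
    private final ExecutorService loop; // fixed once, at registration
    private int eventsHandled;          // plain int: only the loop thread touches it

    BoundChannel(ExecutorService loop) { this.loop = loop; }

    // Any thread may fire an event; the work itself always runs on the bound loop.
    void fireEvent() { loop.execute(() -> eventsHandled++); }

    // Reads go through the loop as well, so no lock or volatile is needed.
    int handledCount() {
        try {
            return loop.submit(() -> eventsHandled).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}

class LoopGroupSketch {
    private final ExecutorService[] loops;
    private final AtomicInteger index = new AtomicInteger();

    LoopGroupSketch(int n) {
        loops = new ExecutorService[n];
        for (int i = 0; i < n; i++) loops[i] = Executors.newSingleThreadExecutor();
    }

    // Binds a new channel to the next loop in round-robin order.
    BoundChannel register() {
        return new BoundChannel(loops[Math.abs(index.getAndIncrement() % loops.length)]);
    }

    void shutdown() { for (ExecutorService l : loops) l.shutdown(); }

    // Fires events at one channel from several threads and returns the final count.
    static int countAfterConcurrentEvents(int producers, int eventsPerProducer) {
        LoopGroupSketch group = new LoopGroupSketch(2);
        try {
            BoundChannel ch = group.register();
            Thread[] threads = new Thread[producers];
            for (int p = 0; p < producers; p++) {
                threads[p] = new Thread(() -> {
                    for (int i = 0; i < eventsPerProducer; i++) ch.fireEvent();
                });
                threads[p].start();
            }
            for (Thread t : threads) t.join();
            return ch.handledCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            group.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(countAfterConcurrentEvents(4, 1000)); // 4000
    }
}
```

No update is lost even though the counter is an unsynchronized int, because only the bound loop ever reads or writes it.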

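This raises another issue. Because a connection is bound to one EventLoop, a long-running task in that EventLoop holds up every other connection bound to it, and none of them gets processed. Applications should therefore not put blocking or long-running operations in a handler. I have seen many articles online describing business logic being run on the I/O threads, which made the system slow or even brought it down; I hope readers of this article will not make the same mistake.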
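Here is a small JDK-only sketch of that effect and of the usual remedy. The names are hypothetical, a single-thread executor stands in for the EventLoop, and a latch simulates a slow business operation. When the slow job runs on the loop itself, a quick task queued behind it cannot run; when the loop only hands the job to a separate business pool, the quick task completes right away.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockingSketch {
    // Returns true if a quick task on the loop finishes while the slow job is still blocked.
    static boolean quickTaskCompletes(boolean offload) {
        ExecutorService loop = Executors.newSingleThreadExecutor(); // stand-in for one EventLoop
        ExecutorService business = Executors.newFixedThreadPool(2); // separate pool for slow work
        CountDownLatch release = new CountDownLatch(1);
        Runnable slowJob = () -> {
            try {
                release.await(); // simulates a blocking business call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            if (offload) {
                loop.execute(() -> business.execute(slowJob)); // the loop only dispatches
            } else {
                loop.execute(slowJob);                         // the loop thread itself blocks
            }
            Future<String> quick = loop.submit(() -> "done");  // e.g. another connection's event
            try {
                quick.get(offload ? 5000 : 200, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException stalled) {
                return false;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the slow job finish so both pools can shut down
            loop.shutdown();
            business.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(quickTaskCompletes(false)); // false: the loop is stalled
        System.out.println(quickTaskCompletes(true));  // true: the loop stays responsive
    }
}
```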

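    /**
     * Sets the percentage of time the EventLoop spends on I/O, in the range (0, 100]. The default is 50,
     * meaning I/O and non-I/O work get the same amount of time.
     */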
    public void setIoRatio(int ioRatio) {
        for (EventExecutor e: children()) {
            ((NioEventLoop) e).setIoRatio(ioRatio);
        }
    }

    /**
     * When epoll spins at 100% CPU (a bug in early JDKs; I am not sure whether it has been fully fixed), this method replaces the old Selector of each EventLoop with a new one
     */
    public void rebuildSelectors() {
        for (EventExecutor e: children()) {
            ((NioEventLoop) e).rebuildSelector();
        }
    }
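To get a feel for what ioRatio means in practice, the following sketch computes a time budget for non-I/O tasks from the time just spent on I/O. The formula is an assumption on my part, based on how NioEventLoop splits its time as far as I know; it is not taken from the code shown above, and IoRatioSketch and taskBudgetNanos are hypothetical names.

```java
class IoRatioSketch {
    // Time budget for non-I/O tasks, given the time just spent on I/O (assumed formula).
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio > 100) {
            throw new IllegalArgumentException("ioRatio: " + ioRatio + " (expected: 0 < ioRatio <= 100)");
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        System.out.println(taskBudgetNanos(1_000_000, 50)); // 1000000: equal shares
        System.out.println(taskBudgetNanos(1_000_000, 80)); // 250000: tasks get a quarter of the I/O time
        System.out.println(taskBudgetNanos(1_000_000, 20)); // 4000000: tasks get four times the I/O time
    }
}
```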

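That is about all there is to the basic functionality of NioEventLoopGroup. The implementation is not complicated, but the defaults used during initialization deserve attention. Having read this far, you probably have a rough idea of what the diagram in the previous article meant. Still, I want to stress the following points (important things are worth saying again):

1. By default a NioEventLoopGroup has CPU cores * 2 NioEventLoops, because much of the work is I/O;

2. The gap between NioEventLoop and Java's single-thread executor has widened in Netty 5: it does not create or destroy threads itself but relies on a thread pool passed in from outside. As later articles will show, its processing logic is wrapped in a Runnable;

3. A channel is bound to an EventLoop: once a connection is assigned to an EventLoop, its I/O, encoding/decoding and timeout handling all happen in that same EventLoop, which keeps these operations thread-safe and avoids the unexpected results that could occur in Netty 3. Unlike Netty 4, however, Netty 5 does not guarantee that all operations of a connection run on the same thread, so features that depend on ThreadLocal may stop working (for example, the memory pool's PoolThreadCache cannot perform at its best in this situation).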

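Addendum:

A discussion about the introduction of the executor: https://github.com/netty/netty/issues/2250. From this issue we can get a rough idea of why Netty 5 made this change:

1. To leave users more room to customize how I/O is executed;

2. To take advantage of the work-stealing mechanism of the fork/join framework, so that a problem with an individual connection does not block the whole task chain. How to change Netty's architecture to achieve this is still being worked out;

3. The current default implementation keeps the same efficiency as before and stays thread-safe, but the efficiency of components such as the memory pool is challenged, which this change also has to address.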

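If this change eventually succeeds, Netty may change as follows:

1. The whole threading model changes;

2. Users can supply their own thread pool implementation;

3. The memory pool is modified accordingly;

4. One blocked connection does not affect other connections (if a large number of connections block, nothing can be done);

5. It may become possible to run long tasks directly in Netty's thread pool, instead of setting up a separate thread pool for business logic;

6. The operations of a connection are guaranteed to be thread-safe but not necessarily executed on the same thread, so anyone using ThreadLocal in an I/O handler may want to think about a workaround in advance.

...and so on...

It is a little exciting to think about, but there are many challenges and this is a major low-level change. Hats off to the maintainers! (Sadly, 5.0 has since been dropped! See https://github.com/netty/netty/issues/4466)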

陽二快跑
