Netty Study Notes (3): Getting Started with EventLoopGroup

 

Every Netty application has to define an EventLoopGroup, which is essentially a thread pool.

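As covered earlier, a client needs only one EventLoopGroup, while a server needs two: bossGroup and workerGroup. This follows from Netty's threading model, the master-slave Reactor multi-threaded model, which uses two thread pools: one listens on the port and accepts new connections (bossGroup), and the other handles the data reads/writes and business logic of each connection (workerGroup).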

The code below omits some try...catch blocks and non-core code, keeping only the main flow.

EventLoopGroup initialization

Its class hierarchy (the class diagram image is not reproduced here):

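The class diagram shows that every EventLoopGroup implements ScheduledExecutorService, so it is essentially a thread pool with scheduling support.
NioEventLoopGroup has many overloaded constructors, all of which end up calling the following one: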

public NioEventLoopGroup(int nThreads, ThreadFactory threadFactory,
        final SelectorProvider selectorProvider, final SelectStrategyFactory selectStrategyFactory) {
        super(nThreads, threadFactory, selectorProvider, selectStrategyFactory, RejectedExecutionHandlers.reject());
    }

This calls the constructor of its parent class, MultithreadEventLoopGroup:

private static final int DEFAULT_EVENT_LOOP_THREADS;

    static {
        DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt(
                "io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));
    } 
protected MultithreadEventLoopGroup(int nThreads, ThreadFactory threadFactory, Object... args) {
        super(nThreads == 0 ? DEFAULT_EVENT_LOOP_THREADS : nThreads, threadFactory, args);
    }

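The constructor checks whether nThreads is 0 and, if so, falls back to the default thread count, which is twice the number of available processors unless the io.netty.eventLoopThreads system property overrides it. My demos never specify a thread count, so the resulting EventLoopGroup has availableProcessors * 2 threads.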
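The fallback rule can be tried out without Netty on the classpath. The following is a minimal sketch, not Netty's code: the class name DefaultThreadsSketch and the method resolve are hypothetical, and Integer.getInteger is used as a plain-JDK stand-in for SystemPropertyUtil.getInt.

```java
final class DefaultThreadsSketch {
    // Same property name as in the snippet above. Integer.getInteger is a JDK
    // stand-in for Netty's SystemPropertyUtil.getInt (an assumption of this sketch).
    static final int DEFAULT_EVENT_LOOP_THREADS = Math.max(1, Integer.getInteger(
            "io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));

    /** Returns the effective thread count for a requested nThreads (0 means "use the default"). */
    static int resolve(int nThreads) {
        return nThreads == 0 ? DEFAULT_EVENT_LOOP_THREADS : nThreads;
    }
}
```

Calling DefaultThreadsSketch.resolve(0) yields the default, while any non-zero argument is returned unchanged. Starting the JVM with -Dio.netty.eventLoopThreads=N changes the default.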

Following the call chain further, we eventually reach this constructor of MultithreadEventExecutorGroup:

 protected MultithreadEventExecutorGroup(int nThreads, Executor executor,
                                            EventExecutorChooserFactory chooserFactory, Object... args) {
        if (executor == null) {
            executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
        }
        children = new EventExecutor[nThreads];
        for (int i = 0; i < nThreads; i ++) {
            // the surrounding try block and its failure cleanup are omitted here
            children[i] = newChild(executor, args);
        }

        chooser = chooserFactory.newChooser(children);
    }

The code above first creates an executor, then allocates an EventExecutor array of length nThreads, initializes each element by calling newChild, and finally calls newChooser to create a chooser.

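Look at how the executor is created first. It is simply an Executor instance that, for every command passed to execute, creates and starts a new thread to run it. The thread name is poolName + '-' + poolId.incrementAndGet() + '-' + nextId.incrementAndGet().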

public final class ThreadPerTaskExecutor implements Executor {
    private final ThreadFactory threadFactory;

    public ThreadPerTaskExecutor(ThreadFactory threadFactory) {
        this.threadFactory = threadFactory;
    }

    @Override
    public void execute(Runnable command) {
        threadFactory.newThread(command).start();
    }
}
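To see the thread-per-task behavior and the naming scheme in isolation, here is a JDK-only sketch. NamingThreadFactory and ThreadPerTaskSketch are hypothetical names. The factory only imitates the poolName-poolId-nextId naming described above and is not Netty's DefaultThreadFactory.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in that only reproduces the thread naming scheme.
final class NamingThreadFactory implements ThreadFactory {
    private static final AtomicInteger poolId = new AtomicInteger();
    private final AtomicInteger nextId = new AtomicInteger();
    private final String prefix;

    NamingThreadFactory(String poolName) {
        this.prefix = poolName + '-' + poolId.incrementAndGet() + '-';
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + nextId.incrementAndGet());
        t.setDaemon(true); // keep the sketch from blocking JVM exit
        return t;
    }
}

final class ThreadPerTaskSketch {
    /** Builds an executor that starts a fresh thread for every execute() call. */
    static Executor newExecutor(String poolName) {
        ThreadFactory factory = new NamingThreadFactory(poolName);
        return command -> factory.newThread(command).start();
    }

    /** Runs one task and returns the name of the thread that executed it. */
    static String runOnce(Executor executor) {
        AtomicReference<String> name = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        executor.execute(() -> {
            name.set(Thread.currentThread().getName());
            done.countDown();
        });
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return name.get();
    }
}
```

Two consecutive runOnce calls on the same executor report two different thread names whose last component increases, which is what "one thread per command" means in practice.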

The newChild method here simply instantiates and returns a NioEventLoop, so every element of the EventLoopGroup is a NioEventLoop. The source:

 @Override
    protected EventLoop newChild(Executor executor, Object... args) throws Exception {
        return new NioEventLoop(this, executor, (SelectorProvider) args[0],
            ((SelectStrategyFactory) args[1]).newSelectStrategy(), (RejectedExecutionHandler) args[2]);
    }

Looking at the class diagram of NioEventLoop, note that NioEventLoop is a subclass of SingleThreadEventExecutor, and the Executor argument ends up stored in that class's executor field.

Next, the implementation of newChooser: if executors.length (that is, nThreads) is a power of two, PowerOfTowEventExecutorChooser is used for selection; otherwise the generic chooser is used.

   public EventExecutorChooser newChooser(EventExecutor[] executors) {
        if (isPowerOfTwo(executors.length)) {
            return new PowerOfTowEventExecutorChooser(executors);
        } else {
            return new GenericEventExecutorChooser(executors);
        }
    }

 private static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

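The two choosers differ only in next(), the method that returns the next EventExecutor: the generic chooser increments idx and takes it modulo the array length (nThreads).
PowerOfTow implements the same logic with a bitwise AND, which is cheaper than the modulo operation.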

private static final class PowerOfTowEventExecutorChooser implements EventExecutorChooser {
        private final AtomicInteger idx = new AtomicInteger();
        private final EventExecutor[] executors;
        PowerOfTowEventExecutorChooser(EventExecutor[] executors) {
            this.executors = executors;
        }
        @Override
        public EventExecutor next() {
            return executors[idx.getAndIncrement() & executors.length - 1];
        }
    }

    private static final class GenericEventExecutorChooser implements EventExecutorChooser {
        private final AtomicInteger idx = new AtomicInteger();
        private final EventExecutor[] executors;
        GenericEventExecutorChooser(EventExecutor[] executors) {
            this.executors = executors;
        }
        @Override
        public EventExecutor next() {
            return executors[Math.abs(idx.getAndIncrement() % executors.length)];
        }
    }
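The equivalence of the two index computations can be checked directly. This is a small illustrative sketch in which ChooserSketch, maskIndex, and moduloIndex are hypothetical names. Note that in the snippet above `idx.getAndIncrement() & executors.length - 1` parses as `& (executors.length - 1)`, because subtraction binds tighter than `&`.

```java
final class ChooserSketch {
    /** Same test as in the factory above: true when val has exactly one bit set. */
    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    /** Round-robin index via bit mask; only valid when length is a power of two. */
    static int maskIndex(int counter, int length) {
        return counter & (length - 1);
    }

    /** Round-robin index via modulo; works for any positive length. */
    static int moduloIndex(int counter, int length) {
        return Math.abs(counter % length);
    }

    /** Checks that both computations agree for the first `rounds` counter values. */
    static boolean agree(int length, int rounds) {
        for (int counter = 0; counter < rounds; counter++) {
            if (maskIndex(counter, length) != moduloIndex(counter, length)) {
                return false;
            }
        }
        return true;
    }
}
```

For a power-of-two length such as 8 the two methods return the same index for every non-negative counter. Both also stay inside the array bounds once the counter has overflowed to a negative value.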

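To summarize EventLoopGroup initialization:

  • MultithreadEventExecutorGroup, a parent class of the EventLoopGroup, maintains an array of EventExecutor whose size is nThreads
  • If nThreads is not specified when NioEventLoopGroup is instantiated, it defaults to the number of processors * 2
  • MultithreadEventExecutorGroup initializes the children array through the abstract method newChild(); every element is a NioEventLoop
  • A different chooser is selected depending on nThreads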

EventLoopGroup execution

When the ServerBootstrap is initialized, serverBootstrap.group(bossGroup, workerGroup) is called to set the two EventLoopGroups. Stepping into it, we see:

 public ServerBootstrap group(EventLoopGroup parentGroup, EventLoopGroup childGroup) {
        super.group(parentGroup);
        if (childGroup == null) {
            throw new NullPointerException("childGroup");
        }
        if (this.childGroup != null) {
            throw new IllegalStateException("childGroup set already");
        }
        this.childGroup = childGroup;
        return this;
    }

This method initializes two fields: one in super.group(parentGroup), the other through this.childGroup = childGroup. bossGroup is stored in the group field of AbstractBootstrap, and workerGroup in the childGroup field of ServerBootstrap.

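Next, the application's startup code calls serverBootstrap.bind() to listen on a local port.
bind ends up calling execute() on the channel's eventLoop(), which finally enters the execute() method of SingleThreadEventExecutor: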

    private static void doBind0(
            final ChannelFuture regFuture, final Channel channel,
            final SocketAddress localAddress, final ChannelPromise promise) {
        channel.eventLoop().execute(new Runnable() {
            @Override
            public void run() {
                if (regFuture.isSuccess()) {
                    channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
                } else {
                    promise.setFailure(regFuture.cause());
                }
            }
        });
    }

For every task submitted, SingleThreadEventExecutor checks whether the calling thread is its own thread. On the first submission, or whenever the caller is not the executor's thread, inEventLoop() returns false, so startThread() is called first and the task is then added to the task queue (LinkedBlockingQueue); otherwise the task is added to the queue directly.

    private final Queue<Runnable> taskQueue;

    public void execute(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        boolean inEventLoop = inEventLoop();
        if (inEventLoop) {
            addTask(task);
        } else {
            startThread();
            addTask(task);
            if (isShutdown() && removeTask(task)) {
                reject();
            }
        }
        // whenever a new task is added, wakeup may be triggered
        if (!addTaskWakesUp && wakesUpForTask(task)) {
            wakeup(inEventLoop);
        }
    }

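Put simply, inEventLoop() checks whether the current thread is the reactor thread. This serves two purposes:

1. Tasks run only on the reactor thread, which keeps execution single-threaded.

2. The first check starts the reactor thread for us.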

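startThread() uses a state flag to check whether the reactor thread has already been started, and calls doStartThread() if it has not.
In doStartThread(), SingleThreadEventExecutor calls the executor's execute method, wrapping the call to the run method of NioEventLoop (a subclass of SingleThreadEventExecutor) in a Runnable for the executor to run; the thread that runs it is also saved in the thread field of SingleThreadEventExecutor. The executor here is the ThreadPerTaskExecutor described earlier: for each Runnable passed to execute, it creates a FastThreadLocalThread and calls its start method.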

 private void startThread() {
        // check whether this EventLoop's thread has been started
        if (STATE_UPDATER.get(this) == ST_NOT_STARTED) {
            // a CAS guarantees that only one caller starts the thread
            if (STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
                doStartThread();
            }
        }
    }

private void doStartThread() {
        assert thread == null;
        executor.execute(new Runnable() {
            @Override
            public void run() {
                thread = Thread.currentThread();
                ...
                boolean success = false;
                updateLastExecutionTime();
                try {
                    SingleThreadEventExecutor.this.run();
                    success = true;
                } catch (Throwable t) {
                    logger.warn("Unexpected exception from an event executor: ", t);
                } 
                ...
            }
        });
    }
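The lazy-start pattern (check inEventLoop, start the loop thread once via CAS, then enqueue) can be sketched with the JDK alone. MiniEventExecutor is a hypothetical, heavily simplified stand-in: it has no shutdown handling, no selector, and no wakeup logic, and it is not Netty's implementation.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simplified executor illustrating the lazy start; not Netty code.
final class MiniEventExecutor implements Executor {
    private static final int ST_NOT_STARTED = 1;
    private static final int ST_STARTED = 2;

    private final AtomicInteger state = new AtomicInteger(ST_NOT_STARTED);
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    final AtomicInteger startCount = new AtomicInteger(); // number of loop threads created
    private volatile Thread thread;

    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    @Override
    public void execute(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (!inEventLoop()) {
            startThread(); // a no-op after the first successful CAS
        }
        taskQueue.add(task);
    }

    private void startThread() {
        if (state.compareAndSet(ST_NOT_STARTED, ST_STARTED)) {
            startCount.incrementAndGet();
            Thread t = new Thread(this::runLoop, "mini-loop");
            t.setDaemon(true);
            t.start();
        }
    }

    private void runLoop() {
        thread = Thread.currentThread(); // remember the loop thread, as doStartThread() does
        try {
            for (;;) {
                taskQueue.take().run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Submits two tasks from the calling thread; returns the names of the threads that ran them. */
    static Set<String> demo(MiniEventExecutor executor) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(2);
        for (int i = 0; i < 2; i++) {
            executor.execute(() -> {
                names.add(Thread.currentThread().getName());
                done.countDown();
            });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names;
    }
}
```

Both tasks are submitted from an outside thread, yet they run on the single loop thread, and the loop thread is created only once no matter how many tasks are submitted.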

As the analysis above shows, the method that ultimately does the work is the run method of NioEventLoop. Let's see what it does:

@Override
    protected void run() {
        for (;;) {
            try {           
                switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
                    case SelectStrategy.CONTINUE:
                        continue;
                    case SelectStrategy.SELECT:
                        // poll with select; set wakenUp to false and pass in its previous value
                        select(wakenUp.getAndSet(false));
                        if (wakenUp.get()) {
                            selector.wakeup();
                        }
                    default:
                        // fallthrough
                }
                // non-essential code removed
                processSelectedKeys();
                runAllTasks();                
            } catch (Throwable t) {
                handleLoopException(t);
            }
            // Always handle shutdown even if the loop processing threw an exception.
           ...
        }
    }

First, the strategy selection:

@Override
    public int calculateStrategy(IntSupplier selectSupplier, boolean hasTasks) throws Exception {
        return hasTasks ? selectSupplier.get() : SelectStrategy.SELECT;
    }

If the task queue is empty, the SELECT strategy is returned; otherwise selectSupplier.get() is invoked, which performs a single non-blocking selectNow() and returns its result.
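The decision is small enough to reproduce with java.util.function.IntSupplier. In this sketch the class name SelectStrategySketch is hypothetical and the constant values are placeholders chosen for illustration. The only thing that matters is that they are negative, so they cannot collide with a selectNow() result, which is never negative.

```java
import java.util.function.IntSupplier;

final class SelectStrategySketch {
    // Placeholder values for this sketch; any negative numbers would do.
    static final int SELECT = -1;
    static final int CONTINUE = -2;

    /**
     * With pending tasks, do a non-blocking selectNow() and return the number of ready keys;
     * without pending tasks, ask the loop to perform a blocking select.
     */
    static int calculateStrategy(IntSupplier selectNow, boolean hasTasks) {
        return hasTasks ? selectNow.getAsInt() : SELECT;
    }
}
```

When tasks are pending, the returned value is a key count of zero or more, so the switch in run() falls to the default branch and goes straight on to processSelectedKeys() and runAllTasks().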

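As shown above, run() is an infinite loop that mainly does three things:

  • Poll for I/O events on all channels registered with the selector of this reactor thread
  • Process the selected keys: processSelectedKeys();
  • Process the task queue: runAllTasks();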
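The three steps can be laid out as one loop iteration over a plain java.nio Selector. MiniLoop is a hypothetical, heavily simplified sketch: it omits the wakenUp handling, the I/O ratio, and the shutdown logic of the real loop, and it only counts keys instead of dispatching them.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical simplified loop body; not Netty's NioEventLoop.
final class MiniLoop {
    private final Selector selector;
    private final Queue<Runnable> taskQueue = new ConcurrentLinkedQueue<>();
    int processedKeys;
    int executedTasks;

    MiniLoop(Selector selector) {
        this.selector = selector;
    }

    void submit(Runnable task) {
        taskQueue.add(task);
    }

    /** One iteration: select, process selected keys, run queued tasks. */
    void runOnce() throws IOException {
        // 1. poll; do not block when tasks are already waiting
        if (taskQueue.isEmpty()) {
            selector.select(10L);
        } else {
            selector.selectNow();
        }
        // 2. process the selected keys
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            it.next(); // a real loop would dispatch on the key's ready ops here
            it.remove();
            processedKeys++;
        }
        // 3. run all queued tasks
        Runnable task;
        while ((task = taskQueue.poll()) != null) {
            task.run();
            executedTasks++;
        }
    }

    /** Queues two no-op tasks, runs one iteration, and returns the number of tasks executed. */
    static int demo() {
        try (Selector selector = Selector.open()) {
            MiniLoop loop = new MiniLoop(selector);
            loop.submit(() -> { });
            loop.submit(() -> { });
            loop.runOnce();
            return loop.executedTasks;
        } catch (IOException e) {
            return -1;
        }
    }
}
```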

Polling with select

 private void select(boolean oldWakenUp) throws IOException {
        Selector selector = this.selector;
            int selectCnt = 0;
            long currentTimeNanos = System.nanoTime();
            long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
            for (;;) {
                long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
                // first exit condition
                if (timeoutMillis <= 0) {
                    if (selectCnt == 0) {
                        selector.selectNow();
                        selectCnt = 1;
                    }
                    break;
                }

                // If a task was submitted when wakenUp value was true, the task didn't get a chance to call
                // Selector#wakeup. So we need to check task queue again before executing select operation.
                // If we don't, the task might be pended until select operation was timed out.
                // It might be pended until idle timeout if IdleStateHandler existed in pipeline.
                // second exit condition
                if (hasTasks() && wakenUp.compareAndSet(false, true)) {
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }

                int selectedKeys = selector.select(timeoutMillis);
                selectCnt ++;

                // third exit condition
                if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                    // - Selected something,
                    // - waken up by user, or
                    // - the task queue has a pending task.
                    // - a scheduled task is ready for processing
                    break;
                }
              
                ...
    }

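Clearly select is also an infinite loop, and it has three exit conditions:

  • The deadline is about to expire (less than 0.5 ms away): break out of the loop, and if select has not run yet, call selectNow once
 long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
 timeoutMillis <= 0;
  • The task queue has tasks waiting: exit, so that a blocking select does not delay them, and call selectNow before exiting
  • After the blocking selector.select(timeoutMillis) returns, exit if any condition of the third check holds (selectedKeys is non-zero, the task queue has tasks, and so on)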
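The rounding in the first exit condition is easy to misread, so here is the same arithmetic as a standalone helper. SelectTimeoutSketch is a hypothetical name and the formula is copied from the snippet above. Adding 500000 ns before the integer division rounds the remaining time to the nearest millisecond, so anything under 0.5 ms becomes 0.

```java
final class SelectTimeoutSketch {
    /** Remaining time until the deadline, rounded to the nearest millisecond. */
    static long timeoutMillis(long selectDeadLineNanos, long currentTimeNanos) {
        return (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
    }
}
```

With 400000 ns remaining the result is 0 and the loop exits. With 500000 ns remaining it is 1, with 1400000 ns it is still 1, and with 1500000 ns it is 2.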

As mentioned earlier, when SingleThreadEventExecutor's execute(Runnable task) adds a task it calls wakeup, which dispatches to the wakeup method overridden by NioEventLoop:

@Override
public void execute(Runnable task) {
    // addTaskWakesUp defaults to false; inEventLoop is false when the task is added by an external thread
    if (!addTaskWakesUp && wakesUpForTask(task)) {
        wakeup(inEventLoop);
    }
}

When inEventLoop is false and the CAS on wakenUp succeeds (from false to true, which keeps the operation thread-safe), selector.wakeup() is called to unblock the blocking select:

 @Override
    protected void wakeup(boolean inEventLoop) {
        if (!inEventLoop && wakenUp.compareAndSet(false, true)) {
            selector.wakeup();
        }
    }
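The effect of the guarded wakeup can be observed with a bare java.nio Selector. In this sketch WakeupSketch and demo are hypothetical names, and the wakeup method returns a boolean purely so the demo can observe whether selector.wakeup() was actually issued, which the method above does not do.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class WakeupSketch {
    private final Selector selector;
    private final AtomicBoolean wakenUp = new AtomicBoolean();

    WakeupSketch(Selector selector) {
        this.selector = selector;
    }

    /** Returns true if this call actually issued selector.wakeup(). */
    boolean wakeup(boolean inEventLoop) {
        if (!inEventLoop && wakenUp.compareAndSet(false, true)) {
            selector.wakeup();
            return true;
        }
        return false;
    }

    /** Blocks a helper thread in select() and checks that one wakeup releases it. */
    static boolean demo() {
        try (Selector selector = Selector.open()) {
            WakeupSketch sketch = new WakeupSketch(selector);
            CountDownLatch returned = new CountDownLatch(1);
            Thread loop = new Thread(() -> {
                try {
                    selector.select(); // blocks: no channel is registered
                    returned.countDown();
                } catch (IOException ignored) {
                    // leave the latch untouched so demo() reports failure
                }
            });
            loop.setDaemon(true);
            loop.start();
            boolean first = sketch.wakeup(false);  // CAS succeeds, wakeup is issued
            boolean second = sketch.wakeup(false); // CAS fails, no redundant wakeup
            return first && !second && returned.await(5, TimeUnit.SECONDS);
        } catch (IOException | InterruptedException e) {
            return false;
        }
    }
}
```

The CAS means that a burst of task submissions triggers at most one selector.wakeup() until the flag is reset, which matters because wakeup() is a comparatively expensive call.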

How Netty works around the JDK empty-polling bug

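The bug occurs when the Selector's poll returns an empty result even though there was no wakeup and no new event to process. The loop then spins (empty polling), CPU usage reaches 100%, and the NIO server becomes unusable. Netty sidesteps this empty-polling problem in a neat way: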

private void select(boolean oldWakenUp) throws IOException {
    long currentTimeNanos = System.nanoTime();
    for (;;) {
        ...
        int selectedKeys = selector.select(timeoutMillis);
        selectCnt ++;
        // work around the JDK NIO bug
        long time = System.nanoTime();
        if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
            selectCnt = 1;
        } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 && selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
            rebuildSelector();
            selector = this.selector;
            selector.selectNow();
            selectCnt = 1;
            break;
        }
        currentTimeNanos = time; 
    ...
 }
}

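As the code shows, every poll of the Selector is counted (selectCnt++). The timestamps at the start and at the end of a poll are stored in currentTimeNanos and time respectively, and their difference is the time the poll took.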

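If that duration is at least timeoutMillis (the polling timeout), the poll was a normal one and selectCnt is reset. Otherwise the blocking call returned before its timeout, which may mean the JDK empty-polling bug was triggered. Once the number of such premature returns reaches a threshold (512 by default), Netty rebuilds the selector.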
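The counting rule can be separated from the Selector and exercised on its own. SpinDetector is a hypothetical helper that mirrors the two branches of the snippet above. It takes the timestamps as parameters so the behavior can be checked deterministically.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper that isolates the premature-return counting; not Netty code.
final class SpinDetector {
    private final int threshold;
    private int selectCnt;

    SpinDetector(int threshold) {
        this.threshold = threshold;
    }

    /** Records one select() return; returns true when the selector should be rebuilt. */
    boolean onSelectReturned(long startNanos, long endNanos, long timeoutMillis) {
        selectCnt++;
        if (endNanos - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= startNanos) {
            selectCnt = 1; // select() blocked for the full timeout: a normal round
            return false;
        }
        if (threshold > 0 && selectCnt >= threshold) {
            selectCnt = 1; // too many premature returns in a row
            return true;
        }
        return false;
    }

    int count() {
        return selectCnt;
    }
}
```

With a threshold of 512, the first 511 premature returns only increment the counter and the 512th asks for a rebuild. A single poll that lasts the full timeout resets the count to 1.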

 public void rebuildSelector() {
        final Selector oldSelector = selector;
        final Selector newSelector;
        newSelector = openSelector();
        int nChannels = 0;
        for (;;) {
            try {
                for (SelectionKey key: oldSelector.keys()) {
                    Object a = key.attachment();
                    if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                          continue;
                    }
                    int interestOps = key.interestOps();
                    key.cancel();
                    SelectionKey newKey = key.channel().register(newSelector, interestOps, a);
                    if (a instanceof AbstractNioChannel) {
                         // Update SelectionKey
                         ((AbstractNioChannel) a).selectionKey = newKey;
                    }
                    nChannels ++;
                }
            } catch (ConcurrentModificationException e) {
                // Probably due to concurrent modification of the key set.
                continue;
            }
            break;
        }
        selector = newSelector;
        oldSelector.close();
    }

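rebuildSelector does three main things:

  • Create a new Selector.
  • Cancel all keys registered with the old Selector.
  • Re-register the still-valid channels, with their interest ops and attachments, on the new Selector, which then becomes the active one.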
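The migration itself needs nothing beyond java.nio, so it can be tried on a single unbound ServerSocketChannel. RebuildSketch is a hypothetical name. The sketch omits the retry on ConcurrentModificationException and the update of the channel's cached SelectionKey that the method above performs.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

final class RebuildSketch {
    /** Moves every valid registration from oldSelector to a fresh selector and returns it. */
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        for (SelectionKey key : oldSelector.keys()) {
            if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                continue; // already cancelled, or already registered with the new selector
            }
            int interestOps = key.interestOps();
            Object attachment = key.attachment();
            key.cancel();
            key.channel().register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    /** Registers one channel, rebuilds, and checks that ops and attachment were carried over. */
    static boolean demo() {
        try (ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);
            Selector old = Selector.open();
            channel.register(old, SelectionKey.OP_ACCEPT, "attachment");
            Selector fresh = rebuild(old);
            SelectionKey moved = channel.keyFor(fresh);
            boolean ok = !old.isOpen()
                    && moved != null
                    && moved.interestOps() == SelectionKey.OP_ACCEPT
                    && "attachment".equals(moved.attachment());
            fresh.close();
            return ok;
        } catch (IOException e) {
            return false;
        }
    }
}
```

After the rebuild the channel is reachable only through the new selector, with the same interest set and attachment it had before.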

References:
netty源码分析之揭开reactor线程的面纱

Netty 源码分析-EventLoop
