How the Time Wheel Algorithm Works in Dubbo/Netty

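In Dubbo, many places need to schedule tasks that run only once, in order to make the system more fault tolerant. One example is the timeout mechanism for RPC calls: the consumer has to check whether each RPC call has timed out and, if so, return a timeout result to the application layer. Dubbo's earliest implementation put every pending result (DefaultFuture) into a collection and ran a periodic task that scanned all the futures at a fixed interval, checking them one by one for timeout.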

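This approach is simple to implement, but it performs many pointless traversals. For example, if an RPC call has a 10-second timeout and the timeout-check task runs every 2 seconds, about 4 of the scans will examine that call before it can possibly have timed out.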
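The arithmetic behind that estimate can be sketched as follows. The class and method names are hypothetical and not part of Dubbo, and the sketch assumes the call is registered right after a scan has run.

```java
// Hypothetical illustration of the scanning cost; not Dubbo code.
class ScanCostSketch {

    // Counts the periodic scans that fire strictly before the timeout elapses,
    // i.e. scans that inspect the future although it cannot have expired yet.
    static int wastedScans(long timeoutMs, long scanIntervalMs) {
        int wasted = 0;
        for (long now = scanIntervalMs; now < timeoutMs; now += scanIntervalMs) {
            wasted++;
        }
        return wasted;
    }

    public static void main(String[] args) {
        // 10-second timeout, scanned every 2 seconds
        System.out.println(wastedScans(10_000, 2_000));
    }
}
```

Every such scan also visits all the other pending futures, so the wasted work grows with the number of in-flight calls.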

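To solve problems in scenarios like this, Dubbo borrowed from Netty and introduced the time wheel algorithm to schedule tasks that only need to run once. For the principle behind the algorithm, see this article: https://blog.csdn.net/mindfloating/article/details/8033340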

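The rest of this post analyzes the time wheel implementation in Dubbo/Netty. It consists mainly of the following types: the Timer, TimerTask and Timeout interfaces, and the HashedWheelTimer class together with its inner classes Worker, HashedWheelTimeout and HashedWheelBucket.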

The Timer interface

/**
 * Schedules {@link TimerTask}s for one-time future execution in a background
 * thread.
 */
public interface Timer {

    /**
     * Schedules the specified {@link TimerTask} for one-time execution after
     * the specified delay.
     *
     * @return a handle which is associated with the specified task
     * @throws IllegalStateException      if this timer has been {@linkplain #stop() stopped} already
     * @throws RejectedExecutionException if the pending timeouts are too many and creating new timeout
     *                                    can cause instability in the system.
     */
    Timeout newTimeout(TimerTask task, long delay, TimeUnit unit);

    /**
     * Releases all resources acquired by this {@link Timer} and cancels all
     * tasks which were scheduled but not executed yet.
     *
     * @return the handles associated with the tasks which were canceled by
     * this method
     */
    Set<Timeout> stop();

    /**
     * the timer is stop
     *
     * @return true for stop
     */
    boolean isStop();
}

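This interface is the core of the scheduler. As the comments say, it schedules one-time tasks for execution in a background thread. isStop tells whether the timer has stopped, and stop stops the timer. newTimeout hands a task to the timer: the first parameter, of type TimerTask, is the task to run; the second, a long, is the delay before the task runs, relative to now; the third is the time unit of that delay. Next, look at its parameter type, TimerTask.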

The TimerTask interface

/**
 * A task which is executed after the delay specified with
 * {@link Timer#newTimeout(TimerTask, long, TimeUnit)}.
 */
public interface TimerTask {

    /**
     * Executed after the delay specified with
     * {@link Timer#newTimeout(TimerTask, long, TimeUnit)}.
     *
     * @param timeout a handle which is associated with this task
     */
    void run(Timeout timeout) throws Exception;
}

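This interface represents a task to be run by the timer. It has a single method, run, whose parameter is of type Timeout. Notice that newTimeout on the Timer interface returns a Timeout as well, so a reasonable guess is that the Timeout passed in here is the very object returned by newTimeout. (To be verified later.)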

The Timeout interface

/**
 * A handle associated with a {@link TimerTask} that is returned by a
 * {@link Timer}.
 */
public interface Timeout {

    /**
     * Returns the {@link Timer} that created this handle.
     */
    Timer timer();

    /**
     * Returns the {@link TimerTask} which is associated with this handle.
     */
    TimerTask task();

    /**
     * Returns {@code true} if and only if the {@link TimerTask} associated
     * with this handle has been expired.
     */
    boolean isExpired();

    /**
     * Returns {@code true} if and only if the {@link TimerTask} associated
     * with this handle has been cancelled.
     */
    boolean isCancelled();

    /**
     * Attempts to cancel the {@link TimerTask} associated with this handle.
     * If the task has been executed or cancelled already, it will return with
     * no side effect.
     *
     * @return True if the cancellation completed successfully, otherwise false
     */
    boolean cancel();
}

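A Timeout is the handle for one scheduled task. timer returns the Timer that created this Timeout, task returns the task this Timeout is associated with, isExpired tells whether the task has passed its scheduled time, isCancelled tells whether the task has been cancelled, and cancel cancels the task.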

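Together, these interfaces logically form a task scheduling system. Their parameters and return values show a neat design: one type creates an object of another type, and the created object can in turn return its creator through a method. The same pattern appears frequently in the Spring framework. It is an effective technique when designing a complex system and worth learning from.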

Now for the main subject of this post, the time wheel implementation HashedWheelTimer. First, the class header:

/**
 * A {@link Timer} optimized for approximated I/O timeout scheduling.
 *
 * <h3>Tick Duration</h3>
 * <p>
 * As described with 'approximated', this timer does not execute the scheduled
 * {@link TimerTask} on time.  {@link HashedWheelTimer}, on every tick, will
 * check if there are any {@link TimerTask}s behind the schedule and execute
 * them.
 * <p>
 * You can increase or decrease the accuracy of the execution timing by
 * specifying smaller or larger tick duration in the constructor.  In most
 * network applications, I/O timeout does not need to be accurate.  Therefore,
 * the default tick duration is 100 milliseconds and you will not need to try
 * different configurations in most cases.
 *
 * <h3>Ticks per Wheel (Wheel Size)</h3>
 * <p>
 * {@link HashedWheelTimer} maintains a data structure called 'wheel'.
 * To put simply, a wheel is a hash table of {@link TimerTask}s whose hash
 * function is 'dead line of the task'.  The default number of ticks per wheel
 * (i.e. the size of the wheel) is 512.  You could specify a larger value
 * if you are going to schedule a lot of timeouts.
 *
 * <h3>Do not create many instances.</h3>
 * <p>
 * {@link HashedWheelTimer} creates a new thread whenever it is instantiated and
 * started.  Therefore, you should make sure to create only one instance and
 * share it across your application.  One of the common mistakes, that makes
 * your application unresponsive, is to create a new instance for every connection.
 *
 * <h3>Implementation Details</h3>
 * <p>
 * {@link HashedWheelTimer} is based on
 * <a href="http://cseweb.ucsd.edu/users/varghese/">George Varghese</a> and
 * Tony Lauck's paper,
 * <a href="http://cseweb.ucsd.edu/users/varghese/PAPERS/twheel.ps.Z">'Hashed
 * and Hierarchical Timing Wheels: data structures to efficiently implement a
 * timer facility'</a>.  More comprehensive slides are located
 * <a href="http://www.cse.wustl.edu/~cdgill/courses/cs6874/TimingWheels.ppt">here</a>.
 */
public class HashedWheelTimer implements Timer {

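According to the comments, this class does not provide exact timing. You cannot ask it to run a task at a precise instant; instead, on every tick (that is, every time slot of the wheel) it checks whether any TimerTask is behind schedule and, if so, runs it. (This should be easy to follow for anyone familiar with the time wheel algorithm.) The accuracy can be raised or lowered by choosing a smaller or larger tick duration (the length of one time slot). For example, a 1-second slot gives a granularity five times finer than a 5-second slot. The comments go on to say that in most network applications I/O timeouts do not need to be accurate: if I ask for a 5-second timeout, the framework does not have to report it at exactly the 5-second mark, and slightly later is acceptable. The default tick duration is therefore 100 milliseconds, and in most cases there is no need to change it.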

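The class maintains a data structure called the "wheel", which is the time wheel. Put simply, a wheel is a hash table whose hash function is the task's deadline: the hash function places each task into the time slot it belongs to, so that as time advances and the wheel reaches a slot, the tasks in that slot are the ones that may be due. This avoids checking every task on every tick. The default number of slots is 512; if a very large number of tasks will be scheduled, a larger value can be specified.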
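The "hash table keyed by deadline" idea can be sketched as follows. This is a minimal illustration with hypothetical names, not the Dubbo/Netty code; it assumes deadlines are given in nanoseconds relative to the timer's start and that the wheel size is a power of two.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the wheel as a hash table; not Dubbo/Netty code.
class WheelAsHashTableSketch {

    // Distributes deadlines over the slots of the wheel.
    // wheelSize must be a power of two so that "& (wheelSize - 1)" acts as "% wheelSize".
    static List<List<Long>> distribute(long[] deadlinesNanos, long tickDurationNanos, int wheelSize) {
        List<List<Long>> wheel = new ArrayList<>();
        for (int i = 0; i < wheelSize; i++) {
            wheel.add(new ArrayList<>());
        }
        for (long deadline : deadlinesNanos) {
            long tick = deadline / tickDurationNanos;      // which tick the deadline falls on
            int slot = (int) (tick & (wheelSize - 1));     // the "hash" of the deadline
            wheel.get(slot).add(deadline);
        }
        return wheel;
    }

    public static void main(String[] args) {
        long[] deadlines = {150, 250, 950, 1050};
        System.out.println(distribute(deadlines, 100, 8));
    }
}
```

With a tick of 100 and 8 slots, the deadlines 150 and 950 share slot 1 even though they are one full revolution apart. A wheel therefore also needs a way to tell such entries apart; in HashedWheelTimer that is the job of the remainingRounds field discussed later in the post.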

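Only one instance of this class should be created per application, because each instance creates a new thread when it is instantiated and started. A common mistake is to create a new instance for every connection (the comment comes from Netty, where the class is used mainly for handling connections; here "connection" can be read as "task"), which makes the application unresponsive because too many threads are created.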

The remainder of the comment cites the paper the implementation is based on; we skip it and go straight to the code. First, the fields.

    /**
     * may be in spi?
     */
    public static final String NAME = "hased";

    private static final Logger logger = LoggerFactory.getLogger(HashedWheelTimer.class);

    // Instance counter: records how many objects of this class have been created
    private static final AtomicInteger INSTANCE_COUNTER = new AtomicInteger();
    // Guards the warning that is reported when the instance limit is exceeded
    private static final AtomicBoolean WARNED_TOO_MANY_INSTANCES = new AtomicBoolean();
    // Upper limit on the number of instances
    private static final int INSTANCE_COUNT_LIMIT = 64;
    // Updater used to change the workerState field atomically
    private static final AtomicIntegerFieldUpdater<HashedWheelTimer> WORKER_STATE_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(HashedWheelTimer.class, "workerState");
    // The runnable that drives the time wheel
    private final Worker worker = new Worker();
    // The thread bound to the worker
    private final Thread workerThread;

    // Worker state: initialized
    private static final int WORKER_STATE_INIT = 0;
    // Worker state: started
    private static final int WORKER_STATE_STARTED = 1;
    // Worker state: shut down
    private static final int WORKER_STATE_SHUTDOWN = 2;

    /**
     * 0 - init, 1 - started, 2 - shut down
     */
    @SuppressWarnings({"unused", "FieldMayBeFinal"})
    private volatile int workerState;

    // Duration of one time slot (tick), in nanoseconds
    private final long tickDuration;
    // The array of time slots
    private final HashedWheelBucket[] wheel;
    // Mask used to compute which slot a task belongs to
    private final int mask;
    // Latch on which callers wait until the worker has initialized startTime
    private final CountDownLatch startTimeInitialized = new CountDownLatch(1);
    // Queue of newly scheduled timeouts, not yet placed into a slot
    private final Queue<HashedWheelTimeout> timeouts = new LinkedBlockingQueue<>();
    // Queue of cancelled timeouts
    private final Queue<HashedWheelTimeout> cancelledTimeouts = new LinkedBlockingQueue<>();
    // Number of pending timeouts
    private final AtomicLong pendingTimeouts = new AtomicLong(0);
    // Maximum number of pending timeouts
    private final long maxPendingTimeouts;
    // Start time of the time wheel
    private volatile long startTime;

The purpose of some fields may not be clear yet; it will become clear in what follows. First, the constructor of this class.

    /**
     * Creates a new timer.
     *
     * @param threadFactory      a {@link ThreadFactory} that creates a
     *                           background {@link Thread} which is dedicated to
     *                           {@link TimerTask} execution.
     * @param tickDuration       the duration between tick
     * @param unit               the time unit of the {@code tickDuration}
     * @param ticksPerWheel      the size of the wheel
     * @param maxPendingTimeouts The maximum number of pending timeouts after which call to
     *                           {@code newTimeout} will result in
     *                           {@link java.util.concurrent.RejectedExecutionException}
     *                           being thrown. No maximum pending timeouts limit is assumed if
     *                           this value is 0 or negative.
     * @throws NullPointerException     if either of {@code threadFactory} and {@code unit} is {@code null}
     * @throws IllegalArgumentException if either of {@code tickDuration} and {@code ticksPerWheel} is &lt;= 0
     */
    public HashedWheelTimer(
            ThreadFactory threadFactory,
            long tickDuration, TimeUnit unit, int ticksPerWheel,
            long maxPendingTimeouts) {

        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }
        if (tickDuration <= 0) {
            throw new IllegalArgumentException("tickDuration must be greater than 0: " + tickDuration);
        }
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException("ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }

        // Normalize ticksPerWheel to power of two and initialize the wheel.
        wheel = createWheel(ticksPerWheel);
        mask = wheel.length - 1;

        // Convert tickDuration to nanos.
        this.tickDuration = unit.toNanos(tickDuration);

        // Prevent overflow.
        if (this.tickDuration >= Long.MAX_VALUE / wheel.length) {
            throw new IllegalArgumentException(String.format(
                    "tickDuration: %d (expected: 0 < tickDuration in nanos < %d",
                    tickDuration, Long.MAX_VALUE / wheel.length));
        }
        workerThread = threadFactory.newThread(worker);

        this.maxPendingTimeouts = maxPendingTimeouts;

        if (INSTANCE_COUNTER.incrementAndGet() > INSTANCE_COUNT_LIMIT &&
                WARNED_TOO_MANY_INSTANCES.compareAndSet(false, true)) {
            reportTooManyInstances();
        }
    }

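The English parameter comments are not repeated here. The main logic:
1. Validate the arguments.
2. Create the time wheel (the key step), that is, initialize the wheel array mentioned above; this array is the backing array of the hash table.
3. Initialize mask to wheel.length - 1. This value makes the slot calculation cheap, as explained later.
4. Convert the tick duration to nanoseconds and check for overflow: if the tick duration is greater than or equal to Long.MAX_VALUE divided by the number of slots, an exception is thrown.
5. Create the worker thread from the threadFactory and set the maximum number of pending timeouts.
6. Check whether the number of instances exceeds the limit and, if so, report it.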
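The overflow check in step 4 can be isolated as in the sketch below. The class and method names are hypothetical; only the comparison itself is taken from the constructor. The assumption made here is that the check exists so that tickDuration multiplied by the number of slots (one full revolution of the wheel) still fits in a long.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper that mirrors the constructor's overflow guard; not Dubbo/Netty code.
class TickDurationCheckSketch {

    // Converts the tick duration to nanoseconds and rejects values for which
    // tickDurationNanos * wheelLength could not be represented as a long.
    static long checkedTickNanos(long tickDuration, TimeUnit unit, int wheelLength) {
        long nanos = unit.toNanos(tickDuration);
        if (nanos >= Long.MAX_VALUE / wheelLength) {
            throw new IllegalArgumentException("tickDuration too large: " + tickDuration);
        }
        return nanos;
    }

    public static void main(String[] args) {
        // The default configuration described above: 100 ms ticks, 512 slots.
        System.out.println(checkedTickNanos(100, TimeUnit.MILLISECONDS, 512));
    }
}
```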

Now let's expand the createWheel method used above.

    private static HashedWheelBucket[] createWheel(int ticksPerWheel) {
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException(
                    "ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }
        if (ticksPerWheel > 1073741824) {
            throw new IllegalArgumentException(
                    "ticksPerWheel may not be greater than 2^30: " + ticksPerWheel);
        }

        ticksPerWheel = normalizeTicksPerWheel(ticksPerWheel);
        HashedWheelBucket[] wheel = new HashedWheelBucket[ticksPerWheel];
        for (int i = 0; i < wheel.length; i++) {
            wheel[i] = new HashedWheelBucket();
        }
        return wheel;
    }

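Ignoring the basic argument checks, the main flow is:
1. Normalize the number of slots.
2. Create the slot array.
3. Initialize each element of the slot array.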

Normalizing the number of slots

    private static int normalizeTicksPerWheel(int ticksPerWheel) {
        int normalizedTicksPerWheel = ticksPerWheel - 1;
        normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 1;
        normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 2;
        normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 4;
        normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 8;
        normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 16;
        return normalizedTicksPerWheel + 1;
    }

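If the input is 37, the result is 64. The method rounds its argument up to the smallest power of two that is greater than or equal to it. A power-of-two length is what allows mask = wheel.length - 1 to stand in for a modulo operation.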
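The bit trick can be checked in isolation. The sketch below repeats the same shifts in a standalone class (the class name is hypothetical) and annotates the trace for the input 37.

```java
// Standalone copy of the rounding logic for experimentation; the class name is hypothetical.
class PowerOfTwoSketch {

    // Rounds a positive int up to the nearest power of two (valid for inputs up to 2^30).
    static int normalize(int ticksPerWheel) {
        int n = ticksPerWheel - 1; // 37 -> 36 = 0b100100
        n |= n >>> 1;              // 0b110110: copy the highest set bit one position down
        n |= n >>> 2;              // 0b111111: every bit below the highest set bit is now 1
        n |= n >>> 4;              // no further change for small inputs
        n |= n >>> 8;
        n |= n >>> 16;
        return n + 1;              // 0b111111 + 1 = 0b1000000 = 64
    }

    public static void main(String[] args) {
        for (int input : new int[] {1, 37, 512, 513}) {
            System.out.println(input + " -> " + normalize(input));
        }
    }
}
```

Subtracting 1 first is what keeps an exact power of two such as 512 unchanged instead of doubling it.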

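HashedWheelBucket is the time slot (also called a bucket; the two terms mean the same thing). It is constructed with the default constructor. Its implementation is analyzed later.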

The newTimeout method

    @Override
    public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }

        long pendingTimeoutsCount = pendingTimeouts.incrementAndGet();

        if (maxPendingTimeouts > 0 && pendingTimeoutsCount > maxPendingTimeouts) {
            pendingTimeouts.decrementAndGet();
            throw new RejectedExecutionException("Number of pending timeouts ("
                    + pendingTimeoutsCount + ") is greater than or equal to maximum allowed pending "
                    + "timeouts (" + maxPendingTimeouts + ")");
        }

        start();

        // Add the timeout to the timeout queue which will be processed on the next tick.
        // During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
        long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;

        // Guard against overflow.
        if (delay > 0 && deadline < 0) {
            deadline = Long.MAX_VALUE;
        }
        HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
        timeouts.add(timeout);
        return timeout;
    }

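This method adds a task to the timer. Ignoring the basic argument checks, the main flow is:
1. Increment the pending timeout count; if it exceeds the limit, decrement it again and throw an exception.
2. Start the time wheel (it is only started once; the start method checks the state, as described later).
3. Compute the task's deadline (the time at which it should run) relative to startTime, guarding against overflow.
4. Construct a HashedWheelTimeout and put it into the queue of pending timeouts.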
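The deadline calculation in step 3 can be sketched on its own. The helper below is hypothetical; it takes the current time and start time as parameters instead of reading System.nanoTime(), so that the overflow case can be exercised directly.

```java
// Hypothetical helper mirroring the deadline computation in newTimeout; not Dubbo/Netty code.
class DeadlineSketch {

    // Returns the deadline in nanoseconds relative to the timer's start time.
    // If a positive delay makes the sum wrap around to a negative value,
    // the deadline is clamped to Long.MAX_VALUE.
    static long relativeDeadline(long nowNanos, long startTimeNanos, long delayNanos) {
        long deadline = nowNanos + delayNanos - startTimeNanos;
        if (delayNanos > 0 && deadline < 0) {
            deadline = Long.MAX_VALUE;
        }
        return deadline;
    }

    public static void main(String[] args) {
        System.out.println(relativeDeadline(1_000, 400, 500));
        System.out.println(relativeDeadline(1_000, 400, Long.MAX_VALUE));
    }
}
```

Storing the deadline relative to startTime keeps it directly comparable with tickDuration * tick, which the worker uses as its clock.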

The start method

    /**
     * Starts the background thread explicitly.  The background thread will
     * start automatically on demand even if you did not call this method.
     *
     * @throws IllegalStateException if this timer has been
     *                               {@linkplain #stop() stopped} already
     */
    public void start() {
        switch (WORKER_STATE_UPDATER.get(this)) {
            case WORKER_STATE_INIT:
                if (WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_INIT, WORKER_STATE_STARTED)) {
                    workerThread.start();
                }
                break;
            case WORKER_STATE_STARTED:
                break;
            case WORKER_STATE_SHUTDOWN:
                throw new IllegalStateException("cannot be started once stopped");
            default:
                throw new Error("Invalid WorkerState");
        }

        // Wait until the startTime is initialized by the worker.
        while (startTime == 0) {
            try {
                startTimeInitialized.await();
            } catch (InterruptedException ignore) {
                // Ignore - it will be ready very soon.
            }
        }
    }

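1. Read the worker state. If it is INIT, switch it to STARTED and start workerThread; the other states are handled accordingly.
2. While startTime == 0, wait in the calling thread until workerThread has initialized startTime.

This method is also simple: it starts the worker thread behind the timer and uses a CountDownLatch to wait until startTime has been initialized to a non-zero value. Why wait? Because newTimeout uses startTime right after calling start; if startTime had not been initialized yet, the deadline calculation would be wrong.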
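The start-once pattern (a compare-and-set on the state plus a latch for the handshake) can be sketched in a few lines. The class below is a simplified, hypothetical stand-in: it uses an AtomicInteger instead of an AtomicIntegerFieldUpdater, has no shutdown state, and its worker thread does nothing except initialize startTime.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical stand-in for the start() handshake; not Dubbo/Netty code.
class StartOnceSketch {
    private static final int INIT = 0;
    private static final int STARTED = 1;

    private final AtomicInteger state = new AtomicInteger(INIT);
    private final CountDownLatch startTimeInitialized = new CountDownLatch(1);
    final AtomicInteger workerRuns = new AtomicInteger();
    private volatile long startTime;
    private final Thread worker;

    StartOnceSketch() {
        worker = new Thread(() -> {
            workerRuns.incrementAndGet();
            long now = System.nanoTime();
            startTime = (now == 0) ? 1 : now; // 0 is reserved for "not initialized"
            startTimeInitialized.countDown(); // release the callers waiting in start()
        });
    }

    void start() {
        // Only the caller that wins the CAS starts the thread.
        if (state.compareAndSet(INIT, STARTED)) {
            worker.start();
        }
        // Every caller waits until startTime is usable.
        while (startTime == 0) {
            try {
                startTimeInitialized.await();
            } catch (InterruptedException ignored) {
                // keep waiting; the worker initializes startTime almost immediately
            }
        }
    }

    long startTime() {
        return startTime;
    }

    public static void main(String[] args) {
        StartOnceSketch timer = new StartOnceSketch();
        timer.start();
        timer.start(); // second call is a no-op apart from the (already satisfied) wait
        System.out.println("worker started " + timer.workerRuns.get() + " time(s)");
    }
}
```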

At this point the main flow of adding a task through HashedWheelTimer is complete. Next, let's see how the time wheel runs internally, starting with the Worker class.

Worker

fields

        private final Set<Timeout> unprocessedTimeouts = new HashSet<Timeout>();

        private long tick;

The first field is the set of timeouts that were never processed; the second is the current tick (that is, how far the wheel has advanced).

The run method

        @Override
        public void run() {
            // Initialize the startTime.
            startTime = System.nanoTime();
            if (startTime == 0) {
                // We use 0 as an indicator for the uninitialized value here, so make sure it's not 0 when initialized.
                startTime = 1;
            }

            // Notify the other threads waiting for the initialization at start().
            startTimeInitialized.countDown();

            do {
                final long deadline = waitForNextTick();
                if (deadline > 0) {
                    int idx = (int) (tick & mask);
                    processCancelledTasks();
                    HashedWheelBucket bucket =
                            wheel[idx];
                    transferTimeoutsToBuckets();
                    bucket.expireTimeouts(deadline);
                    tick++;
                }
            } while (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_STARTED);

            // Fill the unprocessedTimeouts so we can return them from stop() method.
            for (HashedWheelBucket bucket : wheel) {
                bucket.clearTimeouts(unprocessedTimeouts);
            }
            for (; ; ) {
                HashedWheelTimeout timeout = timeouts.poll();
                if (timeout == null) {
                    break;
                }
                if (!timeout.isCancelled()) {
                    unprocessedTimeouts.add(timeout);
                }
            }
            processCancelledTasks();
        }

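Main logic:
1. Initialize startTime; if the value is 0, use 1 instead. Why check for 0? Java offers two ways to read the time. System.currentTimeMillis() returns the number of milliseconds since midnight, January 1, 1970 UTC. System.nanoTime(), used here, returns the time elapsed since some fixed origin inside the JVM, in nanoseconds; the origin differs from one JVM to another, so the value is only useful for measuring elapsed time. If nanoTime happened to be called at exactly that origin, it would return 0, and since 0 is used as the "not initialized" marker, the code guards against this extremely unlikely case.
2. Now that startTime is initialized, call startTimeInitialized.countDown() to let the threads waiting in start() continue.
3. Next comes a do-while loop that keeps advancing the tick as long as the timer is in the STARTED state. Each iteration:
1) Waits for the next tick.
2) Once the tick arrives, computes the slot index for the tick (tick & mask is equivalent to tick modulo the length of the slot array, because the length is a power of two).
3) Processes the queue of cancelled timeouts.
4) Gets the current slot and moves the timeouts in the pending queue into the slots they belong to.
5) Runs the tasks that are due in the current slot, then increments tick.

4. Once the timer has been stopped:
1) Collects the unprocessed timeouts from all slots.
2) Drains the pending queue, adding the timeouts that were not cancelled to the unprocessed set.
3) Processes the queue of cancelled timeouts.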

The waitForNextTick method

        /**
         * calculate goal nanoTime from startTime and current tick number,
         * then wait until that goal has been reached.
         *
         * @return Long.MIN_VALUE if received a shutdown request,
         * current time otherwise (with Long.MIN_VALUE changed by +1)
         */
        private long waitForNextTick() {
            long deadline = tickDuration * (tick + 1);

            for (; ; ) {
                final long currentTime = System.nanoTime() - startTime;
                long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;

                if (sleepTimeMs <= 0) {
                    if (currentTime == Long.MIN_VALUE) {
                        return -Long.MAX_VALUE;
                    } else {
                        return currentTime;
                    }
                }
                if (isWindows()) {
                    sleepTimeMs = sleepTimeMs / 10 * 10;
                }

                try {
                    Thread.sleep(sleepTimeMs);
                } catch (InterruptedException ignored) {
                    if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
                        return Long.MIN_VALUE;
                    }
                }
            }
        }

        Set<Timeout> unprocessedTimeouts() {
            return Collections.unmodifiableSet(unprocessedTimeouts);
        }
    }

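Main flow:
1. Compute the start time of the next tick, tickDuration * (tick + 1), relative to startTime.
2. Loop until that time is reached. Each iteration computes the remaining time in milliseconds; adding 999999 before dividing by 1000000 rounds the nanosecond difference up to whole milliseconds. Because Java integer division truncates toward zero, sleepTimeMs <= 0 is equivalent to deadline - currentTime <= 0, so the method returns the current relative time as soon as the start of the next tick has been reached.
3. Otherwise, sleep for the computed time (on Windows it is first rounded down to a multiple of 10 ms). If the sleep is interrupted and the timer has been shut down, return Long.MIN_VALUE.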
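The rounding in step 2 is easy to verify with a few values. The helpers below are hypothetical extractions of the two arithmetic expressions in waitForNextTick; they take plain numbers instead of reading the clock.

```java
// Hypothetical extraction of the sleep-time arithmetic; not Dubbo/Netty code.
class SleepTimeSketch {

    // Remaining time until the deadline, rounded up to whole milliseconds.
    // Both arguments are nanoseconds relative to the timer's start time.
    static long sleepTimeMs(long deadlineNanos, long currentTimeNanos) {
        return (deadlineNanos - currentTimeNanos + 999_999) / 1_000_000;
    }

    // The adjustment applied on Windows: round down to a multiple of 10 ms.
    static long roundDownToTenMs(long sleepTimeMs) {
        return sleepTimeMs / 10 * 10;
    }

    public static void main(String[] args) {
        long[] remainingNanos = {-2_000_000, -5, 0, 1, 1_000_000, 1_000_001};
        for (long remaining : remainingNanos) {
            System.out.println(remaining + " ns -> " + sleepTimeMs(remaining, 0) + " ms");
        }
    }
}
```

A single remaining nanosecond already yields a 1 ms sleep, while zero or negative remaining time yields a value of 0 or below, which is the condition under which waitForNextTick returns.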

The processCancelledTasks method

        private void processCancelledTasks() {
            for (; ; ) {
                HashedWheelTimeout timeout = cancelledTimeouts.poll();
                if (timeout == null) {
                    // all processed
                    break;
                }
                try {
                    timeout.remove();
                } catch (Throwable t) {
                    if (logger.isWarnEnabled()) {
                        logger.warn("An exception was thrown while process a cancellation task", t);
                    }
                }
            }
        }

The for loop:
1. Polls the next cancelled timeout from the cancelled queue.
2. Calls remove on the HashedWheelTimeout to remove it; this method is covered later.

The transferTimeoutsToBuckets method

        private void transferTimeoutsToBuckets() {
            // transfer only max. 100000 timeouts per tick to prevent a thread to stale the workerThread when it just
            // adds new timeouts in a loop.
            for (int i = 0; i < 100000; i++) {
                HashedWheelTimeout timeout = timeouts.poll();
                if (timeout == null) {
                    // all processed
                    break;
                }
                if (timeout.state() == HashedWheelTimeout.ST_CANCELLED) {
                    // Was cancelled in the meantime.
                    continue;
                }

                long calculated = timeout.deadline / tickDuration;
                timeout.remainingRounds = (calculated - tick) / wheel.length;

                // Ensure we don't schedule for past.
                final long ticks = Math.max(calculated, tick);
                int stopIndex = (int) (ticks & mask);

                HashedWheelBucket bucket = wheel[stopIndex];
                bucket.addTimeout(timeout);
            }
        }

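The for loop runs at most 100000 times (the bound keeps a very large pending queue from making a single transfer take too long):
1. Poll the next timeout from the pending queue and skip it if it has already been cancelled.
2. Compute the tick the timeout falls on (deadline / tickDuration).
3. Set the timeout's remainingRounds. Because the wheel is circular, a task may only become due after the wheel has gone around several more times.
4. Take the larger of the computed tick and the current tick, so that nothing is scheduled in the past, and reduce it to a slot index with the mask.
5. Add the timeout to that slot.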
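Steps 2 to 4 can be tried out with concrete numbers. The helpers below are hypothetical; they reproduce the three expressions from transferTimeoutsToBuckets with the current tick passed in as a parameter, assuming a power-of-two wheel length.

```java
// Hypothetical extraction of the placement arithmetic; not Dubbo/Netty code.
class BucketPlacementSketch {

    // How many full revolutions must pass before the timeout is due.
    static long remainingRounds(long deadlineNanos, long tickDurationNanos,
                                long currentTick, int wheelLength) {
        long calculated = deadlineNanos / tickDurationNanos;
        return (calculated - currentTick) / wheelLength;
    }

    // The slot the timeout is placed in; wheelLength must be a power of two.
    static int stopIndex(long deadlineNanos, long tickDurationNanos,
                         long currentTick, int wheelLength) {
        long calculated = deadlineNanos / tickDurationNanos;
        long ticks = Math.max(calculated, currentTick); // never schedule into the past
        return (int) (ticks & (wheelLength - 1));
    }

    public static void main(String[] args) {
        long tickNanos = 100_000_000L;      // 100 ms
        long sixtySeconds = 60_000_000_000L;
        System.out.println("rounds: " + remainingRounds(sixtySeconds, tickNanos, 0, 512));
        System.out.println("slot:   " + stopIndex(sixtySeconds, tickNanos, 0, 512));
    }
}
```

With 100 ms ticks and 512 slots, one revolution covers 51.2 seconds, so a 60-second deadline scheduled at tick 0 lands on tick 600: one extra round, slot 88. A deadline that already lies behind the current tick gets zero remaining rounds and is placed in the current slot.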

That covers the main flow of how the time wheel runs. Some details are probably still unclear, so let's look at the HashedWheelTimeout and HashedWheelBucket classes, starting with HashedWheelTimeout.

The HashedWheelTimeout class

    private static final class HashedWheelTimeout implements Timeout {

        // Initial state
        private static final int ST_INIT = 0;
        // Cancelled state
        private static final int ST_CANCELLED = 1;
        // Expired state
        private static final int ST_EXPIRED = 2;
        // Updater used to change the state field atomically
        private static final AtomicIntegerFieldUpdater<HashedWheelTimeout> STATE_UPDATER =
                AtomicIntegerFieldUpdater.newUpdater(HashedWheelTimeout.class, "state");

        // The timer that created this timeout
        private final HashedWheelTimer timer;
        // The task to run
        private final TimerTask task;
        // Deadline, in nanoseconds relative to the timer's startTime
        private final long deadline;

        @SuppressWarnings({"unused", "FieldMayBeFinal", "RedundantFieldInitialization"})
        private volatile int state = ST_INIT;

        /**
         * RemainingRounds will be calculated and set by Worker.transferTimeoutsToBuckets() before the
         * HashedWheelTimeout will be added to the correct HashedWheelBucket.
         */
        long remainingRounds;

        /**
         * This will be used to chain timeouts in HashedWheelTimerBucket via a double-linked-list.
         * As only the workerThread will act on it there is no need for synchronization / volatile.
         */
        HashedWheelTimeout next;
        HashedWheelTimeout prev;

        /**
         * The bucket to which the timeout was added
         */
        HashedWheelBucket bucket;

        HashedWheelTimeout(HashedWheelTimer timer, TimerTask task, long deadline) {
            this.timer = timer;
            this.task = task;
            this.deadline = deadline;
        }

        @Override
        public Timer timer() {
            return timer;
        }

        @Override
        public TimerTask task() {
            return task;
        }

        @Override
        public boolean cancel() {
            // only update the state it will be removed from HashedWheelBucket on next tick.
            if (!compareAndSetState(ST_INIT, ST_CANCELLED)) {
                return false;
            }
            // If a task should be canceled we put this to another queue which will be processed on each tick.
            // So this means that we will have a GC latency of max. 1 tick duration which is good enough. This way
            // we can make again use of our MpscLinkedQueue and so minimize the locking / overhead as much as possible.
            timer.cancelledTimeouts.add(this);
            return true;
        }

        void remove() {
            HashedWheelBucket bucket = this.bucket;
            if (bucket != null) {
                bucket.remove(this);
            } else {
                timer.pendingTimeouts.decrementAndGet();
            }
        }

        public boolean compareAndSetState(int expected, int state) {
            return STATE_UPDATER.compareAndSet(this, expected, state);
        }

        public int state() {
            return state;
        }

        @Override
        public boolean isCancelled() {
            return state() == ST_CANCELLED;
        }

        @Override
        public boolean isExpired() {
            return state() == ST_EXPIRED;
        }

        public void expire() {
            if (!compareAndSetState(ST_INIT, ST_EXPIRED)) {
                return;
            }

            try {
                task.run(this);
            } catch (Throwable t) {
                if (logger.isWarnEnabled()) {
                    logger.warn("An exception was thrown by " + TimerTask.class.getSimpleName() + '.', t);
                }
            }
        }

        @Override
        public String toString() {
            final long currentTime = System.nanoTime();
            long remaining = deadline - currentTime + timer.startTime;
            String simpleClassName = ClassUtils.simpleClassName(this.getClass());

            StringBuilder buf = new StringBuilder(192)
                    .append(simpleClassName)
                    .append('(')
                    .append("deadline: ");
            if (remaining > 0) {
                buf.append(remaining)
                        .append(" ns later");
            } else if (remaining < 0) {
                buf.append(-remaining)
                        .append(" ns ago");
            } else {
                buf.append("now");
            }

            if (isCancelled()) {
                buf.append(", cancelled");
            }

            return buf.append(", task: ")
                    .append(task())
                    .append(')')
                    .toString();
        }
    }

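As you can see, the logic of this class is simple: mostly setting or reading values, or delegating to HashedWheelBucket, so it is not covered in detail. Note that newTimeout returns a HashedWheelTimeout and expire calls task.run(this), which confirms the earlier guess that the Timeout passed to TimerTask.run is the object returned by newTimeout. Also note the next and prev fields. One HashedWheelBucket holds many HashedWheelTimeouts; next and prev form a doubly linked list, so all the HashedWheelTimeouts that belong to a bucket hang off it as a doubly linked list.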
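One detail worth isolating is how cancel and expire exclude each other: both try to move the state away from ST_INIT with a compare-and-set, and only one of them can succeed. The sketch below is a simplified, hypothetical version that uses an AtomicInteger and a Runnable, and whose expire returns a boolean (the original returns void) so the outcome is observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical model of the timeout state machine; not Dubbo/Netty code.
class TimeoutStateSketch {
    static final int ST_INIT = 0;
    static final int ST_CANCELLED = 1;
    static final int ST_EXPIRED = 2;

    private final AtomicInteger state = new AtomicInteger(ST_INIT);
    private final Runnable task;

    TimeoutStateSketch(Runnable task) {
        this.task = task;
    }

    // Succeeds only if the timeout has neither been cancelled nor expired yet.
    boolean cancel() {
        return state.compareAndSet(ST_INIT, ST_CANCELLED);
    }

    // Runs the task only if the timeout is still in its initial state.
    boolean expire() {
        if (!state.compareAndSet(ST_INIT, ST_EXPIRED)) {
            return false;
        }
        task.run();
        return true;
    }

    public static void main(String[] args) {
        TimeoutStateSketch timeout = new TimeoutStateSketch(() -> System.out.println("task ran"));
        System.out.println("cancel: " + timeout.cancel());
        System.out.println("expire: " + timeout.expire()); // the task does not run after a cancel
    }
}
```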

HashedWheelBucket

fields:

        /**
         * Used for the linked-list datastructure
         */
        private HashedWheelTimeout head;
        private HashedWheelTimeout tail;

As mentioned above, these two fields are the head and tail pointers of the doubly linked list of HashedWheelTimeouts.

The addTimeout and remove methods

        /**
         * Add {@link HashedWheelTimeout} to this bucket.
         */
        void addTimeout(HashedWheelTimeout timeout) {
            assert timeout.bucket == null;
            timeout.bucket = this;
            if (head == null) {
                head = tail = timeout;
            } else {
                tail.next = timeout;
                timeout.prev = tail;
                tail = timeout;
            }
        }

        public HashedWheelTimeout remove(HashedWheelTimeout timeout) {
            HashedWheelTimeout next = timeout.next;
            // remove timeout that was either processed or cancelled by updating the linked-list
            if (timeout.prev != null) {
                timeout.prev.next = next;
            }
            if (timeout.next != null) {
                timeout.next.prev = timeout.prev;
            }

            if (timeout == head) {
                // if timeout is also the tail we need to adjust the entry too
                if (timeout == tail) {
                    tail = null;
                    head = null;
                } else {
                    head = next;
                }
            } else if (timeout == tail) {
                // if the timeout is the tail modify the tail to be the prev node.
                tail = timeout.prev;
            }
            // null out prev, next and bucket to allow for GC.
            timeout.prev = null;
            timeout.next = null;
            timeout.bucket = null;
            timeout.timer.pendingTimeouts.decrementAndGet();
            return next;
        }

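Nothing remarkable here: these are the insert and delete operations of a doubly linked list. remove also decrements the timer's pendingTimeouts counter and returns the next node so that the caller can continue iterating.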

The expireTimeouts method

        /**
         * Expire all {@link HashedWheelTimeout}s for the given {@code deadline}.
         */
        void expireTimeouts(long deadline) {
            HashedWheelTimeout timeout = head;

            // process all timeouts
            while (timeout != null) {
                HashedWheelTimeout next = timeout.next;
                if (timeout.remainingRounds <= 0) {
                    next = remove(timeout);
                    if (timeout.deadline <= deadline) {
                        timeout.expire();
                    } else {
                        // The timeout was placed into a wrong slot. This should never happen.
                        throw new IllegalStateException(String.format(
                                "timeout.deadline (%d) > deadline (%d)", timeout.deadline, deadline));
                    }
                } else if (timeout.isCancelled()) {
                    next = remove(timeout);
                } else {
                    timeout.remainingRounds--;
                }
                timeout = next;
            }
        }

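This method is what actually runs the timeouts attached to a slot. The logic is straightforward: walk the doubly linked list from the head and handle each timeout. If its remaining rounds are less than or equal to 0, remove it from the list and check that its deadline is less than or equal to the deadline passed in; if so, call expire, which runs the task. Otherwise, a cancelled timeout is removed, and any other timeout has its remainingRounds decremented by one.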
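To see the placement and expiry logic work together, here is a small single-threaded simulation. It is a sketch under simplifying assumptions, not the real implementation: time is counted in whole ticks instead of nanoseconds, there is no worker thread, pending queue or cancellation, and buckets are ArrayLists rather than a hand-written linked list. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Single-threaded toy model of the wheel, using virtual ticks; not Dubbo/Netty code.
class WheelSimulationSketch {

    static final class Entry {
        final String name;
        long remainingRounds;

        Entry(String name, long remainingRounds) {
            this.name = name;
            this.remainingRounds = remainingRounds;
        }
    }

    private final List<List<Entry>> wheel = new ArrayList<>();
    private final int mask;
    private long tick;
    final List<String> fired = new ArrayList<>();

    // wheelSize must be a power of two.
    WheelSimulationSketch(int wheelSize) {
        for (int i = 0; i < wheelSize; i++) {
            wheel.add(new ArrayList<>());
        }
        mask = wheelSize - 1;
    }

    // Same arithmetic as transferTimeoutsToBuckets, with the deadline already expressed in ticks.
    void schedule(String name, long deadlineTick) {
        long rounds = (deadlineTick - tick) / wheel.size();
        long ticks = Math.max(deadlineTick, tick);
        wheel.get((int) (ticks & mask)).add(new Entry(name, rounds));
    }

    // Same decision as expireTimeouts: fire entries with no rounds left, age the others.
    void advanceOneTick() {
        Iterator<Entry> it = wheel.get((int) (tick & mask)).iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (entry.remainingRounds <= 0) {
                it.remove();
                fired.add(entry.name + "@" + tick);
            } else {
                entry.remainingRounds--;
            }
        }
        tick++;
    }

    public static void main(String[] args) {
        WheelSimulationSketch sim = new WheelSimulationSketch(4);
        sim.schedule("a", 1);
        sim.schedule("b", 5);
        sim.schedule("c", 9);
        for (int i = 0; i < 10; i++) {
            sim.advanceOneTick();
        }
        System.out.println(sim.fired);
    }
}
```

With a 4-slot wheel, the three tasks all land in slot 1 but carry 0, 1 and 2 remaining rounds, so each visit to slot 1 fires exactly one of them and ages the rest.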

Summary

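This post covered the principle and implementation of the time wheel timer in Dubbo. The Timer, Timeout and TimerTask interfaces define the model of a timer, and the HashedWheelTimer class (with its inner classes) implements it as a time wheel. It exposes a simple API: a single call to newTimeout schedules a task that needs to run only once. With this timer, Dubbo schedules such tasks efficiently in the scenarios that need them.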
