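In Dubbo, many places need to schedule tasks that run exactly once in order to improve fault tolerance. A typical case is the RPC timeout mechanism: the consumer has to determine whether each RPC call has timed out and, if so, return a timeout result to the application layer. Dubbo's earliest implementation put every pending result (DefaultFuture) into a collection and ran a periodic task that scanned all futures at a fixed interval, checking each one for timeout.
This approach is simple to implement, but it performs many pointless scans. For example, if an RPC call's timeout is 10 seconds and the timeout check runs every 2 seconds, roughly 4 of the polls for that call accomplish nothing.
To solve this kind of problem, Dubbo borrowed from Netty and introduced the timing wheel algorithm for scheduling tasks that only need to run once. For the principles behind the algorithm, see this article: https://blog.csdn.net/mindfloating/article/details/8033340
The rest of this post analyzes the timing wheel implementation in Dubbo/Netty. It is built mainly from the following classes and interfaces:
The Timer interface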
/**
* Schedules {@link TimerTask}s for one-time future execution in a background
* thread.
*/
public interface Timer {
/**
* Schedules the specified {@link TimerTask} for one-time execution after
* the specified delay.
*
* @return a handle which is associated with the specified task
* @throws IllegalStateException if this timer has been {@linkplain #stop() stopped} already
* @throws RejectedExecutionException if the pending timeouts are too many and creating new timeout
* can cause instability in the system.
*/
Timeout newTimeout(TimerTask task, long delay, TimeUnit unit);
/**
* Releases all resources acquired by this {@link Timer} and cancels all
* tasks which were scheduled but not executed yet.
*
* @return the handles associated with the tasks which were canceled by
* this method
*/
Set<Timeout> stop();
/**
* the timer is stop
*
* @return true for stop
*/
boolean isStop();
}
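This is the core scheduling interface. As the comment says, it schedules one-time tasks for execution in a background thread. isStop reports whether the timer has been stopped, and stop shuts the timer down. newTimeout hands a task to the timer: the first parameter, of type TimerTask, is the task to run; the second, a long, is the delay before the task runs, relative to now; the third is the time unit of that delay. Next, look at its parameter type, TimerTask.
The TimerTask interface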
/**
* A task which is executed after the delay specified with
* {@link Timer#newTimeout(TimerTask, long, TimeUnit)}.
*/
public interface TimerTask {
/**
* Executed after the delay specified with
* {@link Timer#newTimeout(TimerTask, long, TimeUnit)}.
*
* @param timeout a handle which is associated with this task
*/
void run(Timeout timeout) throws Exception;
}
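This interface represents the task the timer will execute. It has a single method, run, whose parameter is of type Timeout. Note that Timer.newTimeout above also returns a Timeout, so a reasonable guess is that the Timeout passed in here is the very object that newTimeout returned. (We will verify this later.)
The Timeout interface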
/**
* A handle associated with a {@link TimerTask} that is returned by a
* {@link Timer}.
*/
public interface Timeout {
/**
* Returns the {@link Timer} that created this handle.
*/
Timer timer();
/**
* Returns the {@link TimerTask} which is associated with this handle.
*/
TimerTask task();
/**
* Returns {@code true} if and only if the {@link TimerTask} associated
* with this handle has been expired.
*/
boolean isExpired();
/**
* Returns {@code true} if and only if the {@link TimerTask} associated
* with this handle has been cancelled.
*/
boolean isCancelled();
/**
* Attempts to cancel the {@link TimerTask} associated with this handle.
* If the task has been executed or cancelled already, it will return with
* no side effect.
*
* @return True if the cancellation completed successfully, otherwise false
*/
boolean cancel();
}
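Timeout is the handle for one scheduled task. timer returns the Timer that created this Timeout, task returns the task associated with this Timeout, isExpired reports whether the task has expired (its delay has elapsed), isCancelled reports whether the task has been cancelled, and cancel cancels the task.
Together, these interfaces form the logical model of a task scheduler. Their parameters and return values show a neat design: one object creates another, and the created object in turn exposes the object that created it. The same pattern appears often in the Spring framework, and it is an effective technique worth learning when designing a complex system.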
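The back-reference pattern described above can be sketched in a few lines. MiniTimer, MiniTask and MiniTimeout are hypothetical stand-ins, not Dubbo's types, and the timer fires tasks only when fireAll() is called by hand so that the example stays deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the Timer / TimerTask / Timeout relationship.
interface MiniTask {
    void run(MiniTimeout timeout);
}

final class MiniTimeout {
    private final MiniTimer timer;
    private final MiniTask task;

    MiniTimeout(MiniTimer timer, MiniTask task) {
        this.timer = timer;
        this.task = task;
    }

    MiniTimer timer() { return timer; }   // the object that created this handle
    MiniTask task() { return task; }      // the task this handle wraps
}

final class MiniTimer {
    private final List<MiniTimeout> pending = new ArrayList<>();

    // The timer creates the handle and passes itself in as the creator.
    MiniTimeout newTimeout(MiniTask task) {
        MiniTimeout timeout = new MiniTimeout(this, task);
        pending.add(timeout);
        return timeout;
    }

    // Runs every pending task, handing each one its own handle.
    void fireAll() {
        List<MiniTimeout> batch = new ArrayList<>(pending);
        pending.clear();
        for (MiniTimeout timeout : batch) {
            timeout.task().run(timeout);
        }
    }
}
```

In this sketch the handle returned by newTimeout is the same object the task later receives in run, and handle.timer() leads back to the timer that created it.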
Now for the main subject of this post, the timing wheel implementation HashedWheelTimer. First, the class header:
/**
* A {@link Timer} optimized for approximated I/O timeout scheduling.
*
* <h3>Tick Duration</h3>
* <p>
* As described with 'approximated', this timer does not execute the scheduled
* {@link TimerTask} on time. {@link HashedWheelTimer}, on every tick, will
* check if there are any {@link TimerTask}s behind the schedule and execute
* them.
* <p>
* You can increase or decrease the accuracy of the execution timing by
* specifying smaller or larger tick duration in the constructor. In most
* network applications, I/O timeout does not need to be accurate. Therefore,
* the default tick duration is 100 milliseconds and you will not need to try
* different configurations in most cases.
*
* <h3>Ticks per Wheel (Wheel Size)</h3>
* <p>
* {@link HashedWheelTimer} maintains a data structure called 'wheel'.
* To put simply, a wheel is a hash table of {@link TimerTask}s whose hash
* function is 'dead line of the task'. The default number of ticks per wheel
* (i.e. the size of the wheel) is 512. You could specify a larger value
* if you are going to schedule a lot of timeouts.
*
* <h3>Do not create many instances.</h3>
* <p>
* {@link HashedWheelTimer} creates a new thread whenever it is instantiated and
* started. Therefore, you should make sure to create only one instance and
* share it across your application. One of the common mistakes, that makes
* your application unresponsive, is to create a new instance for every connection.
*
* <h3>Implementation Details</h3>
* <p>
* {@link HashedWheelTimer} is based on
* <a href="http://cseweb.ucsd.edu/users/varghese/">George Varghese</a> and
* Tony Lauck's paper,
* <a href="http://cseweb.ucsd.edu/users/varghese/PAPERS/twheel.ps.Z">'Hashed
* and Hierarchical Timing Wheels: data structures to efficiently implement a
* timer facility'</a>. More comprehensive slides are located
* <a href="http://www.cse.wustl.edu/~cdgill/courses/cs6874/TimingWheels.ppt">here</a>.
*/
public class HashedWheelTimer implements Timer {
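As the comment explains, this class does not provide precise timing: you cannot ask it to run a task at an exact instant. Instead, on every tick (one "slot" of the wheel) it checks whether any TimerTask is behind schedule and, if so, executes it. (This should be easy to follow once you know the timing wheel algorithm.) You can raise or lower timing accuracy by choosing a smaller or larger tick duration (the length of one slot). With a 1-second slot versus a 5-second slot, for instance, the granularity differs by a factor of 5. The comment goes on to say that in most network applications I/O timeouts need not be exact: if I ask for a 5-second timeout, the framework does not have to report it at precisely the 5-second mark, and slightly later is acceptable. Hence the default tick duration is 100 milliseconds, and in most cases there is no need to change it.
The class maintains a data structure called a "wheel", which is the timing wheel itself. Put simply, a wheel is a hash table whose hash function is the task's deadline: the hash places each task in the slot where it belongs, so that when the wheel reaches a slot, the tasks in it are the ones that are due (or due in a later round). This avoids checking every task on every tick. The default number of slots is 512; if you need to schedule a very large number of tasks, you can set a larger value.
Only one instance of this class should be created per application, because each instance creates a new thread when it is instantiated and started. A common mistake is to create a new instance for every connection (this wording comes from the Netty comment, where the class is mainly used for connection handling; here you can read "connection" as "task"), which makes the application unresponsive because too many threads are started.
The comment then cites the paper the implementation is based on, which we skip. Now on to the code, starting with the fields.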
/**
* may be in spi?
*/
public static final String NAME = "hased";
private static final Logger logger = LoggerFactory.getLogger(HashedWheelTimer.class);
// Instance counter: records how many objects of this class have been created
private static final AtomicInteger INSTANCE_COUNTER = new AtomicInteger();
// Ensures the too-many-instances warning is reported only once
private static final AtomicBoolean WARNED_TOO_MANY_INSTANCES = new AtomicBoolean();
// Instance count above which the warning is reported
private static final int INSTANCE_COUNT_LIMIT = 64;
// Atomic updater for the workerState field
private static final AtomicIntegerFieldUpdater<HashedWheelTimer> WORKER_STATE_UPDATER =
AtomicIntegerFieldUpdater.newUpdater(HashedWheelTimer.class, "workerState");
// The Runnable that drives the wheel forward
private final Worker worker = new Worker();
// The thread that runs the worker
private final Thread workerThread;
// Worker state: initialized
private static final int WORKER_STATE_INIT = 0;
// Worker state: started
private static final int WORKER_STATE_STARTED = 1;
// Worker state: shut down
private static final int WORKER_STATE_SHUTDOWN = 2;
/**
* 0 - init, 1 - started, 2 - shut down
*/
@SuppressWarnings({"unused", "FieldMayBeFinal"})
private volatile int workerState;
// Duration of one slot (tick), in nanoseconds
private final long tickDuration;
// The array of slots
private final HashedWheelBucket[] wheel;
// Mask used to compute which slot a task belongs to
private final int mask;
// Latch that lets callers wait until the worker has initialized startTime
private final CountDownLatch startTimeInitialized = new CountDownLatch(1);
// Queue of newly added timeouts not yet placed into a slot
private final Queue<HashedWheelTimeout> timeouts = new LinkedBlockingQueue<>();
// Queue of cancelled timeouts
private final Queue<HashedWheelTimeout> cancelledTimeouts = new LinkedBlockingQueue<>();
// Number of pending timeouts
private final AtomicLong pendingTimeouts = new AtomicLong(0);
// Maximum number of pending timeouts
private final long maxPendingTimeouts;
// Start time of the wheel
private volatile long startTime;
Some of these fields may not make sense yet; they will once you read on. Let's start with the constructor of this class.
/**
* Creates a new timer.
*
* @param threadFactory a {@link ThreadFactory} that creates a
* background {@link Thread} which is dedicated to
* {@link TimerTask} execution.
* @param tickDuration the duration between tick
* @param unit the time unit of the {@code tickDuration}
* @param ticksPerWheel the size of the wheel
* @param maxPendingTimeouts The maximum number of pending timeouts after which call to
* {@code newTimeout} will result in
* {@link java.util.concurrent.RejectedExecutionException}
* being thrown. No maximum pending timeouts limit is assumed if
* this value is 0 or negative.
* @throws NullPointerException if either of {@code threadFactory} and {@code unit} is {@code null}
* @throws IllegalArgumentException if either of {@code tickDuration} and {@code ticksPerWheel} is <= 0
*/
public HashedWheelTimer(
ThreadFactory threadFactory,
long tickDuration, TimeUnit unit, int ticksPerWheel,
long maxPendingTimeouts) {
if (threadFactory == null) {
throw new NullPointerException("threadFactory");
}
if (unit == null) {
throw new NullPointerException("unit");
}
if (tickDuration <= 0) {
throw new IllegalArgumentException("tickDuration must be greater than 0: " + tickDuration);
}
if (ticksPerWheel <= 0) {
throw new IllegalArgumentException("ticksPerWheel must be greater than 0: " + ticksPerWheel);
}
// Normalize ticksPerWheel to power of two and initialize the wheel.
wheel = createWheel(ticksPerWheel);
mask = wheel.length - 1;
// Convert tickDuration to nanos.
this.tickDuration = unit.toNanos(tickDuration);
// Prevent overflow.
if (this.tickDuration >= Long.MAX_VALUE / wheel.length) {
throw new IllegalArgumentException(String.format(
"tickDuration: %d (expected: 0 < tickDuration in nanos < %d",
tickDuration, Long.MAX_VALUE / wheel.length));
}
workerThread = threadFactory.newThread(worker);
this.maxPendingTimeouts = maxPendingTimeouts;
if (INSTANCE_COUNTER.incrementAndGet() > INSTANCE_COUNT_LIMIT &&
WARNED_TOO_MANY_INSTANCES.compareAndSet(false, true)) {
reportTooManyInstances();
}
}
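The English parameter comments are not repeated here. The main logic:
1. Validate the arguments.
2. Create the wheel, which is the key step: this initializes the wheel array mentioned above, the array that backs the hash table.
3. Initialize mask to wheel.length - 1. This value makes the slot computation cheap, as explained later.
4. Convert the tick duration to nanoseconds and check for overflow: if the tick duration is greater than or equal to Long.MAX_VALUE divided by the number of slots, an exception is thrown.
5. Create the worker thread from the thread factory (it is not started yet) and record the maximum number of pending timeouts.
6. Check whether the number of instances exceeds the limit and, if so, report it once.
Next, the createWheel method called above: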
private static HashedWheelBucket[] createWheel(int ticksPerWheel) {
if (ticksPerWheel <= 0) {
throw new IllegalArgumentException(
"ticksPerWheel must be greater than 0: " + ticksPerWheel);
}
if (ticksPerWheel > 1073741824) {
throw new IllegalArgumentException(
"ticksPerWheel may not be greater than 2^30: " + ticksPerWheel);
}
ticksPerWheel = normalizeTicksPerWheel(ticksPerWheel);
HashedWheelBucket[] wheel = new HashedWheelBucket[ticksPerWheel];
for (int i = 0; i < wheel.length; i++) {
wheel[i] = new HashedWheelBucket();
}
return wheel;
}
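Ignoring the basic argument checks, the main flow is:
1. Normalize the number of slots.
2. Create the slot array.
3. Initialize every element of the slot array.
Normalizing the number of slots: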
private static int normalizeTicksPerWheel(int ticksPerWheel) {
int normalizedTicksPerWheel = ticksPerWheel - 1;
normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 1;
normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 2;
normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 4;
normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 8;
normalizedTicksPerWheel |= normalizedTicksPerWheel >>> 16;
return normalizedTicksPerWheel + 1;
}
For an input of 37 the method returns 64: it rounds its argument up to the smallest power of two that is greater than or equal to it.
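To see the rounding on a few concrete inputs, the same bit trick can be run in isolation. The class and method names below (WheelSize, roundUpToPowerOfTwo) are made up for this sketch; the body mirrors normalizeTicksPerWheel.

```java
final class WheelSize {
    // Smear the highest set bit of (n - 1) into every lower position, then add 1.
    // For n = 37: 36 = 100100b -> 111111b = 63 -> 64.
    static int roundUpToPowerOfTwo(int n) {
        int v = n - 1;
        v |= v >>> 1;
        v |= v >>> 2;
        v |= v >>> 4;
        v |= v >>> 8;
        v |= v >>> 16;
        return v + 1;
    }

    public static void main(String[] args) {
        // Prints: 1 -> 1, 37 -> 64, 64 -> 64, 500 -> 512, 512 -> 512 (one per line)
        for (int n : new int[] {1, 37, 64, 500, 512}) {
            System.out.println(n + " -> " + roundUpToPowerOfTwo(n));
        }
    }
}
```

Subtracting 1 first is what keeps exact powers of two unchanged: 64 stays 64 instead of becoming 128.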
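HashedWheelBucket is the time slot (also called a bucket; the two terms mean the same thing). It is built with its default constructor, and its implementation is analyzed later.
The newTimeout method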
@Override
public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
if (task == null) {
throw new NullPointerException("task");
}
if (unit == null) {
throw new NullPointerException("unit");
}
long pendingTimeoutsCount = pendingTimeouts.incrementAndGet();
if (maxPendingTimeouts > 0 && pendingTimeoutsCount > maxPendingTimeouts) {
pendingTimeouts.decrementAndGet();
throw new RejectedExecutionException("Number of pending timeouts ("
+ pendingTimeoutsCount + ") is greater than or equal to maximum allowed pending "
+ "timeouts (" + maxPendingTimeouts + ")");
}
start();
// Add the timeout to the timeout queue which will be processed on the next tick.
// During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;
// Guard against overflow.
if (delay > 0 && deadline < 0) {
deadline = Long.MAX_VALUE;
}
HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
timeouts.add(timeout);
return timeout;
}
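This method adds a task to the timer. Ignoring the basic argument checks, the main flow is:
1. Increment the pending timeout count; if it exceeds the maximum, decrement it again and throw an exception.
2. Start the wheel (only the first call actually starts it; start contains the check, covered below).
3. Compute the task's deadline (the time at which it should run) relative to startTime, guarding against overflow.
4. Construct a Timeout and add it to the queue of timeouts waiting to be placed on the wheel.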
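The deadline arithmetic in step 3 can be tried out on its own. In the sketch below the class name, method name and the now parameter (standing in for System.nanoTime()) are illustrative; the formula and the overflow guard follow the newTimeout code above.

```java
import java.util.concurrent.TimeUnit;

final class Deadlines {
    // Deadline in nanoseconds, measured from the timer's start time.
    static long relativeDeadline(long now, long startTime, long delay, TimeUnit unit) {
        long deadline = now + unit.toNanos(delay) - startTime;
        // A positive delay that yields a negative deadline means the sum overflowed.
        if (delay > 0 && deadline < 0) {
            deadline = Long.MAX_VALUE;
        }
        return deadline;
    }

    public static void main(String[] args) {
        // 1000 + 5_000_000 - 400 = 5000600
        System.out.println(relativeDeadline(1_000L, 400L, 5, TimeUnit.MILLISECONDS));
        // The sum overflows, so the result is clamped to Long.MAX_VALUE.
        System.out.println(relativeDeadline(1_000L, 0L, Long.MAX_VALUE, TimeUnit.NANOSECONDS));
    }
}
```

Because the deadline is stored relative to startTime, the worker can later compare it directly against tickDuration * tick without reading the absolute clock value again.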
The start method
/**
* Starts the background thread explicitly. The background thread will
* start automatically on demand even if you did not call this method.
*
* @throws IllegalStateException if this timer has been
* {@linkplain #stop() stopped} already
*/
public void start() {
switch (WORKER_STATE_UPDATER.get(this)) {
case WORKER_STATE_INIT:
if (WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_INIT, WORKER_STATE_STARTED)) {
workerThread.start();
}
break;
case WORKER_STATE_STARTED:
break;
case WORKER_STATE_SHUTDOWN:
throw new IllegalStateException("cannot be started once stopped");
default:
throw new Error("Invalid WorkerState");
}
// Wait until the startTime is initialized by the worker.
while (startTime == 0) {
try {
startTimeInitialized.await();
} catch (InterruptedException ignore) {
// Ignore - it will be ready very soon.
}
}
}
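1. Read the worker state. If it is the initial state, switch it to started with a compare-and-set and start workerThread. The other states are handled accordingly: already started means nothing to do, and shut down raises IllegalStateException.
2. While startTime == 0, block the calling thread until workerThread has initialized startTime.
This method is simple too: it starts the worker thread behind the timer and uses a CountDownLatch to wait until startTime has been initialized to a non-zero value. Why wait? Because newTimeout uses startTime right after calling start(); if it were not initialized yet, the deadline computation would be wrong.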
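The start-once-then-wait handshake can be reproduced in a small standalone class. This is a sketch under assumptions: LazyWorker and its members are hypothetical, AtomicInteger is swapped in for the AtomicIntegerFieldUpdater used by Dubbo, and the worker thread only publishes startTime instead of running a wheel.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class LazyWorker {
    private static final int INIT = 0;
    private static final int STARTED = 1;

    private final AtomicInteger state = new AtomicInteger(INIT);
    private final AtomicInteger threadStarts = new AtomicInteger();
    private final CountDownLatch startTimeInitialized = new CountDownLatch(1);
    private volatile long startTime;

    private final Thread workerThread = new Thread(() -> {
        long now = System.nanoTime();
        startTime = (now == 0) ? 1 : now;    // 0 is reserved for "not initialized"
        startTimeInitialized.countDown();     // release callers blocked in start()
    });

    void start() {
        // Only the caller that wins the compare-and-set starts the thread.
        if (state.compareAndSet(INIT, STARTED)) {
            threadStarts.incrementAndGet();
            workerThread.start();
        }
        // Every caller waits until the worker has published startTime.
        while (startTime == 0) {
            try {
                startTimeInitialized.await();
            } catch (InterruptedException ignored) {
                // Keep waiting; the value will be ready very soon.
            }
        }
    }

    long startTime() { return startTime; }
    int threadStarts() { return threadStarts.get(); }
}
```

Calling start() any number of times starts the thread once, and every call returns only after startTime is non-zero, which is the guarantee newTimeout relies on.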
That completes the main flow of adding a task through HashedWheelTimer. Next, how the wheel runs internally, starting with the Worker class.
Worker
fields
private final Set<Timeout> unprocessedTimeouts = new HashSet<Timeout>();
private long tick;
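The first field is the set of timeouts that were never processed (it is filled when the timer stops). The second is the current tick, a counter from which the current slot is derived.
The run method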
@Override
public void run() {
// Initialize the startTime.
startTime = System.nanoTime();
if (startTime == 0) {
// We use 0 as an indicator for the uninitialized value here, so make sure it's not 0 when initialized.
startTime = 1;
}
// Notify the other threads waiting for the initialization at start().
startTimeInitialized.countDown();
do {
final long deadline = waitForNextTick();
if (deadline > 0) {
int idx = (int) (tick & mask);
processCancelledTasks();
HashedWheelBucket bucket =
wheel[idx];
transferTimeoutsToBuckets();
bucket.expireTimeouts(deadline);
tick++;
}
} while (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_STARTED);
// Fill the unprocessedTimeouts so we can return them from stop() method.
for (HashedWheelBucket bucket : wheel) {
bucket.clearTimeouts(unprocessedTimeouts);
}
for (; ; ) {
HashedWheelTimeout timeout = timeouts.poll();
if (timeout == null) {
break;
}
if (!timeout.isCancelled()) {
unprocessedTimeouts.add(timeout);
}
}
processCancelledTasks();
}
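The main logic:
1. Initialize startTime; if the value is 0, use 1 instead. Why check for 0? Java offers two ways to read the current time. System.currentTimeMillis() returns the number of milliseconds since midnight, January 1, 1970 UTC. System.nanoTime(), used here, returns the nanoseconds elapsed since some fixed but arbitrary origin inside the JVM; the origin differs from one JVM to another, so the value is only useful for measuring elapsed time. If nanoTime happened to be called at exactly that origin, it would return 0. Since 0 is the marker start() uses for "not yet initialized", the code guards against this extremely unlikely case.
2. Now that startTime is initialized, startTimeInitialized.countDown() tells the waiting threads that they can proceed.
3. Next comes a do-while loop that keeps advancing the tick for as long as the timer is in the started state. Each iteration:
1) Waits for the next tick.
2) Once the tick arrives, computes the index of the corresponding slot (tick & mask is equivalent to tick modulo the length of the slot array).
3) Processes the queue of cancelled timeouts.
4) Gets the current slot and moves the queued timeouts into the slots where they belong.
5) Lets the current slot expire the timeouts it holds, then increments tick.
4. Once the timer has been stopped, the following happens:
1) All timeouts still in the slots are collected into the unprocessed set.
2) The queue of waiting timeouts is drained, and those not cancelled are added to the unprocessed set.
3) The queue of cancelled timeouts is processed.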
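The claim that tick & mask equals tick modulo the wheel length holds only because the length is a power of two. A minimal check, with hypothetical names (SlotIndex, slotOf):

```java
final class SlotIndex {
    // Valid only when wheelLength is a power of two, so that
    // wheelLength - 1 has all of its low bits set.
    static int slotOf(long tick, int wheelLength) {
        int mask = wheelLength - 1;
        return (int) (tick & mask);
    }

    public static void main(String[] args) {
        int wheelLength = 512;
        for (long tick = 0; tick < 2000; tick++) {
            if (slotOf(tick, wheelLength) != tick % wheelLength) {
                throw new AssertionError("mismatch at tick " + tick);
            }
        }
        System.out.println(slotOf(513, wheelLength)); // 513 = 512 + 1, so this prints 1
    }
}
```

This is why the constructor normalizes ticksPerWheel to a power of two before computing mask.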
The waitForNextTick method
/**
* calculate goal nanoTime from startTime and current tick number,
* then wait until that goal has been reached.
*
* @return Long.MIN_VALUE if received a shutdown request,
* current time otherwise (with Long.MIN_VALUE changed by +1)
*/
private long waitForNextTick() {
long deadline = tickDuration * (tick + 1);
for (; ; ) {
final long currentTime = System.nanoTime() - startTime;
long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;
if (sleepTimeMs <= 0) {
if (currentTime == Long.MIN_VALUE) {
return -Long.MAX_VALUE;
} else {
return currentTime;
}
}
if (isWindows()) {
sleepTimeMs = sleepTimeMs / 10 * 10;
}
try {
Thread.sleep(sleepTimeMs);
} catch (InterruptedException ignored) {
if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
return Long.MIN_VALUE;
}
}
}
}
Set<Timeout> unprocessedTimeouts() {
return Collections.unmodifiableSet(unprocessedTimeouts);
}
}
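The main flow:
1. Compute the start time of the next tick, relative to startTime.
2. Loop until the current time reaches that point. Adding 999999 before dividing by 1000000 rounds the remaining nanoseconds up to whole milliseconds, so sleepTimeMs <= 0 is equivalent to deadline - currentTime <= 0: the deadline has been reached or passed, and the method returns the current relative time.
3. Otherwise, sleep for the computed number of milliseconds (on Windows the value is first rounded down to a multiple of 10) and check again. If the sleep is interrupted because the timer is shutting down, the method returns Long.MIN_VALUE, which the deadline > 0 check in run then skips.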
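The rounding in step 2 is easy to verify with a few numbers. The helper below is a sketch; TickSleep and sleepMillis are names invented for it, and the expression is the one from waitForNextTick.

```java
final class TickSleep {
    // Remaining time until the deadline, rounded up to whole milliseconds.
    // A result <= 0 means the deadline has been reached or passed.
    static long sleepMillis(long deadlineNanos, long currentNanos) {
        return (deadlineNanos - currentNanos + 999999) / 1000000;
    }

    public static void main(String[] args) {
        System.out.println(sleepMillis(100_000_000L, 0L)); // exactly 100 ms left -> 100
        System.out.println(sleepMillis(100L, 99L));         // 1 ns left -> rounds up to 1
        System.out.println(sleepMillis(100L, 100L));        // deadline reached -> 0
    }
}
```

Rounding up matters because Thread.sleep takes milliseconds: rounding down would make the loop wake up early and spin until the last fraction of a millisecond has passed.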
The processCancelledTasks method
private void processCancelledTasks() {
for (; ; ) {
HashedWheelTimeout timeout = cancelledTimeouts.poll();
if (timeout == null) {
// all processed
break;
}
try {
timeout.remove();
} catch (Throwable t) {
if (logger.isWarnEnabled()) {
logger.warn("An exception was thrown while process a cancellation task", t);
}
}
}
}
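The for loop:
1. Polls the next cancelled timeout from the cancelled queue, stopping when the queue is empty.
2. Calls HashedWheelTimeout.remove to remove it; that method is covered later.
The transferTimeoutsToBuckets method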
private void transferTimeoutsToBuckets() {
// transfer only max. 100000 timeouts per tick to prevent a thread to stale the workerThread when it just
// adds new timeouts in a loop.
for (int i = 0; i < 100000; i++) {
HashedWheelTimeout timeout = timeouts.poll();
if (timeout == null) {
// all processed
break;
}
if (timeout.state() == HashedWheelTimeout.ST_CANCELLED) {
// Was cancelled in the meantime.
continue;
}
long calculated = timeout.deadline / tickDuration;
timeout.remainingRounds = (calculated - tick) / wheel.length;
// Ensure we don't schedule for past.
final long ticks = Math.max(calculated, tick);
int stopIndex = (int) (ticks & mask);
HashedWheelBucket bucket = wheel[stopIndex];
bucket.addTimeout(timeout);
}
}
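The loop runs at most 100000 times per tick (the bound keeps an oversized queue from making a single transfer pass take too long and stalling the worker thread):
1. Poll the next timeout from the queue of waiting timeouts and check it (cancelled ones are skipped).
2. From its deadline, compute the tick at which it is due.
3. Set its remainingRounds (remaining laps). Because the wheel is circular, some tasks only become due after the wheel has gone around several more times.
4. Take the larger of the computed tick and the current tick, so that nothing is scheduled in the past, and reduce it to a slot index with the mask.
5. Add the timeout to that slot.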
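Steps 2 to 4 can be worked through with concrete numbers. The Placement class below is a hypothetical helper that repeats the arithmetic of transferTimeoutsToBuckets, here with a 100 ms tick and a 512-slot wheel.

```java
final class Placement {
    final long remainingRounds;
    final int slot;

    Placement(long remainingRounds, int slot) {
        this.remainingRounds = remainingRounds;
        this.slot = slot;
    }

    // wheelLength must be a power of two.
    static Placement place(long deadlineNanos, long tickDurationNanos,
                           long currentTick, int wheelLength) {
        long calculated = deadlineNanos / tickDurationNanos;      // tick at which the task is due
        long rounds = (calculated - currentTick) / wheelLength;   // full laps still to wait
        long ticks = Math.max(calculated, currentTick);           // never schedule in the past
        int slot = (int) (ticks & (wheelLength - 1));
        return new Placement(rounds, slot);
    }

    public static void main(String[] args) {
        long tick100ms = 100_000_000L;
        // A 60 s deadline is due at tick 600: one full lap of 512, then slot 600 - 512 = 88.
        Placement p = place(60_000_000_000L, tick100ms, 0, 512);
        System.out.println(p.remainingRounds + " round(s), slot " + p.slot); // 1 round(s), slot 88
    }
}
```

A deadline that already lies behind the current tick lands in the current slot with zero remaining rounds, so it fires on the very tick that transfers it.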
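That covers the main flow of the wheel's operation. Some details are probably still unclear, so next come the two remaining classes, HashedWheelTimeout and HashedWheelBucket, starting with HashedWheelTimeout.
The HashedWheelTimeout class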
private static final class HashedWheelTimeout implements Timeout {
// State: initial
private static final int ST_INIT = 0;
// State: cancelled
private static final int ST_CANCELLED = 1;
// State: expired
private static final int ST_EXPIRED = 2;
// Atomic updater for the state field
private static final AtomicIntegerFieldUpdater<HashedWheelTimeout> STATE_UPDATER =
AtomicIntegerFieldUpdater.newUpdater(HashedWheelTimeout.class, "state");
// The timer that created this timeout
private final HashedWheelTimer timer;
// The task to run
private final TimerTask task;
// Deadline, in nanoseconds relative to the timer's startTime
private final long deadline;
@SuppressWarnings({"unused", "FieldMayBeFinal", "RedundantFieldInitialization"})
private volatile int state = ST_INIT;
/**
* RemainingRounds will be calculated and set by Worker.transferTimeoutsToBuckets() before the
* HashedWheelTimeout will be added to the correct HashedWheelBucket.
*/
long remainingRounds;
/**
* This will be used to chain timeouts in HashedWheelTimerBucket via a double-linked-list.
* As only the workerThread will act on it there is no need for synchronization / volatile.
*/
HashedWheelTimeout next;
HashedWheelTimeout prev;
/**
* The bucket to which the timeout was added
*/
HashedWheelBucket bucket;
HashedWheelTimeout(HashedWheelTimer timer, TimerTask task, long deadline) {
this.timer = timer;
this.task = task;
this.deadline = deadline;
}
@Override
public Timer timer() {
return timer;
}
@Override
public TimerTask task() {
return task;
}
@Override
public boolean cancel() {
// only update the state it will be removed from HashedWheelBucket on next tick.
if (!compareAndSetState(ST_INIT, ST_CANCELLED)) {
return false;
}
// If a task should be canceled we put this to another queue which will be processed on each tick.
// So this means that we will have a GC latency of max. 1 tick duration which is good enough. This way
// we can make again use of our MpscLinkedQueue and so minimize the locking / overhead as much as possible.
timer.cancelledTimeouts.add(this);
return true;
}
void remove() {
HashedWheelBucket bucket = this.bucket;
if (bucket != null) {
bucket.remove(this);
} else {
timer.pendingTimeouts.decrementAndGet();
}
}
public boolean compareAndSetState(int expected, int state) {
return STATE_UPDATER.compareAndSet(this, expected, state);
}
public int state() {
return state;
}
@Override
public boolean isCancelled() {
return state() == ST_CANCELLED;
}
@Override
public boolean isExpired() {
return state() == ST_EXPIRED;
}
public void expire() {
if (!compareAndSetState(ST_INIT, ST_EXPIRED)) {
return;
}
try {
task.run(this);
} catch (Throwable t) {
if (logger.isWarnEnabled()) {
logger.warn("An exception was thrown by " + TimerTask.class.getSimpleName() + '.', t);
}
}
}
@Override
public String toString() {
final long currentTime = System.nanoTime();
long remaining = deadline - currentTime + timer.startTime;
String simpleClassName = ClassUtils.simpleClassName(this.getClass());
StringBuilder buf = new StringBuilder(192)
.append(simpleClassName)
.append('(')
.append("deadline: ");
if (remaining > 0) {
buf.append(remaining)
.append(" ns later");
} else if (remaining < 0) {
buf.append(-remaining)
.append(" ns ago");
} else {
buf.append("now");
}
if (isCancelled()) {
buf.append(", cancelled");
}
return buf.append(", task: ")
.append(task())
.append(')')
.toString();
}
}
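As you can see, the logic of this class is simple: mostly field reads and writes, or delegation to HashedWheelBucket, so it needs little explanation. Note that expire calls task.run(this), which confirms the earlier guess that the Timeout passed to TimerTask.run is the object returned by newTimeout. One more point worth attention is the pair of fields next and prev. A HashedWheelBucket holds many HashedWheelTimeout objects, and next and prev form a doubly linked list so that all timeouts belonging to the same bucket can hang off it.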
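One detail of the class above worth isolating is that cancel() and expire() both begin with a compare-and-set from ST_INIT, so only one of them can ever win. The sketch below uses a hypothetical OneShotState class with AtomicInteger in place of the field updater, and its expire returns a boolean purely for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class OneShotState {
    static final int INIT = 0;
    static final int CANCELLED = 1;
    static final int EXPIRED = 2;

    private final AtomicInteger state = new AtomicInteger(INIT);
    private int runs;

    // Succeeds only if the task has been neither cancelled nor expired yet.
    boolean cancel() {
        return state.compareAndSet(INIT, CANCELLED);
    }

    // Runs the (simulated) task only if the state is still INIT.
    boolean expire() {
        if (!state.compareAndSet(INIT, EXPIRED)) {
            return false;
        }
        runs++;
        return true;
    }

    int runs() { return runs; }
}
```

A cancelled timeout therefore never runs, and cancel() on a timeout that has already fired returns false, matching the contract in the Timeout javadoc.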
HashedWheelBucket
fields:
/**
* Used for the linked-list datastructure
*/
private HashedWheelTimeout head;
private HashedWheelTimeout tail;
As noted above, these two fields are the head and tail pointers of the doubly linked list of HashedWheelTimeout objects.
The addTimeout and remove methods
/**
* Add {@link HashedWheelTimeout} to this bucket.
*/
void addTimeout(HashedWheelTimeout timeout) {
assert timeout.bucket == null;
timeout.bucket = this;
if (head == null) {
head = tail = timeout;
} else {
tail.next = timeout;
timeout.prev = tail;
tail = timeout;
}
}
public HashedWheelTimeout remove(HashedWheelTimeout timeout) {
HashedWheelTimeout next = timeout.next;
// remove timeout that was either processed or cancelled by updating the linked-list
if (timeout.prev != null) {
timeout.prev.next = next;
}
if (timeout.next != null) {
timeout.next.prev = timeout.prev;
}
if (timeout == head) {
// if timeout is also the tail we need to adjust the entry too
if (timeout == tail) {
tail = null;
head = null;
} else {
head = next;
}
} else if (timeout == tail) {
// if the timeout is the tail modify the tail to be the prev node.
tail = timeout.prev;
}
// null out prev, next and bucket to allow for GC.
timeout.prev = null;
timeout.next = null;
timeout.bucket = null;
timeout.timer.pendingTimeouts.decrementAndGet();
return next;
}
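Nothing remarkable here: these are the usual insert and delete operations on a doubly linked list. Note that remove also decrements the timer's pendingTimeouts counter and returns the next node so that a traversal can continue.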
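For readers who want to poke at the list operations without the rest of the timer, here is a stripped-down version. TimeoutNode and MiniBucket are hypothetical names, and the pendingTimeouts bookkeeping is left out.

```java
final class TimeoutNode {
    final String name;
    TimeoutNode prev;
    TimeoutNode next;

    TimeoutNode(String name) { this.name = name; }
}

final class MiniBucket {
    TimeoutNode head;
    TimeoutNode tail;

    // Appends at the tail, like addTimeout.
    void add(TimeoutNode node) {
        if (head == null) {
            head = tail = node;
        } else {
            tail.next = node;
            node.prev = tail;
            tail = node;
        }
    }

    // Unlinks the node and returns its successor so a caller can keep iterating.
    TimeoutNode remove(TimeoutNode node) {
        TimeoutNode next = node.next;
        if (node.prev != null) node.prev.next = next;
        if (next != null) next.prev = node.prev;
        if (node == head) head = next;
        if (node == tail) tail = node.prev;
        node.prev = null;   // null out the links so the node can be collected
        node.next = null;
        return next;
    }

    String names() {
        StringBuilder sb = new StringBuilder();
        for (TimeoutNode n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.name);
        }
        return sb.toString();
    }
}
```

Returning the successor from remove is what lets a traversal delete the node it is standing on and still move forward.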
The expireTimeouts method
/**
* Expire all {@link HashedWheelTimeout}s for the given {@code deadline}.
*/
void expireTimeouts(long deadline) {
HashedWheelTimeout timeout = head;
// process all timeouts
while (timeout != null) {
HashedWheelTimeout next = timeout.next;
if (timeout.remainingRounds <= 0) {
next = remove(timeout);
if (timeout.deadline <= deadline) {
timeout.expire();
} else {
// The timeout was placed into a wrong slot. This should never happen.
throw new IllegalStateException(String.format(
"timeout.deadline (%d) > deadline (%d)", timeout.deadline, deadline));
}
} else if (timeout.isCancelled()) {
next = remove(timeout);
} else {
timeout.remainingRounds--;
}
timeout = next;
}
}
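This is the method that actually runs the timeouts hanging off a slot. The logic is straightforward: walk the doubly linked list from the head and handle each timeout. If its remaining rounds are less than or equal to 0, it is removed from the list; if its deadline is less than or equal to the deadline passed in, expire is called, which is where the task actually runs, and otherwise an IllegalStateException signals that the timeout was placed in the wrong slot. A timeout that still has rounds left is removed if it has been cancelled; otherwise its remainingRounds is decremented.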
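To watch remaining rounds and slot traversal work together, the following toy wheel advances by hand instead of on a thread. It is a sketch under loud assumptions: MiniWheel is hypothetical, deadlines are expressed in ticks rather than nanoseconds, each slot is an ArrayList rather than a linked list, and cancellation is omitted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class MiniWheel {
    private static final class Entry {
        final String name;
        long remainingRounds;

        Entry(String name, long remainingRounds) {
            this.name = name;
            this.remainingRounds = remainingRounds;
        }
    }

    private final List<List<Entry>> slots = new ArrayList<>();
    private final int mask;
    private long tick;
    final List<String> fired = new ArrayList<>();

    // size must be a power of two.
    MiniWheel(int size) {
        for (int i = 0; i < size; i++) {
            slots.add(new ArrayList<>());
        }
        mask = size - 1;
    }

    void schedule(String name, long delayTicks) {
        long due = tick + delayTicks;
        long rounds = (due - tick) / slots.size();
        slots.get((int) (due & mask)).add(new Entry(name, rounds));
    }

    // One tick: visit the current slot, fire what is due, age the rest.
    void advance() {
        Iterator<Entry> it = slots.get((int) (tick & mask)).iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (entry.remainingRounds <= 0) {
                it.remove();
                fired.add(entry.name + "@" + tick);
            } else {
                entry.remainingRounds--;
            }
        }
        tick++;
    }
}
```

With an 8-slot wheel, scheduling "a" after 3 ticks, "b" after 11 and "c" after 8 and then calling advance() twelve times leaves fired as [a@3, c@8, b@11]: "a" and "b" share slot 3, but "b" waits one extra lap.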
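Summary
This post has covered the principles and implementation of the timing wheel timer in Dubbo. The interfaces Timer, Timeout and TimerTask define the timer model, and the class HashedWheelTimer (together with its inner classes) implements a timing wheel on top of it. It exposes a simple API: calling newTimeout is all it takes to schedule a task that needs to run once. With this timer, Dubbo schedules such tasks efficiently in the scenarios that need it.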