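Preface
Not long ago I migrated a single-node Quartz system to a multi-node one. The approach was Quartz's own clustering, that is, the Quartz scheduler backed by a MySQL job store, because it really is simple and works well. This post records some of the implementation details and the pitfalls I hit, focusing on how to write jobs, how to recover from interruption, and how to keep restarts from breaking jobs.
Issues encountered
- Job serialization pitfall (Spring MethodInvoker)
The project configured nearly a hundred jobs in XML using MethodInvokingJobDetailFactoryBean, but once Quartz ran in clustered mode, startup failed with NotSerializableException. Why? A look at the source shows that the jobDataMap holds a methodInvoker entry pointing at this, which is a bean in the Spring context. It cannot be serialized, so it cannot be persisted to the database. The Javadoc of MethodInvokingJobDetailFactoryBean also states explicitly that it does not support persistence and that you should write your own implementation if you need it.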
//MethodInvokingJobDetailFactoryBean#afterPropertiesSet
@Override
public void afterPropertiesSet() throws ClassNotFoundException, NoSuchMethodException {
prepare();
// Use specific name if given, else fall back to bean name.
String name = (this.name != null ? this.name : this.beanName);
// Consider the concurrent flag to choose between stateful and stateless job.
Class<? extends Job> jobClass = (this.concurrent ? MethodInvokingJob.class : StatefulMethodInvokingJob.class);
// Build JobDetail instance.
JobDetailImpl jdi = new JobDetailImpl();
jdi.setName(name != null ? name : toString());
jdi.setGroup(this.group);
// jobClass is actually the inner class MethodInvokingJob or StatefulMethodInvokingJob
jdi.setJobClass(jobClass);
jdi.setDurability(true);
// Store the factory bean itself; the inner class MethodInvokingJob invokes through it
jdi.getJobDataMap().put("methodInvoker", this);
this.jobDetail = jdi;
postProcessJobDetail(this.jobDetail);
}
public static class MethodInvokingJob extends QuartzJobBean {...}
@DisallowConcurrentExecution
public static class StatefulMethodInvokingJob extends MethodInvokingJob {
}
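To make the failure concrete, here is a minimal stdlib-only sketch (not Quartz or Spring code) of what a database-backed job store has to do with the job data, namely serialize it. `ContextBoundInvoker` is a hypothetical stand-in for the factory bean, and the class and method names in the second map are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

final class JobDataSerializationDemo {

    /** Hypothetical stand-in for a Spring-managed bean: it does not implement Serializable. */
    static final class ContextBoundInvoker {
        final Object targetObject = new Object();
    }

    /** Returns true if the map can be written with standard Java serialization. */
    static boolean canSerialize(Map<String, Object> jobData) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(new HashMap<>(jobData));
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Mirrors jobDataMap.put("methodInvoker", this) in the Spring source above.
        Map<String, Object> withBean = new HashMap<>();
        withBean.put("methodInvoker", new ContextBoundInvoker());

        // Storing only names keeps the map serializable.
        Map<String, Object> withNames = new HashMap<>();
        withNames.put("targetClass", "com.example.ReportService"); // hypothetical class name
        withNames.put("targetMethod", "generate");                 // hypothetical method name

        System.out.println(canSerialize(withBean));  // false
        System.out.println(canSerialize(withNames)); // true
    }
}
```

Storing plain strings and resolving the bean again at fire time is the idea the custom factory bean in this article relies on.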
So I wrote one modeled on it, adding interruption recovery and handling of proxied beans.
public class CustomizedMethodInvokingJobDetailFactoryBean extends ArgumentConvertingMethodInvoker
implements FactoryBean<JobDetail>, BeanNameAware, BeanClassLoaderAware, BeanFactoryAware, InitializingBean, ApplicationContextAware {
// Maps a real class name to its (possibly proxied) bean
private static final ConcurrentHashMap<String, Object> realClassName2ProxyObject = new ConcurrentHashMap<>();
private static final Logger LOG = LoggerFactory.getLogger(CustomizedMethodInvokingJobDetailFactoryBean.class);
@Nullable
private String name;
private String group = Scheduler.DEFAULT_GROUP;
private boolean concurrent = true;
@Nullable
private String targetBeanName;
@Nullable
private String beanName;
@Nullable
private ClassLoader beanClassLoader = ClassUtils.getDefaultClassLoader();
@Nullable
private BeanFactory beanFactory;
@Nullable
private JobDetail jobDetail;
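/**
* Whether to recover the job if it was interrupted.
* Interruption is detected from records in the database tables, so make the job idempotent if you use this property.
*/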
private boolean requestsRecovery = false;
private static ApplicationContext applicationContext;
public void setName(String name) {
this.name = name;
}
public void setGroup(String group) {
this.group = group;
}
public void setRequestsRecovery(boolean requestsRecovery) {
this.requestsRecovery = requestsRecovery;
}
/**
* Whether the job may execute concurrently.
*/
public void setConcurrent(boolean concurrent) {
this.concurrent = concurrent;
}
public void setTargetBeanName(String targetBeanName) {
this.targetBeanName = targetBeanName;
}
@Override
public void setBeanName(String beanName) {
this.beanName = beanName;
}
@Override
public void setBeanClassLoader(ClassLoader classLoader) {
this.beanClassLoader = classLoader;
}
@Override
public void setBeanFactory(BeanFactory beanFactory) {
this.beanFactory = beanFactory;
}
@Override
protected Class<?> resolveClassName(String className) throws ClassNotFoundException {
return ClassUtils.forName(className, this.beanClassLoader);
}
@Override
public void afterPropertiesSet() throws ClassNotFoundException, NoSuchMethodException {
prepare();
// Use specific name if given, else fall back to bean name.
String name = (this.name != null ? this.name : this.beanName);
// Consider the concurrent flag to choose between stateful and stateless job.
Class<? extends Job> jobClass = (this.concurrent ? BeanInvokingJob.class : StatefulBeanInvokingJob.class);
// Build JobDetail instance.
JobDetailImpl jdi = new JobDetailImpl();
jdi.setName(name != null ? name : toString());
jdi.setGroup(this.group);
jdi.setJobClass(jobClass);
jdi.setDurability(true);
jdi.setRequestsRecovery(this.requestsRecovery);
try {
LOG.info("targetObject class name: {}", this.getTargetObject().getClass().getName());
Object realObject = AopTargetUtils.getTarget(this.getTargetObject());
jdi.getJobDataMap().put("targetClass", realObject.getClass().getName());
} catch (Exception e) {
LOG.error("Failed to resolve the real class for {}", name, e);
jdi.getJobDataMap().put("targetClass", ClassUtils.getUserClass(this.getTargetObject()).getName());
}
String targetClass = jdi.getJobDataMap().getString("targetClass");
// Keep the mapping from real class name to bean.
// containsKey, not contains: ConcurrentHashMap.contains(Object) tests values, not keys.
if (realClassName2ProxyObject.containsKey(targetClass)) {
LOG.error("Target class {} has more than one bean/proxied bean", targetClass);
} else {
LOG.info("Recording targetClass: {} targetObject: {}", targetClass, this.getTargetObject());
realClassName2ProxyObject.put(targetClass, this.getTargetObject());
}
jdi.getJobDataMap().put("targetMethod", this.getTargetMethod());
this.jobDetail = jdi;
postProcessJobDetail(this.jobDetail);
}
protected void postProcessJobDetail(JobDetail jobDetail) {
}
@Override
public Class<?> getTargetClass() {
Class<?> targetClass = super.getTargetClass();
if (targetClass == null && this.targetBeanName != null) {
Assert.state(this.beanFactory != null, "BeanFactory must be set when using 'targetBeanName'");
targetClass = this.beanFactory.getType(this.targetBeanName);
}
return targetClass;
}
@Override
public Object getTargetObject() {
Object targetObject = super.getTargetObject();
if (targetObject == null && this.targetBeanName != null) {
Assert.state(this.beanFactory != null, "BeanFactory must be set when using 'targetBeanName'");
targetObject = this.beanFactory.getBean(this.targetBeanName);
}
return targetObject;
}
@Override
@Nullable
public JobDetail getObject() {
return this.jobDetail;
}
@Override
public Class<? extends JobDetail> getObjectType() {
return (this.jobDetail != null ? this.jobDetail.getClass() : JobDetail.class);
}
@Override
public boolean isSingleton() {
return true;
}
@Override
public void setApplicationContext(ApplicationContext context) throws BeansException {
applicationContext = context;
}
public static class BeanInvokingJob implements Job {
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
try {
LOG.info("start");
String targetClass = context.getMergedJobDataMap().getString("targetClass");
Class clazz = Class.forName(targetClass);
String targetMethod = context.getMergedJobDataMap().getString("targetMethod");
if (targetMethod == null) {
throw new JobExecutionException("targetMethod cannot be null.", false);
}
Object argumentsObject = context.getMergedJobDataMap().get("arguments");
Object[] arguments = (argumentsObject instanceof String) ? null : (Object[]) argumentsObject;
// Prefer the recorded (possibly proxied) bean; fall back to a lookup by type
Object bean = realClassName2ProxyObject.get(targetClass);
if (bean == null) {
bean = applicationContext.getBean(clazz);
}
MethodInvoker beanMethod = new MethodInvoker();
beanMethod.setTargetObject(bean);
beanMethod.setTargetMethod(targetMethod);
beanMethod.setArguments(arguments);
beanMethod.prepare();
LOG.info("Invoking Bean: {} ; Method: {}", clazz, targetMethod);
beanMethod.invoke();
} catch (JobExecutionException e) {
throw e;
} catch (Exception e) {
throw new JobExecutionException(e);
} finally {
LOG.info("end");
}
}
}
@DisallowConcurrentExecution
public static class StatefulBeanInvokingJob extends BeanInvokingJob {}
}
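The lookup that BeanInvokingJob performs at fire time can be illustrated without Spring or Quartz. The sketch below is only an analogy: a JDK dynamic proxy stands in for an AOP proxy, a static map stands in for the registry, and `ReportTask`, `ReportTaskImpl` and the recorded events are hypothetical. It shows that invoking through the registered proxy runs the surrounding advice, and that the registry has to be queried by key.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ProxyLookupDemo {

    /** Hypothetical task interface and implementation. */
    public interface ReportTask {
        void run();
    }

    static final class ReportTaskImpl implements ReportTask {
        private final List<String> events;
        ReportTaskImpl(List<String> events) { this.events = events; }
        @Override public void run() { events.add("run"); }
    }

    /** Real class name -> bean to invoke (the proxy when one exists). */
    static final Map<String, Object> REGISTRY = new ConcurrentHashMap<>();

    /** Wraps the target in a JDK proxy that records events around each call (a stand-in for an aspect). */
    static ReportTask proxy(ReportTask target, List<String> events) {
        InvocationHandler handler = (proxyObject, method, methodArgs) -> {
            events.add("before");
            try {
                return method.invoke(target, methodArgs);
            } finally {
                events.add("after");
            }
        };
        return (ReportTask) Proxy.newProxyInstance(
                ReportTask.class.getClassLoader(), new Class<?>[] {ReportTask.class}, handler);
    }

    /** What the job can do at fire time: only two strings from the job data are available. */
    static void invoke(String targetClass, String targetMethod) {
        // get/containsKey look up by key; ConcurrentHashMap.contains(Object) would test values.
        Object bean = REGISTRY.get(targetClass);
        if (bean == null) {
            throw new IllegalStateException("no bean registered for " + targetClass);
        }
        try {
            Method method = bean.getClass().getMethod(targetMethod);
            method.invoke(bean);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Registers a proxied task under its real class name, invokes it by name, returns the events. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        ReportTaskImpl target = new ReportTaskImpl(events);
        REGISTRY.put(target.getClass().getName(), proxy(target, events));
        invoke(ReportTaskImpl.class.getName(), "run");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [before, run, after]
    }
}
```

Had the registry held the raw target instead of the proxy, only "run" would be recorded, which is the situation the proxy handling in the custom class is meant to avoid.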
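- Dynamic proxy issue
If the method behind a method-invoking job has aspects applied (logging, transactions, and so on), the job must call the proxied bean. The custom class above already handles both proxying mechanisms (JDK dynamic proxies and CGLIB).
- Misfires (missed executions)
Many articles cover misfires well. In short, a misfire is an execution that was missed for whatever reason, together with the compensation policy applied afterwards; the policies differ between SimpleTrigger and CronTrigger. For a CronTrigger whose job must not run concurrently, setting MISFIRE_INSTRUCTION_DO_NOTHING is enough. Remember to configure org.quartz.jobStore.misfireThreshold, which defines how late a trigger must be before it counts as misfired.
- Safe restart (scheduler shutdown)
This is mainly discussed later in this post. Testing turned up some problems: Quartz has waitForJobsToCompleteOnShutdown, which waits for running jobs to finish when the scheduler stops, but it seemed not to work well when integrated with Spring, and the scheduler did not reliably wait. With a custom thread pool a different problem appears: when a job finishes it has to update the database, but the scheduler has already stopped, which gives java.lang.IllegalStateException: JobStore is shutdown.
After some adjustment it worked: with Quartz's default thread pool and Spring's SchedulerFactoryBean, the scheduler does wait for jobs to finish before stopping. The earlier problem arose because the quartzScheduler had also been registered as a bean in Spring, so the scheduler was destroyed twice when the context shut down; the details still need analysis.
- Disallowing concurrent execution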
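Use the @DisallowConcurrentExecution annotation, which the custom class above already includes. Note that the ban on concurrent execution also holds in a clustered environment:
  - even when misfire compensation occurs
  - even when the job is triggered manually, provided it is triggered through the Quartz API
  - even when an interrupted job is recovered
- Job failure recovery
The custom class above has a requestsRecovery property. Note, however, that a job that throws an exception still counts as having completed normally; failure recovery is actually driven by the records in the qrtz_fired_triggers table.
- Manually triggering a job
Calling scheduler.triggerJob(jobKey); fires the job once, but it will not necessarily run immediately.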
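Key point: some thoughts on writing scheduled jobs (for discussion only)
- Short jobs scheduled frequently vs. long jobs scheduled rarely
Given Quartz's thread-pool model, a long-running job is unfriendly to the scheduler, and long-running jobs are harder to handle for interruption recovery and safe shutdown. Short jobs are probably the better choice.
- Exception handling in jobs
A job should catch and handle almost all exceptions itself, because the scheduling platform has no idea what to do with them if they are thrown to it.
- How to shut down safely and respond to interruption
  - If all jobs are short, you can shut down only after running jobs complete: set waitForJobsToCompleteOnShutdown to true on Spring's SchedulerFactoryBean and use Quartz's own thread pool.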
@Override
public void destroy() throws SchedulerException {
if (this.scheduler != null) {
logger.info("Shutting down Quartz Scheduler");
this.scheduler.shutdown(this.waitForJobsToCompleteOnShutdown);
}
}
public void shutdown(boolean waitForJobsToComplete) {
//...
schedThread.halt(waitForJobsToComplete);
notifySchedulerListenersShuttingdown();
if( (resources.isInterruptJobsOnShutdown() && !waitForJobsToComplete) ||
(resources.isInterruptJobsOnShutdownWithWait() && waitForJobsToComplete)) {
List<JobExecutionContext> jobs = getCurrentlyExecutingJobs();
for(JobExecutionContext job: jobs) {
if(job.getJobInstance() instanceof InterruptableJob)
try {
((InterruptableJob)job.getJobInstance()).interrupt();
} catch (Throwable e) {
// do nothing, this was just a courtesy effort
getLog().warn("Encountered error when interrupting job {} during shutdown: {}", job.getJobDetail().getKey(), e);
}
}
}
// With a custom thread pool, this call does nothing
resources.getThreadPool().shutdown(waitForJobsToComplete);
closed = true;
//...
}
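  - If some jobs run for a long time, it is better not to set waitForJobsToCompleteOnShutdown and instead make the jobs idempotent or record their progress. Otherwise, if the threads running those jobs never stop, the next step is kill -9, which causes unpredictable problems.
  - You can also implement interruptible jobs and have shutdown wait for jobs to complete. This accommodates several kinds of jobs and avoids surprises in production.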
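The interruptible-job idea can be sketched without Quartz. In the shutdown source above, the scheduler calls interrupt() on running jobs that implement InterruptableJob; the class below is a hypothetical stand-in with the same shape, not a Quartz job. It assumes the work can be split into small units, checks a flag between units, and records progress so that a later run can resume. The in-memory counter stands in for a progress record that would normally live in a database.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

final class InterruptibleBatchJob {

    private final AtomicBoolean interrupted = new AtomicBoolean(false);
    /** Index of the next unprocessed item; stand-in for a persisted progress record. */
    private final AtomicInteger nextItem;
    private final int totalItems;
    /** One short unit of work, ideally idempotent. */
    private final IntConsumer work;

    InterruptibleBatchJob(int resumeFrom, int totalItems, IntConsumer work) {
        this.nextItem = new AtomicInteger(resumeFrom);
        this.totalItems = totalItems;
        this.work = work;
    }

    /** Same role as InterruptableJob#interrupt(): ask the job to stop at the next safe point. */
    void interrupt() {
        interrupted.set(true);
    }

    /** Processes items one at a time; returns true only if the whole batch finished. */
    boolean execute() {
        while (nextItem.get() < totalItems) {
            if (interrupted.get()) {
                return false; // stop between items; progress is already recorded
            }
            work.accept(nextItem.get());
            nextItem.incrementAndGet(); // record progress only after the item is done
        }
        return true;
    }

    int progress() {
        return nextItem.get();
    }

    public static void main(String[] args) {
        InterruptibleBatchJob[] holder = new InterruptibleBatchJob[1];
        // Simulate a shutdown request arriving while item 2 is being processed.
        holder[0] = new InterruptibleBatchJob(0, 5, item -> {
            if (item == 2) holder[0].interrupt();
        });
        System.out.println(holder[0].execute() + " " + holder[0].progress()); // false 3

        // After restart, a new run resumes from the recorded progress.
        InterruptibleBatchJob resumed = new InterruptibleBatchJob(holder[0].progress(), 5, item -> { });
        System.out.println(resumed.execute() + " " + resumed.progress()); // true 5
    }
}
```

Because the job returns promptly once interrupted, waiting for jobs on shutdown stays bounded by the duration of a single unit of work rather than the whole batch.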
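- Quartz listeners
Whether it is a job listener or a trigger listener, catch all exceptions.
- Use a custom thread pool?
A custom thread pool may not interact well with Quartz. For example, the custom pool waits for all jobs to complete, and then the update to the database finds that Quartz's job store has already stopped. Painful...
- Draw clear boundaries: scheduling is scheduling, jobs are jobs, and make jobs idempotent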
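As a closing illustration of "make jobs idempotent" and "handle exceptions inside the job", here is a minimal sketch under stated assumptions. `IdempotentRunner` and the execution key are hypothetical: the key is whatever identifies one logical unit of work (for example a job name plus a business date). The in-memory set is only a stand-in; in a cluster the marker would have to live in shared storage, such as a table with a unique key, which would also close the check-then-act gap this simple version has.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class IdempotentRunner {

    enum Outcome { RAN, SKIPPED, FAILED }

    /** Keys of executions that completed; in-memory stand-in for a unique-keyed table. */
    private final Set<String> completed = ConcurrentHashMap.newKeySet();

    /**
     * Runs the body at most once per execution key.
     * A repeated call with a completed key (recovery, manual re-trigger) is skipped.
     */
    Outcome run(String executionKey, Runnable body) {
        if (completed.contains(executionKey)) {
            return Outcome.SKIPPED;
        }
        try {
            body.run();
            completed.add(executionKey); // mark as done only after success
            return Outcome.RAN;
        } catch (RuntimeException e) {
            // Handle the failure here (log, alert, record for retry) instead of
            // throwing it to the scheduler, which cannot decide what to do with it.
            return Outcome.FAILED;
        }
    }

    public static void main(String[] args) {
        IdempotentRunner runner = new IdempotentRunner();
        int[] calls = {0};
        System.out.println(runner.run("report@2024-01-01", () -> calls[0]++)); // RAN
        System.out.println(runner.run("report@2024-01-01", () -> calls[0]++)); // SKIPPED
        System.out.println(calls[0]);                                          // 1
    }
}
```

A failed execution is not marked as completed, so a later retry with the same key still runs.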