
I have a Spring application with clustered Quartz jobs that run at a periodic interval (every 1 minute). When the server starts everything seems fine, but after some time the jobs stop being triggered. Restarting the server makes the jobs run again, but the issue recurs after a while.

I suspected thread exhaustion, and from a thread dump I noticed that all 10 of my Quartz worker threads are in TIMED_WAITING.

Config:

org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 10
org.quartz.threadPool.threadPriority = 5

Thread dump:

quartzScheduler_Worker-10 - priority:10 - threadId:0x00007f8ae534d800 - nativeId:0x13c78 - state:TIMED_WAITING stackTrace:
    java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000066cd73220> (a java.lang.Object)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568)
        - locked <0x000000066cd73220> (a java.lang.Object)

Using Quartz 2.2.1 (I doubt this is a version-specific issue).

I verified from the logs that there are no DB connectivity issues.

Kindly help in diagnosing the problem. Is it possible that I have maxed out system resources (number of threads)? My jobs are synchronous and exit only when all their child threads have completed their tasks, and I also use the @DisallowConcurrentExecution annotation.

walen
Vinod Jayachandran
  • Can you explain or show us what your logic does? It looks like you have a memory leak and your threads never become free after each iteration. – Andriy Rymar Jun 19 '17 at 09:42
  • How could a memory leak be related to a thread not being freed after each iteration? – Vinod Jayachandran Jun 19 '17 at 09:50
  • Yeah, my bad, I meant a resource leak. In general, you are doing something that does not free the thread after execution. To help you, we need to see what your code does. Also, Quartz has a property that allows a new iteration only if the previous one has finished. You can enable it and see whether the issue is with functionality that never finishes executing. – Andriy Rymar Jun 19 '17 at 09:54
  • Can you point me to the property that allows a new iteration only if the previous one has finished? I have already used the Quartz annotation @DisallowConcurrentExecution. – Vinod Jayachandran Jun 19 '17 at 10:01
  • [Here](https://stackoverflow.com/questions/1636556/ensure-that-spring-quartz-job-execution-doesnt-overlap) is an explanation of how to prevent overlapping executions. – Andriy Rymar Jun 19 '17 at 11:57

1 Answer


The root cause was that we had too many misfires in our Quartz job. Quartz fires the trigger every minute, but the job doesn't always complete within a minute, so the missed firings pile up as misfires and Quartz tries to execute them first.

During this process Quartz updates the misfired triggers, and that update takes a long time, which causes Quartz to get stuck. This is evident from the thread dump, where all our Quartz worker threads are in the TIMED_WAITING state (waiting to be handed a job), as below:

quartzScheduler_Worker-10 - priority:10 - threadId:0x00007f8ae534d800 - nativeId:0x13c78 - state:TIMED_WAITING stackTrace:
    java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000066cd73220> (a java.lang.Object)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:568)
        - locked <0x000000066cd73220> (a java.lang.Object)

Reference: https://jira.terracotta.org/jira/si/jira.issueviews:issue-html/QTZ-357/QTZ-357.html
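
To illustrate how firings fall behind when a run outlasts the trigger interval, here is a minimal sketch in plain Java. It is not Quartz code and it ignores Quartz's misfire threshold and clustering; it only models a job that cannot overlap with itself. The class name, method name, and the numbers (60 s interval, 150 s run time) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MisfireBacklog {

    /**
     * Simplified model: a firing is "late" if the previous run is still
     * in progress at its scheduled time. Returns those scheduled times (seconds).
     */
    static List<Long> missedFireTimes(long intervalSec, long jobDurationSec, long horizonSec) {
        List<Long> missed = new ArrayList<>();
        long busyUntil = 0; // time at which the currently running job finishes
        for (long fireTime = 0; fireTime < horizonSec; fireTime += intervalSec) {
            if (fireTime < busyUntil) {
                missed.add(fireTime); // previous run still active, so this firing cannot start on time
            } else {
                busyUntil = fireTime + jobDurationSec; // job starts on schedule
            }
        }
        return missed;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: fire every 60 s, each run takes 150 s, observe 10 minutes.
        List<Long> missed = missedFireTimes(60, 150, 600);
        System.out.println(missed.size() + " late firings: " + missed);
    }
}
```

With these numbers the program prints `6 late firings: [60, 120, 240, 300, 420, 480]`, i.e. two out of every three scheduled firings cannot start on time.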

For our use case, misfires can be ignored and the work picked up with the next run. Hence I changed the misfire instruction to ignore misfires, as below:

<property name="misfireInstructionName" value="MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY" />
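
For anyone building triggers in code rather than Spring XML, a rough equivalent with the Quartz 2.x builder API might look like the sketch below. It assumes a cron trigger firing once a minute; the class name, trigger identities, and cron expression are hypothetical, since the original configuration does not show them. Per the Quartz Javadoc, the ignore-misfires policy makes the scheduler fire the trigger as soon as it can, which may produce several rapid catch-up firings; the cron-specific "do nothing" instruction instead waits for the next scheduled time, so it is shown alongside for comparison.

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.CronTrigger;
import org.quartz.TriggerBuilder;

public class MisfirePolicySketch {

    // Hypothetical schedule: second 0 of every minute.
    private static final String EVERY_MINUTE = "0 * * * * ?";

    /** Same policy as MISFIRE_INSTRUCTION_IGNORE_MISFIRE_POLICY in the XML above. */
    static CronTrigger ignoreMisfires() {
        return TriggerBuilder.newTrigger()
                .withIdentity("ignoreMisfiresTrigger", "exampleGroup") // hypothetical names
                .withSchedule(CronScheduleBuilder.cronSchedule(EVERY_MINUTE)
                        .withMisfireHandlingInstructionIgnoreMisfires())
                .build();
    }

    /** Alternative: drop missed firings and wait for the next scheduled time. */
    static CronTrigger skipMissedRuns() {
        return TriggerBuilder.newTrigger()
                .withIdentity("skipMissedRunsTrigger", "exampleGroup") // hypothetical names
                .withSchedule(CronScheduleBuilder.cronSchedule(EVERY_MINUTE)
                        .withMisfireHandlingInstructionDoNothing())
                .build();
    }

    public static void main(String[] args) {
        // Print the numeric misfire instruction stored on each trigger.
        System.out.println(ignoreMisfires().getMisfireInstruction());
        System.out.println(skipMissedRuns().getMisfireInstruction());
    }
}
```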
Vinod Jayachandran