1

I have some Java app using Spring Batch. I've got a table used as a queue which contains information on jobs that were requested by clients (as a client requests for a task to be executed, a row is added to this queue).

In one of my classes a while loop is run until someone deactivates some flag :

    protected void runJobLaunchingLoop() {
        while (!isTerminated()) {
            try {
                if (isActivated()) {
                    QueueEntryDTO queueEntry = dequeueJobEntry();
                    launchJob(queueEntry);
                }
            }
            catch (EmptyQueueException ignored) {}
            catch (Exception exception) {
                logger.error("There was a problem while de-queuing a job ('" + exception.getMessage() + "').");
            }
            finally {
                pauseProcessor();
            }
        }
    }

The pauseProcessor() method calls Thread.sleep(). When I run this app in a Docker container it looks like the number of threads run by the application keep on increasing. The threads have the name "Timer-X" with X some integer that auto-increments.

I looked at the stack trace of one of these :

    "Timer-14" - Thread t@128
   java.lang.Thread.State: WAITING
    at java.base@11.0.6/java.lang.Object.wait(Native Method)
    - waiting on <25e60c31> (a java.util.TaskQueue)
    at java.base@11.0.6/java.lang.Object.wait(Unknown Source)
    at java.base@11.0.6/java.util.TimerThread.mainLoop(Unknown Source)
    - locked <25e60c31> (a java.util.TaskQueue)
    at java.base@11.0.6/java.util.TimerThread.run(Unknown Source)

   Locked ownable synchronizers:
    - None

Any idea what could be the cause of this? I'm not sure but if I don't run the app in a container but locally from IntelliJ, it seems like the problem does not occur. I'm not sure because sometimes it takes a while before thread count starts increasing.


EDIT : Some relevant code ...

protected QueueEntryDTO dequeueJobEntry() {
        Collection<QueueEntryDTO> collection = getQueueService().dequeueEntry();
        if (collection.isEmpty())
            throw new EmptyQueueException();
        return collection.iterator().next();
    }

@Transactional
    public Collection<QueueEntryDTO> dequeueEntry() {
        Optional<QueueEntry> optionalEntry = this.queueEntryDAO.findTopByStatusCode(QueueStatusEnum.WAITING.getStatusCode());
        if (optionalEntry.isPresent()) {
            QueueEntry entry = (QueueEntry)optionalEntry.get();
            QueueEntry updatedEntry = this.saveEntryStatus(entry, QueueStatusEnum.PROCESSING, (String)null);
            return Collections.singleton(this.queueEntryDTOMapper.toDTO(updatedEntry));
        } else {
            return new ArrayList();
        }
    }

private void pauseProcessor() {
        try {
            Long sleepDuration = generalProperties.getQueueProcessingSleepDuration();
            sleepDuration = Objects.requireNonNullElseGet(
                    sleepDuration,
                    () -> Double.valueOf(Math.pow(2.0, getRetries()) * 1000.0).longValue());
            Thread.sleep(sleepDuration);
            if (getRetries() < 4)
                setRetries(getRetries() + 1);
        }
        catch (Exception ignored) {
            logger.warn("Failed to pause job queue processor.");
        }
    }
Bruno
  • 63
  • 7
  • Where's the code which creates the Timers? – tgdavies Jul 27 '21 at 10:35
  • I don't create any, so I'm rather surprised. Not sure how to do the profiling right. – Bruno Jul 27 '21 at 10:37
  • What debugging have you tried to do? Have you put a breakpoint in the constructors of Timer? – tgdavies Jul 27 '21 at 10:38
  • Without seeing how this code is launched or what code is being called, this is impossible to answer. – M. Deinum Jul 27 '21 at 10:38
  • I just put a breakpoint in the Timer constructors, and saw that threads are being created from within `dequeueJobEntry()`. Looks like it happens when making a connection. This is interesting because at some point an error also occurred in the container where there was a connection issue, too many connections or something of the sorts. I'll add the relevant code to my answer. – Bruno Jul 27 '21 at 10:43
  • Ok, added all code that is called from the loop. One method for dequeueing an entry which calls a method that interacts with a DAO to fetch some rows and then extracts the first one. Then lastly some method called at the end of each loop where the thread is paused for 2 seconds with `Thread.sleep()`. – Bruno Jul 27 '21 at 10:47
  • It's IBM DB2 that makes the Timers, and I get a similar problem as [here](https://www.ibm.com/support/pages/apar/IT30972) or [here](https://programmer.help/blogs/using-jvm-on-line-debugging-tool-to-troubleshoot-on-line-problems.html) but I'm running version 11.5 where this bug should have been fixed. Changed some of the relevant parameters such as the one mentioned in the links but still no luck. – Bruno Jul 28 '21 at 14:51

1 Answers1

0

It seems like this was caused by a bug that was resolved in a more recent version of DB2 than I was using.

Applications are getting large number of timer threads when API timerLevelforQueryTimeout value is not set explicitly in an application using JCC driver version 11.5 GA (JCC 4.26.14) or later. This issue is fixed in 11.5 M4 FP0(JCC 4.27.25).

I updated the version to a newer one (11.5.6) in my POM file, but this didn't fix the issue. Turns out my K8s pod was still using 11.5.0 and Maven acted weird. I then applied this technique (using dependencyManagement in the POM file) and the newer version was loaded.

Bruno
  • 63
  • 7