In a Node.js application I need to run a background task (in the general sense of the word) every 10 seconds. Most of the time, this task simply polls another web service; every now and then something changes in that web service and the background task updates the internal state of my application. FYI, this application runs in a Docker container in an OpenShift deployment.

Today I realized that the background task had stopped executing. The log files did not show any exceptions or other suspicious messages. The CPU and memory consumption of the OpenShift pod looked normal. Unfortunately, I did not have the verbose log level enabled, only info, and my Linux skills are not good enough to take a process dump and analyze it.

I am wondering if there is something that I am doing wrong with how I implemented this background task. Also, this is the first time I encountered this (AFAIK) and the application has been under development for over a year now, with application instances running sometimes 2 months or longer without issue.

let pollingActive = true;
const pollingInterval = 10000; // milliseconds
let pollingHandle = null;

const pollingMethod = () => {
    // Arrow functions have no `this` of their own, so the variables
    // declared above are used directly.
    if (!pollingActive) {
        return;
    }

    logger.verbose('Start of synchronization cycle.');
    performSync() // this is the polling and internal state update work
    .then(() => {
        pollingHandle = setTimeout(pollingMethod, pollingInterval);
    })
    .catch(err => {
        logger.info('Error while synchronizing.', err);
        pollingHandle = setTimeout(pollingMethod, pollingInterval);
    });
};

pollingMethod();

I double-checked my code and the only time pollingActive becomes false is when the Node.js application shuts down, which I know it didn't because it was still responding to regular REST API requests.

Any idea why this background task could have stopped working?

Side note: I am sure there are better ways to do background tasks in Node.js or a Docker container for that matter, including having the external web service send change notifications via a queue (which may still need to be polled ;-) However, because this project is on the "way out" I am in no position to make significant changes.

Christoph
  • "I double-checked my code and the only time _pollingActive_ becomes true is ..." Did you mean "the only time `pollingActive` becomes *false*"? – David Knipe May 23 '18 at 22:37
  • What about `performSync`? Are you sure it calls its success or error handler? – David Knipe May 23 '18 at 22:39
  • @DavidKnipe Thank you for pointing out that typo. Yes, it was supposed to be 'false'. – Christoph May 24 '18 at 09:32
  • I went, again, through the entire performSync logic to check all return paths for proper return of promises. Looks good to me. I'll turn on verbose logging for now, in case this happens again. If it does, the logs will hopefully provide a better clue. – Christoph May 24 '18 at 09:47
  • You could post `performSync` here, in case you've missed something. – David Knipe May 24 '18 at 21:23
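
Following up on the comments above: if `performSync` ever returns a promise that never settles, neither `.then` nor `.catch` runs, no new timer is scheduled, and polling stops silently without any log entry. Whether that is what happened here is unknown. The sketch below shows one possible guard, assuming Node.js 10 or later for `Promise.prototype.finally`. The names `withTimeout` and `startPolling` and their parameters are hypothetical and not part of the original code.

```javascript
// Hypothetical helper (not from the original post): rejects if `promise`
// has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`Timed out after ${ms} ms`)), ms);
    });
    // Whichever settles first wins; the guard timer is always cleared.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Polling loop that reschedules itself even if a cycle hangs or throws.
// `task`, `intervalMs`, `timeoutMs`, and `onError` are illustrative parameters.
function startPolling(task, intervalMs, timeoutMs, onError = console.error) {
    let active = true;
    let handle = null;

    const cycle = () => {
        if (!active) return;
        // Promise.resolve().then(task) also captures synchronous throws.
        withTimeout(Promise.resolve().then(task), timeoutMs)
            .catch(onError)
            .catch(() => {}) // ignore errors thrown by onError itself
            .finally(() => {
                if (active) handle = setTimeout(cycle, intervalMs);
            });
    };

    cycle();
    // The returned function stops the loop, e.g. on application shutdown.
    return () => { active = false; clearTimeout(handle); };
}
```

With the code from the question, this could be started as `const stop = startPolling(performSync, 10000, 30000, err => logger.info('Error while synchronizing.', err));`, where the 30-second timeout is an arbitrary choice. Note that the timeout only unblocks the loop; it does not cancel whatever the hung `performSync` call is still waiting on.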

1 Answer

This is an interesting one. With the information given, it remains a mystery to me what exactly happened, and we would need more details to say for sure. It did, however, lead me to some interesting information in the Node.js documentation about timers:

The timeout interval that is set cannot be relied upon to execute after that exact number of milliseconds. This is because other executing code that blocks or holds onto the event loop will push the execution of the timeout back. The only guarantee is that the timeout will not execute sooner than the declared timeout interval.

Apparently, Node.js gives no guarantee about exactly when a timer will fire. Personally, I would advise you to use something the operating system provides, if that's a possibility. Try looking into cron, a task scheduler available on Linux and other operating systems. This Stack Overflow question has some good pointers and links for further reading.
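
The documentation excerpt quoted above says that a timer can fire late; it does not say that the timer is dropped. A small sketch to illustrate this, where `measureTimerDelay` is a made-up name for this example: it sets a timer and then blocks the event loop with a busy-wait, so the callback can only run once the blocking code has finished.

```javascript
// Measures how late a timer fires when synchronous code blocks the event loop.
// `timeoutMs` is the requested delay, `blockMs` is how long we busy-wait.
function measureTimerDelay(timeoutMs, blockMs) {
    return new Promise((resolve) => {
        const start = Date.now();
        setTimeout(() => resolve(Date.now() - start), timeoutMs);
        // Busy-wait: nothing else, including the timer callback, can run.
        while (Date.now() - start < blockMs) { /* spin */ }
    });
}

measureTimerDelay(50, 300).then((elapsed) => {
    // The timer was set for 50 ms but cannot fire before the 300 ms block ends.
    console.log(`Timer set for 50 ms fired after about ${elapsed} ms`);
});
```

The callback still runs, only later than requested, so this behavior alone would explain delayed polling cycles but not a loop that stops for good.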

NocNit
  • I wouldn't worry about that paragraph in the docs. I think it should still execute the function eventually. – David Knipe May 23 '18 at 22:36
  • The event loop is not entirely blocked, because the application still responds promptly to requests to its various REST API controllers. – Christoph May 24 '18 at 09:45