12

I had a bunch of jobs in a Bull queue when one got stuck for over an hour (it normally takes ~2 minutes to run) without failing. I was unable to remove the job from the active state with the Bull Arena UI that I use, so I deleted the active job's key in Redis.

That removed the stuck active job, but now the queue isn't pulling any jobs off of the waiting list.

Any ideas on how to fix it?

steveryan
  • I am facing a similar issue: jobs seem to be stuck in the waiting state. Did you manage to fix this? – opensource-developer May 08 '20 at 09:14
  • 2
    I renamed the queue in my source code and then redeployed. Definitely not an ideal solution but it did get everything unstuck and flowing again. – steveryan May 09 '20 at 16:24
  • Thanks @steveryan for your reply. I had another question: did you implement the process method as a separate process? https://github.com/OptimalBits/bull#separate-processes. I wanted to know how to find out whether it performs better than the regular approach. – opensource-developer May 09 '20 at 22:06
  • 2
    @steveryan I am seeing the same issue. Renaming the queue did fix the issue. Have you or anyone come to a root cause for this issue? – Bruce Schardt Jul 14 '21 at 18:35
  • Also happened to me. Random jobs seem to stay in the WAITING state and, after reaching max retries, they fail. Renaming the queue fixes it. I think something becomes stale in Redis when you manually remove a key, which leads to this behaviour. Renaming is the same as creating a new queue from scratch. – Ianthe the Duke of Nukem Aug 05 '21 at 05:27
  • I have the same issue, but with jobs stuck in the active state. I run 30 concurrent processes, and the number of stuck jobs differs from day to day, but when I retry them manually the queue runs properly. Is it better to rename the queue? – Vzans Jul 29 '22 at 22:53
  • Worth mentioning: if you have development and production versions, make sure to give their queues distinct names. Otherwise any of the workers might pick up the task. If the codebases are on different versions, you will get odd results, such as the debugger not stopping. – Tajs Mar 01 '23 at 10:56
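
To illustrate the last comment, here is a minimal sketch of one way to keep development and production queues apart by deriving the queue name from the environment. The helper name queueNameFor and the use of NODE_ENV are assumptions for illustration, not something from the thread.

```javascript
// Hypothetical helper (not from the thread): build an environment-specific
// queue name so that development and production workers never share a queue.
function queueNameFor(baseName, env = process.env.NODE_ENV || 'development') {
  // A hyphen is used as the separator between base name and environment.
  return `${baseName}-${env}`;
}
```

The resulting string would then be passed as the queue name wherever the queue and its workers are created, so that each environment only sees its own jobs.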

2 Answers

0

I was having the same problem, and then I realized I hadn't added the connection option for the worker. If the worker has no connection option, add the same one the queue uses.

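// Worker is assumed to come from BullMQ, which provides this constructor.
import { Worker } from 'bullmq';

// Pass the same Redis connection that the queue was created with.
new Worker(
  QUEUE_NAME,
  async job => {
    // ...
  },
  {
    connection: redisConnection,
  },
)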
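
A rough sketch of how the queue and the worker could share one set of connection options, assuming BullMQ. The queue name, the REDIS_HOST and REDIS_PORT variable names, and both helper functions are illustrative assumptions, not part of the answer.

```javascript
// Sketch only: QUEUE_NAME, the env variable names and both helpers are
// illustrative assumptions.
const QUEUE_NAME = 'my-queue';

// Build one connection options object from the environment.
function redisConnectionFrom(env = process.env) {
  return {
    host: env.REDIS_HOST || '127.0.0.1',
    port: Number(env.REDIS_PORT || 6379),
  };
}

// Create the queue and the worker with the SAME connection options.
function createQueueAndWorker(processor) {
  // Required lazily so the helper above works without bullmq installed.
  const { Queue, Worker } = require('bullmq');
  const connection = redisConnectionFrom();
  const queue = new Queue(QUEUE_NAME, { connection });
  const worker = new Worker(QUEUE_NAME, processor, { connection });
  return { queue, worker };
}
```

Calling createQueueAndWorker(async job => { /* ... */ }) would then give a queue and a worker that both point at the same Redis instance, which is the situation the answer describes.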
Ahmet Şimşek
-1

I faced a similar issue a moment ago. I was able to work around it by changing the order in which my program defines the queue, adds jobs to it, and defines the process.

Initially I had a flow like this:

  1. Create queue
  2. Define the process
  3. Add to the queue

But after facing the issue I changed it to:

  1. Create queue
  2. Add to the queue
  3. Define the process
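
The reordered flow above could look roughly like this with Bull. This is only a sketch of the ordering described in this answer; the helper name addThenProcess, the queue name, and the Redis URL are assumptions, and whether the order actually matters has not been confirmed.

```javascript
// Steps 2 and 3 of the reordered flow, applied to an already created queue.
// addThenProcess is a hypothetical helper, not part of Bull.
function addThenProcess(queue, payload, handler) {
  const added = queue.add(payload); // 2. Add to the queue (returns a promise in Bull)
  queue.process(handler);           // 3. Define the process
  return added;
}

// Usage sketch (needs the bull package and a running Redis):
//   const Queue = require('bull');
//   const queue = new Queue('my-queue', 'redis://127.0.0.1:6379'); // 1. Create queue
//   addThenProcess(queue, { id: 1 }, async job => { /* ... */ });
```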