What's the best way to debug delayed_job not restarting?

Running restart reports that the workers are starting, but grepping afterwards shows no delayed_job processes.

$ RAILS_ENV=production ruby script/delayed_job -n3 --pid-dir=/dem/pids/ restart
Warning: no instances running. Starting...
Warning: no instances running. Starting...
Warning: no instances running. Starting...

$ ps -aux | grep delay
produser    3471  0.0  0.0   7232   612 pts/4    S+   10:28   0:00 tail -f delayed_job.log
produser    4059  0.0  0.0  11740   928 pts/0    S+   10:32   0:00 grep --color=auto delay

$ RAILS_ENV=production ruby script/delayed_job -n3 --pid-dir=/dem/pids/ restart
Warning: no instances running. Starting...
Warning: no instances running. Starting...
Warning: no instances running. Starting...

While tailing the delayed_job.log file, it simply shows the workers starting and then silently failing.

2017-02-01T10:48:04-0800: [Worker(delayed_job.0 host:app pid:6257)] worker started
2017-02-01T10:48:04-0800: [Worker(delayed_job.1 host:app pid:6267)] worker started
Blair Anderson

1 Answer

I found the answer to my own question. Logging it here in case someone finds this in a search.

If you find yourself in the same situation, try running the process in the foreground (not the background):

RAILS_ENV=production ruby script/delayed_job --pid-dir=/dem/pids/ run

The output was:

found unexpected end of stream while scanning a quoted scalar at line 165 column 14

Googling the error turned up a result from the DJ source: "Malformed yaml in handler could crash all delayed_job workers".


This was a great lead.

  • If the handler data of a job is malformed, DJ will fail silently.
  • If you process emails in DJ, you're likely to have some massive handler columns, because when someone forwards an email the whole thread is included. Yikes.

What did I do? I queried delayed_jobs for our email handler.

What was the result? One job with a freaking massive handler.

You could probably also just query for the row with the longest handler column.
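
A rough sketch of the "longest column" idea in plain Ruby. It assumes you can get `[id, handler]` pairs out of the table, for example with `Delayed::Job.pluck(:id, :handler)` on the ActiveRecord backend; `largest_handlers` and the sample rows are made up for illustration.

```ruby
# Hypothetical helper: given [id, handler] pairs, return the `limit` largest
# handlers as [id, size_in_bytes], biggest first.
def largest_handlers(rows, limit = 5)
  rows.max_by(limit) { |_id, handler| handler.to_s.bytesize }
      .map { |id, handler| [id, handler.to_s.bytesize] }
end

# Made-up sample data standing in for rows from the delayed_jobs table.
rows = [
  [1, "a" * 10],
  [2, "b" * 50_000],
  [3, "c" * 200]
]

p largest_handlers(rows, 2)
# In a Rails console (assumption: ActiveRecord backend) the input could be:
#   largest_handlers(Delayed::Job.pluck(:id, :handler))
```

With the sample data this prints `[[2, 50000], [3, 200]]`. On a large table, sorting by handler length in SQL would avoid loading every handler into memory.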

After deleting that job and running the run command again, the worker started processing as normal.

Blair Anderson