22

My Flask app consists of four containers: web app, Postgres, RabbitMQ, and Celery. Since I have Celery tasks that run periodically, I am using Celery beat. I've configured my docker-compose file like this:

version: '2'
services:
  rabbit:
    # ...      
  web:
    # ...
  postgres:
    # ...
  celery:
    build:
        context: .
        dockerfile: Dockerfile.celery

And my Dockerfile.celery looks like this:

# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-B", "-l", "INFO"]

While I read in the docs that I shouldn't go to production with the -B option, I hastily added it anyway (and forgot about changing it) and quickly learned that my scheduled tasks were running multiple times. For those interested, if you do a ps aux | grep celery from within your celery container, you'll see multiple celery + beat processes running (but there should only be one beat process and however many worker processes). I wasn't sure from the docs why you shouldn't run -B in production, but now I know: -B embeds a beat scheduler in every worker it's passed to, so as soon as more than one such worker is running, each one schedules the same periodic tasks.

So then I changed my Dockerfile.celery to:

# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-l", "INFO"]
CMD ["celery", "-A", "app.tasks.celery", "beat", "-l", "INFO"]

Now when I start my app, the worker processes start but beat does not. When I flip those commands around so that beat is called first, then beat starts but the worker processes do not. So my question is: how do I run celery worker + beat together in my container? I have combed through many articles/docs but I'm still unable to figure this out.

EDITED

I changed my Dockerfile.celery to the following:

ENTRYPOINT [ "/bin/sh" ]
CMD [ "./docker.celery.sh" ]    

And my docker.celery.sh file looks like this:

#!/bin/sh -ex
celery -A app.tasks.celery beat -l debug &
celery -A app.tasks.celery worker -l info &

However, I'm receiving the error celery_1 exited with code 0

Edit #2

I added the following blocking command to the end of my docker.celery.sh file and all was fixed:

tail -f /dev/null
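
For anyone following along, the complete docker.celery.sh at this point is just the two commands from the edit above plus the blocking tail:

#!/bin/sh -ex

# start beat and the worker in the background
celery -A app.tasks.celery beat -l debug &
celery -A app.tasks.celery worker -l info &

# block so the container's main process keeps running
tail -f /dev/null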
hugo
  • 4
    You can create another container just for beat and override its worker command... when you scale your workers you will be scaling just them and not the beat (scheduler) too (see the compose sketch below this comment thread) – Mazel Tov Apr 06 '18 at 13:15
  • 1
    @MazelTov - good suggestion and for my next project I'll consider putting these in separate containers. For various reasons, I needed both of these processes to run in the same container. – hugo Apr 06 '18 at 15:45
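
For readers who do want the separate-container setup Mazel Tov suggests, a minimal docker-compose sketch could look like this (the service names and the shared Dockerfile.celery are assumptions carried over from the question; `command:` overrides the image's CMD per service, so scaling the worker service leaves the single beat service untouched):

  celery:
    build:
      context: .
      dockerfile: Dockerfile.celery
    command: celery -A app.tasks.celery worker -l INFO
  celery-beat:
    build:
      context: .
      dockerfile: Dockerfile.celery
    command: celery -A app.tasks.celery beat -l INFO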

4 Answers

11

Docker runs only one CMD, so only one of your two CMDs gets executed. The workaround is to create a bash script that executes both worker and beat, and use the Docker CMD to run that script.
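
A minimal sketch of that idea in Dockerfile.celery, assuming the script is the docker.celery.sh from the question's edit and gets copied into the image:

# ...code up here...
COPY docker.celery.sh /docker.celery.sh
RUN chmod +x /docker.celery.sh
CMD ["/docker.celery.sh"]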

shahaf
  • 2
    thanks for the answer but, unfortunately, it's not working for me. Using your suggestion, I encounter the same issue: that is, in my Dockerfile.celery I'm running ENTRYPOINT [ "/bin/sh" ] followed by CMD [ "./docker.celery.sh" ]. The shell script has the two commands: `celery -A app.tasks.celery beat -l debug` followed by `celery -A app.tasks.celery worker -l info`. Just as before, the first command executes but the second does not. Am I doing something wrong? – hugo Apr 06 '18 at 13:54
  • Does the first line end with `&`, meaning execute in the background and continue to the next line? (The second line should also end with `&` as a best practice.) – shahaf Apr 06 '18 at 13:58
  • I've added & to the end of each command but upon starting up my container I get the following error `celery_1 exited with code 0`. I've added my shell script to my original post. Any other suggestions? – hugo Apr 06 '18 at 14:14
  • It seems that the container runs all the services and then exits. Try adding a blocking command at the end of the bash script, something like `tailf` on some file – shahaf Apr 06 '18 at 14:20
  • Your suggestions have been helpful. Everything works now! One tiny cleanup issue remains, that is: I did a `tail -f blank.log` where the blank.log file only contains a comment and nothing else. When the command runs, my console complains with `tail: unrecognized file system type 0x794c7630 for 'blank.log'`. It's not a big deal since everything works but it'd be nice to address the error. Any suggestions? – hugo Apr 06 '18 at 15:39
  • 1
    FYI: I fixed the last error I was getting with `tail -f /dev/null`. – hugo Apr 06 '18 at 16:52
  • I think this approach works but I would not recommend it. You are losing the output of celery beat and you will not know what happens inside the container because the log is thrown away – Mazel Tov Apr 06 '18 at 17:20
  • @MazelTov is right, better practice is to append the celery output to some logfile (best practice is to make this logfile persistent on the host) – shahaf Apr 06 '18 at 17:24
  • Everything that prints to STDOUT in a docker container is a log :-). I think you are better off with the `-B` option when running the worker – Mazel Tov Apr 06 '18 at 17:27
  • The other option is to use something like dumb-init. – Divick Feb 21 '19 at 11:44
  • @hugo I needed to do the same thing as you, and your last Edit solved it quickly for me. `celery -A app.tasks.celery beat -l debug &` `celery -A app.tasks.celery worker -l info &` `tail -f /dev/null` Maybe it should be added as an answer. – Carl Kroeger Ihl Jul 31 '19 at 17:50
1

I got it working by putting the commands in an entrypoint script as explained above, plus I added &> to send the output to a log file.

my entrypoint.sh

#!/bin/bash
python3 manage.py migrate

python3 manage.py migrate catalog --database=catalog

python manage.py collectstatic --clear --noinput --verbosity 0


# Start a Celery worker in the background, logging to a file
celery worker --workdir /app --app dri -l info &> /log/celery.log  &

# Start a second worker with the beat scheduler embedded (--beat)
celery worker --workdir /app --app dri -l info --beat &> /log/celery_beat.log  &

python3 manage.py runserver 0.0.0.0:8000

1

Starting from the same concept @shahaf highlighted, I solved it based on this other solution, using bash -c in this way:

command: bash -c "celery -A app.tasks.celery beat & celery -A app.tasks.celery worker --loglevel=debug"
manuele
0

You can use celery beatX for beat. It is allowed (and recommended) to have multiple beatX instances. They use locks to synchronize.

I cannot say whether it is production-ready, but it works for me like a charm (with the -B option)

baldr
  • Celery beatX looks interesting. However, it currently seems to only work with Redis and Memcached. I'm using RabbitMQ and would prefer not to add more technology to my stack. But it's good to know about this so thanks for the suggestion. – hugo Apr 06 '18 at 15:44