
After upgrading from Celery 2.4.5 I have started seeing Celery randomly shut down.
I am using celery 3.0.12, boto 2.6, Amazon SQS and Django 1.2.7, all of this on a CentOS machine (pip freeze dump at the bottom).

I am running:

service celerybeat start
service celeryd start

A few seconds after I start Celery it stops (shuts down), and if I look into one of the Celery logs I always see this:

[2012-12-31 10:13:40,275: INFO/MainProcess] Task patrol.tasks.test[270f1558-bcc2-441b-8961 e1f21a2dbd27] succeeded in 0.318082094193s: None
[2012-12-31 10:13:40,424: INFO/MainProcess] child process calling self.run()
[2012-12-31 10:13:40,428: INFO/MainProcess] Got task from broker: patrol.tasks.myTask[d9a5ab26-71ca-448b-a4da-40315570f219]
[2012-12-31 10:13:40,666: INFO/MainProcess] Got task from broker: tasks.test[99edb7e2-caff-4892-a95b-c18a9d7f5c51]
[2012-12-31 10:13:41,114: WARNING/MainProcess] Restoring 2 unacknowledged message(s).
[2012-12-31 10:13:41,115: WARNING/MainProcess] UNABLE TO RESTORE 2 MESSAGES: (TypeError('<boto.sqs.message.Message instance at 0x3269758> is not JSON serializable',), TypeError('<boto.sqs.message.Message instance at 0x32697e8> is not JSON serializable',))
[2012-12-31 10:13:41,116: WARNING/MainProcess] EMERGENCY DUMP STATE TO FILE -> /tmp/tmppO4Bbp <-
[2012-12-31 10:13:41,116: WARNING/MainProcess] Cannot pickle state: TypeError('a class that defines __slots__ without defining __getstate__ cannot be pickled',). Fallback to pformat.

I use low values for maxtasksperchild to recreate the shutdown quickly; if I use a higher value it takes longer before the shutdown occurs.
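For what it's worth, the "not JSON serializable" part is easy to reproduce in isolation: json.dumps simply cannot handle a raw boto Message object, which appears to be what gets handed back to the broker on shutdown (a rough sketch; the message body is arbitrary):

import json
from boto.sqs.message import Message

msg = Message(body='hello')
try:
    json.dumps(msg)
except TypeError as exc:
    # should print something like:
    # <boto.sqs.message.Message instance at 0x...> is not JSON serializable
    print exc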

EDIT

While trying to isolate the problem I removed all periodic tasks. Now I have only one periodic task and one task, which basically do nothing, and I can still reproduce the bug every time.

import datetime

from celery.task import task, PeriodicTask


@task
def myTask():
    # trivial task that does nothing
    print 1
    return


class test(PeriodicTask):
    run_every = datetime.timedelta(seconds=3)

    def run(self, **kwargs):
        # queue myTask every 3 seconds
        myTask.delay()
        print '2'

My /etc/init.d/celeryd:

celeryd

My /etc/default/celeryd:

# Names of the nodes to start; here we have two nodes:
CELERYD_NODES="w1 w2"

CELERYD_LOG_LEVEL="INFO"

# Where to chdir at start.
CELERYD_CHDIR="/var/myproject"

# How to call "manage.py celeryd_multi"
CELERYD_MULTI="python $CELERYD_CHDIR/manage.py celeryd_multi"

# How to call "manage.py celeryctl"
CELERYCTL="python $CELERYD_CHDIR/manage.py celeryctl"

MAXTASKPERCHILD=2 # this is low on purpose to recreate the shutdown fast
CELERY_CONC=5
EXPRESS_CONC=2
# Extra arguments to celeryd
CELERYD_OPTS="-Q:w1 celery,backup -c:w1 $CELERY_CONC -Q:w2 express -c:w2 $EXPRESS_CONC --time-limit=3600 --maxtasksperchild=$MAXTASKPERCHILD -E"

# Name of the celery config module.
CELERY_CONFIG_MODULE="celeryconfig"

# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/var/log/celeryd/%n.log"
CELERYD_PID_FILE="/var/run/celeryd/%n.pid"

# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="settings"

# Path to celerybeat
CELERYBEAT="python $CELERYD_CHDIR/manage.py celerybeat"

# Extra arguments to celerybeat. The schedule file below stores the
# scheduled tasks and is created automatically when celerybeat starts.
CELERYBEAT_OPTS="--schedule=/var/run/celerybeat-schedule"

# Log level. Can be one of DEBUG, INFO, WARNING, ERROR or CRITICAL.
CELERYBEAT_LOG_LEVEL="INFO"

# Log file locations
CELERYBEAT_LOGFILE="/var/log/celeryd/celerybeat.log"
CELERYBEAT_PIDFILE="/var/run/celeryd/celerybeat.pid"

My pip freeze:

Django==1.2.7
M2Crypto==0.20.2
MySQL-python==1.2.3c1
amqp==1.0.6
amqplib==1.0.2
anyjson==0.3.3
billiard==2.7.3.19
boto==2.1.1
celery==3.0.12
certifi==0.0.6
distribute==0.6.10
django-celery==3.0.11
django-kombu==0.9.4
django-picklefield==0.3.0
ghettoq==0.4.5
importlib==1.0.2
iniparse==0.3.1
ipython==0.12
kombu==2.5.4
lxml==2.3.4
mixpanel-celery==0.5.0
netaddr==0.7.6
numpy==1.6.2
odict==1.4.4
ordereddict==1.1
pycrypto==2.6
pycurl==7.19.0
pygooglechart==0.3.0
pygpgme==0.1
python-dateutil==1.5
python-memcached==1.48
pytz==2012h
requests==0.9.0
six==1.2.0
urlgrabber==3.9.1
yum-metadata-parser==1.1.2
yossi

3 Answers


I suppose you have an application that puts something into your queue that is not in JSON format, a plain text message for example. You can try changing the queue in your Celery settings and see how it works.
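
If you want to check what is actually sitting in the queue, here is a rough sketch with boto (the region, credentials and queue name are placeholders, adjust them to your setup):

import boto.sqs

conn = boto.sqs.connect_to_region('us-east-1')   # placeholder region
queue = conn.get_queue('celery')                 # default Celery queue name
for m in queue.get_messages(num_messages=5, visibility_timeout=0):
    print m.get_body()                           # Celery tasks should be JSON payloads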

Rustem
  • Putting the message happens after it is already shutting down; even if I make it not requeue the message, it still shuts down. – yossi Dec 27 '12 at 08:00
  • Celery first reads tasks as messages from the broker queue, and you see an exception because of an unexpected task format. Check your queue entries in RabbitMQ or in SQS. – Rustem Dec 27 '12 at 09:19
  • The JSON error happens while trying to shut down; the code is in the close method in Kombu, here: https://github.com/celery/kombu/blob/master/kombu/transport/virtual/__init__.py – yossi Dec 27 '12 at 09:26

Upgrade Kombu to any version after 2.5.4, because this was a bug in Kombu. It is fixed in any release of Celery >= 3.0.20 (which depends on kombu>=2.5.5).

See http://kombu.readthedocs.org/en/latest/changelog.html#version-2-5-5
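
One way to apply that, assuming you install with pip, is simply:

pip install -U "celery>=3.0.20" "kombu>=2.5.5"

which pulls in a Kombu release that includes the fix.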

Hugo Lopes Tavares

In my case I was using the wrong version of Python (3.8), but I needed an older one.

So I switched versions:

pyenv install 3.6.7              # install the Python version the project needs
pyenv local 3.6.7                # pin this directory to 3.6.7
pyenv local                      # confirm the pinned version
virtualenv -p python3.6 venv     # create a virtualenv with that interpreter
source venv/bin/activate
pip install -r requirements.txt
jmunsch