32

I have Celery tasks that are received but will not execute. I am using Python 2.7 and Celery 4.0.2. My message broker is Amazon SQS.

This is the output of celery worker:

$ celery worker -A myapp.celeryapp --loglevel=INFO
[tasks]
  . myapp.tasks.trigger_build

[2017-01-12 23:34:25,206: INFO/MainProcess] Connected to sqs://13245:**@localhost//
[2017-01-12 23:34:25,391: INFO/MainProcess] celery@ip-111-11-11-11 ready.
[2017-01-12 23:34:27,700: INFO/MainProcess] Received task: myapp.tasks.trigger_build[b248771c-6dd5-469d-bc53-eaf63c4f6b60]

I have tried adding -Ofair when running celery worker but that did not help. Some other info that might be helpful:

  • Celery always receives 8 tasks, although there are about 100 messages waiting to be picked up.
  • About once in every 4 or 5 times a task actually will run and complete, but then it gets stuck again.
  • This is the result of ps aux. Notice that it is running celery in 3 different processes (not sure why) and one of them has 99.6% CPU utilization, even though it's not completing any tasks or anything.

Processes:

$ ps aux | grep celery
nobody    7034 99.6  1.8 382688 74048 ?        R    05:22  18:19 python2.7 celery worker -A myapp.celeryapp --loglevel=INFO
nobody    7039  0.0  1.3 246672 55664 ?        S    05:22   0:00 python2.7 celery worker -A myapp.celeryapp --loglevel=INFO
nobody    7040  0.0  1.3 246672 55632 ?        S    05:22   0:00 python2.7 celery worker -A myapp.celeryapp --loglevel=INFO

Settings:

CELERY_BROKER_URL = 'sqs://%s:%s@' % (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY.replace('/', '%2F'))
CELERY_BROKER_TRANSPORT = 'sqs'
CELERY_BROKER_TRANSPORT_OPTIONS = {
    'region': 'us-east-1',
    'visibility_timeout': 60 * 30,
    'polling_interval': 0.3,
    'queue_name_prefix': 'myapp-',
}
CELERY_BROKER_HEARTBEAT = 0
CELERY_BROKER_POOL_LIMIT = 1
CELERY_BROKER_CONNECTION_TIMEOUT = 10

CELERY_DEFAULT_QUEUE = 'myapp'
CELERY_QUEUES = (
    Queue('myapp', Exchange('default'), routing_key='default'),
)

CELERY_ALWAYS_EAGER = False
CELERY_ACKS_LATE = True
CELERY_TASK_PUBLISH_RETRY = True
CELERY_DISABLE_RATE_LIMITS = False

CELERY_IGNORE_RESULT = True
CELERY_SEND_TASK_ERROR_EMAILS = False
CELERY_TASK_RESULT_EXPIRES = 600

CELERY_RESULT_BACKEND = 'django-db'
CELERY_TIMEZONE = TIME_ZONE

CELERY_TASK_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['application/json']

CELERYD_PID_FILE = "/var/celery_%N.pid"
CELERYD_HIJACK_ROOT_LOGGER = False
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_MAX_TASKS_PER_CHILD = 1000

Report:

$ celery report -A myapp.celeryapp

software -> celery:4.0.2 (latentcall) kombu:4.0.2 py:2.7.12
            billiard:3.5.0.2 sqs:N/A
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> celery.loaders.app.AppLoader
settings -> transport:sqs results:django-db
grantmcconnaughey

6 Answers

25

First install eventlet,


> pip install eventlet

and then run


> celery -A myapp.celeryapp worker --loglevel=info -P eventlet

Ibad Shah
  • 2
    Please explain what this does? – alias51 Sep 24 '21 at 08:57
  • 1
    IDK why this worked, but it did. Thanks. The broker sent messages but the worker did nothing. A side-effect of using this solution was that I also started seeing worker logs in my console. IDK if I'll face any issues during production deployment but at least I can work on my app now. – Hussain Jan 03 '22 at 16:42
  • Worked for me. Not sure why, but thanks! – JupiterT Jan 06 '22 at 15:02
  • 5
    @alias51, @Hussain: It worked also in my case. After some research I found [this article](https://www.distributedpython.com/2018/08/21/celery-4-windows/). Long story short: default concurrency pool `prefork` doesn't work on Windows. – myszon Apr 22 '22 at 12:28
  • This also worked for me. Specifically, it showed me that the worker could not find redis and that I had to change the redis broker URL to `redis://127.0.0.1:6379` (instead of `redis://localhost:6379`) in my `settings.py`. – tok3rat0r Jul 08 '22 at 09:52
  • awesome it worked for me, @myszon's point "Long story short: default concurrency pool prefork doesn't work on Windows. " helped me. – asad abbas Jul 11 '22 at 07:11
  • Took me two days solving this trivial thing. Turns out to be windows issue. – SolessChong Apr 23 '23 at 08:36
11

I think you are running Celery on Windows. Try adding the following parameter to your command:

-P solo

so the full set of arguments becomes:

-A main worker --loglevel=info --queues=your_queue_name -P solo
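
If the same project has to start on both Windows and Linux, the pool can be picked at launch time. This is a minimal sketch, assuming the Windows limitation of the default prefork pool that the comments above describe; `choose_pool` is a hypothetical helper and not part of Celery:

```python
import platform


def choose_pool(system_name=None):
    """Return a worker pool name suited to the operating system.

    `system_name` defaults to platform.system(); it is a parameter only
    so the function can be exercised without switching machines.
    """
    if system_name is None:
        system_name = platform.system()
    # Assumption taken from this thread: prefork is unreliable on Windows,
    # so fall back to the single-process solo pool there.
    if system_name == "Windows":
        return "solo"
    return "prefork"


# Arguments to append to the worker command, e.g. ['-P', 'solo'] on Windows.
pool_args = ["-P", choose_pool()]
```

Keep in mind that the solo pool runs one task at a time, so this only suits development or low-volume workers.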
10

I was getting the same issue. After a bit of searching, I found that adding --without-gossip --without-mingle --without-heartbeat -Ofair to the Celery worker command line fixes it. In your case the worker command would be:

celery worker -A myapp.celeryapp --loglevel=INFO --without-gossip --without-mingle --without-heartbeat -Ofair
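
When the worker is launched from a script, the same flags can be assembled in one place so they are easy to switch on and off while debugging. The following is only a sketch: `build_worker_argv` is a hypothetical helper, and the argument order follows the command in the question (Celery 4.0.2).

```python
def build_worker_argv(app_path, loglevel="INFO", quiet_cluster=True, pool=None):
    """Assemble the `celery worker` command line as a list of arguments."""
    argv = ["celery", "worker", "-A", app_path, "--loglevel=" + loglevel]
    if quiet_cluster:
        # Turn off the inter-worker features named in this answer.
        argv += ["--without-gossip", "--without-mingle", "--without-heartbeat"]
    argv.append("-Ofair")
    if pool is not None:
        # Optional pool override, e.g. "solo" or "eventlet".
        argv += ["-P", pool]
    return argv


argv = build_worker_argv("myapp.celeryapp")
command_line = " ".join(argv)
```

The resulting list can be handed to `subprocess.call(argv)`, or `command_line` can be pasted into a shell.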

Vishnu
  • 4
    can you explain why? – weaming Aug 08 '19 at 13:40
  • 1
    @weaming I explained what's (probably) happening in my answer – kontextify Feb 04 '20 at 14:44
  • 13
    This didn't work for me. According to this [post](https://github.com/celery/celery/issues/3759#issuecomment-311763355), I added `-P solo` to the command like: `celery -A proj worker --loglevel=INFO --concurrency 1 -P solo` – Brian Dec 15 '20 at 07:13
  • worked like a charm! Thank you so much, man! – Ankit Brijwasi Feb 26 '21 at 03:52
  • 1
    The solo pool will work as an execution mode. However, it is single-threaded and doesn't execute jobs in parallel, and it ignores the concurrency option. Please be aware of this in production. – hackwithharsha Jun 18 '21 at 05:25
  • That didn't work for me. I am using Windows 11, and I got the error: subprocess.CalledProcessError: Command 'ver' returned non-zero exit status 1. – Lucioric2000 Nov 20 '22 at 00:18
5

Disabling worker gossip (--without-gossip) was enough to solve this for me on Celery 3.1. It looks like a bug causes this inter-worker communication to hang when CELERY_ACKS_LATE is enabled. Tasks are indeed received, but never acknowledged or executed. Stopping the worker returns them to the queue.

From the docs on gossip:

This means that a worker knows what other workers are doing and can detect if they go offline. Currently this is only used for clock synchronization, but there are many possibilities for future additions and you can write extensions that take advantage of this already.

So chances are you aren't using this feature anyways, and what's more it increases the load on your broker.

No time to investigate, but would be good to test this with the latest Celery and open an issue if it still occurs. Even if this behaviour is expected/unavoidable, that should be documented.

kontextify
3

I had the same issue, and Vishnu's answer worked for me. There may be another solution that doesn't require adding these extra parameters to the worker command.

My issue was caused by importing other modules in the middle of the task code. It seems Celery loads all the modules it needs when the worker is launched and only looks at the imports at the top of the .py file. At run time it doesn't raise any error; the task just quits. After I moved all the "import" and "from ... import ..." statements to the top of the file, it worked.
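
As a rough illustration of the change, here is a sketch of a task module with its imports moved to the top. The explanation above of why this helps is a guess; what is certain is that module-level imports run when the module is loaded, so an import failure shows up when the worker starts instead of in the middle of a task. The function body and payload are hypothetical, and the Celery task decorator is left out so the sketch runs without Celery installed.

```python
# tasks.py (hypothetical module)
import json  # module-level: a failing import surfaces when the module loads


def trigger_build(payload):
    """Plain function standing in for a Celery task body.

    In a real project this would carry a task decorator such as
    @app.task; it is omitted here so the sketch runs on its own.
    """
    # Before the fix, the `import json` statement sat here, inside the body.
    data = json.loads(payload)
    return data["project"]


result = trigger_build('{"project": "myapp"}')
```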

-1

celery -A core worker --loglevel=INFO --without-gossip --without-mingle --without-heartbeat -Ofair --pool=solo