I am currently running a Django project that offloads work to celery background workers. For CPU-bound tasks the fork pool is working great running like so:
celery -A myApp worker -l INFO -E -n worker --concurrency=<CPU-core-count>
However, with a growing set of long-running tasks mostly scraping APIs from background workers I am looking create a separate queue to make use of eventlet (or gevent pool) for better scaling these workloads.
I am running eventlet it like so which starts the worker and seems fine when I test it.
celery -A myApp worker -l INFO -E -n worker --concurrency=100 -p eventlet
I have tested two different version setups for eventlet. First off with these versions:
- Celery 5.1.0
- Eventlet 0.33
When running this setup I get the same error as reported here when making API polls + database queries in my workers:DatabaseWrapper objects created in a thread can only be used in that same thread
Other suggestions have proposed simply updating eventlet and Celery to the latest version and after I did the workers seems to run well with this setup:
- Celery 5.2.7
- Eventlet 0.33
Lastly, I've tried running gevent like so for comparison:
celery -A myApp worker -l INFO -E -n worker --concurrency=100 -p gevent
with both celery 5.2.7 and 5.1.0 as well as gevent 21.8.0. Seemingly because Django doesn't work well with asynchronous database ORM queries, every once in a while some tasks fail with SynchronousOnlyOperation.
There's no exact algorithm I can follow to reproduce this issue and it happens randomly seemingly when queuing more tasks than the celery concurrency setting is set to and obviously when making Django ORM queries. In general from what I see others propose is to wrap Django ORM queries in async_to_sync, supported in Django < 4.1 or to make use of Django's 4.1 recent update where they've made asynchronous support for most ORM queries. We're not currently ready to update to 4.1 and it would be add a lot of complexity to our tasks if we went with the async_to_sync wrappers.
In general I am fine with using eventlet which is so far have been showing to work in testing given that I use the latest updates for celery and eventlet. But at the same time I am a bit reluctant to use this in production until I understand:
- How eventlet plays with the Django ORM (since the ORM obviously isn't great in asynchronous contexts)
- If eventlet has similar issues with Django's ORM as gevent under heavy concurrency, thus risking to cause SynchronousOnlyOperation exceptions.(I have not found any reports on this happening for eventlet on Stackoverflow or otherwise, only for gevent pool)
- Why the previous issue in eventlet I mentioned above is gone in recent versions and how I can know it can't under certain cirtumstances that I haven't been able to find.
- And lastly if there's a way to verify if database connectors or other Django critical libraries are being monkey patched behind the scenes by using eventlet. A lot of the patching support seems to be magic, at least when using eventlet in Celery so is there a good way to see what's going on under the hood?
Summary of testing:
Gevent celery setup + making Django ORM queries and poll APIs:
celery -A myApp worker -l INFO -E -n worker --concurrency=100 -p gevent
Libraries
- Celery 5.1.0 / Celery 5.2.7
- gevent 21.8.0
Every once in a while throws SynchronousOnlyOperation due to queries run in asynch queries.
Eventlet celery setup + making ORM queries and poll APIs:
celery -A myApp worker -l INFO -E -n worker --concurrency=100 -p eventlet
Libraries:
- Celery 5.1.0
- Eventlet 0.33
Queries fails with: DatabaseWrapper objects created in a thread can only be used in that same thread
celery -A myApp worker -l INFO -E -n worker --concurrency=100 -p eventlet
Libraries:
- Celery 5.2.7
- Eventlet 0.33
Seemingly works, no issues caught so far in testing but no way to know that it works under all circumstances given previous tests with gevent and eventlet