
I am working on a Django application that uses Celery for distributed asynchronous processing. I have now been tasked with integrating a process that was originally written with concurrent.futures. My question is: can this concurrent.futures job run inside a Celery task? Would it cause any problems, and if so, what would be the best way forward? The existing process is resource-intensive but very fast, because it sidesteps the GIL by using a concurrent.futures.ProcessPoolExecutor, and inside it each process runs a few (<5) concurrent.futures.ThreadPoolExecutor jobs.
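For reference, the nested pattern described above can be sketched roughly like this (function names, pool sizes, and the doubling workload are illustrative placeholders, not the original code):

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def io_bound(x):
    # stand-in for an I/O-bound unit of work done by the inner thread pool
    return x * 2

def process_chunk(chunk):
    # each process worker fans out to a small thread pool, as in the question
    with ThreadPoolExecutor(max_workers=4) as tpe:
        return sum(tpe.map(io_bound, chunk))

def run(data, n_procs=2):
    # split the input across process workers; "fork" is Unix-only and is
    # made explicit here so dynamically defined functions remain picklable
    chunks = [data[i::n_procs] for i in range(n_procs)]
    ctx = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=n_procs, mp_context=ctx) as ppe:
        return sum(ppe.map(process_chunk, chunks))
```

The question is whether `run` (or something like it) can simply become the body of a Celery task.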

So the real question is: should we extract the core functions of the process and rewrite them as separate Celery tasks, or keep the original code and run it as one big piece of code within a single Celery task?

As per the design of the system, a user can submit several such Celery tasks, each containing the concurrent.futures code.

Any help will be appreciated.

Tragaknight

1 Answer


Your library should work without modification. There's no harm in running threaded code inside a Celery task, unless, for example, you are mixing gevent with non-gevent-compatible code.

The main reason to break the code up would be resource management (reducing memory/CPU overhead). With threading, the thing to monitor is CPU load: once your concurrency generates enough load (e.g. many threads doing CPU-intensive work), the OS starts context-switching between threads, and your processing gets slower, not faster.
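One way to keep that load in check is to cap the process pool at the number of available cores, so several concurrently running tasks don't oversubscribe the machine. A minimal sketch (the `heavy` function is a placeholder for real CPU-bound work, not anything from the original process):

```python
import os
from concurrent.futures import ProcessPoolExecutor

def heavy(n):
    # placeholder CPU-bound work
    return sum(i * i for i in range(n))

def run_bounded(inputs):
    # cap worker processes at the core count to avoid oversubscription;
    # os.cpu_count() can return None, hence the fallback to 1
    workers = max(1, os.cpu_count() or 1)
    with ProcessPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(heavy, inputs))
```

If multiple users each submit such a task, you would additionally want to bound Celery's own worker concurrency, since each task brings its own pools.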

Nino Walker
    Thanks. However, during the integration I have found that although ThreadPoolExecutor workers execute normally within the Celery context, ProcessPoolExecutor workers inside a Celery task raise an error: AssertionError: daemonic processes are not allowed to have children. I understand why this happens based on this post (https://stackoverflow.com/questions/6974695/python-process-pool-non-daemonic), but I would like to know the pitfalls of making the processes non-daemonic, and if there aren't any, how to make the jobs non-daemonic in a Django/Celery context. – Tragaknight Jan 29 '19 at 06:08
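For anyone hitting the same AssertionError: the workaround described in the linked post is a Process subclass that always reports daemon=False, so the pool machinery cannot mark its workers daemonic. A sketch (the class name is from that post; the main pitfall is that non-daemonic children are not killed automatically when the parent dies, so clean shutdown becomes your responsibility):

```python
import multiprocessing

class NoDaemonProcess(multiprocessing.Process):
    # always report non-daemonic, and silently ignore attempts to
    # set the daemon flag (which Celery's prefork pool would do)
    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass
```

A simpler alternative may be to run the Celery worker with a non-prefork execution pool (e.g. `celery worker --pool=solo` or `--pool=threads`), since then the task runs in a non-daemonic context and ProcessPoolExecutor can spawn children normally.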