I am running a Python method that parses a lot of data. Since it is time-intensive, I would like to run it asynchronously on a separate thread so the user can still access the website/UI.

Do threads started with `Thread` (via `from threading import Thread`) terminate if a user exits the site, or do they continue to run on the server?

What would be the advantages of using Celery versus simply using the threading module for something like this?

ninajay
  • Well, it turns out that this question is **not** generating answers based on just opinions. Instead, it generates answers based on **facts and references**. It is good to have some people watching what's going on in the community, but you should update your actions as necessary! – Vassilis Jul 23 '17 at 10:45

1 Answer

CPython runs Python bytecode in only one thread at a time because of the (in)famous global interpreter lock (GIL). Threads therefore only provide parallelism when computation and I/O can overlap: while one thread waits on the disk or network, the GIL is released and another thread can run. Compute-bound tasks will see little benefit from the Python threading model, at least under CPython 2 or 3.
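To make the I/O case concrete, here is a minimal sketch using only the standard library. The names `fake_download` and `run_io_jobs_threaded` are made up for illustration, and `time.sleep` is an assumed stand-in for a real blocking network or disk call.

```python
import threading
import time


def fake_download(seconds, results, index):
    # time.sleep stands in for a blocking I/O call (an assumption for this
    # sketch); like real blocking I/O, it releases the GIL while waiting.
    time.sleep(seconds)
    results[index] = f"job {index} done"


def run_io_jobs_threaded(n_jobs=4, seconds=0.2):
    """Run n_jobs simulated I/O waits on separate threads.

    Returns (elapsed_seconds, results).
    """
    results = [None] * n_jobs
    threads = [
        threading.Thread(target=fake_download, args=(seconds, results, i))
        for i in range(n_jobs)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every simulated download has finished
    return time.perf_counter() - start, results


if __name__ == "__main__":
    elapsed, results = run_io_jobs_threaded()
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

The waits overlap, so the elapsed time should be close to one wait rather than the sum of all four. If `fake_download` were a pure-Python number-crunching loop instead, the threads would take turns holding the GIL and you should not expect a comparable speedup under CPython.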

On the other hand, those limitations don't apply to multiprocessing, which is effectively what you'll be doing with a queuing system like Celery. You can run several worker processes, each with its own Python interpreter, which can execute simultaneously on a multicore machine, or on multiple machines for that matter.

If I understand your scenario, in which an interaction on a website kicks off a long-running job, queuing is almost certainly the way to go. You get true parallelism and the easy option to move processing to other machines.
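The shape of the queuing pattern can be sketched with the standard library alone. This is not Celery's API: `enqueue`, `worker`, `parse`, and the `status`/`results` dictionaries are hypothetical names, and the in-process `queue.Queue` with a worker thread is only a stand-in for a real broker and separate worker processes, so it shows the control flow but not the multi-core parallelism.

```python
import queue
import threading
import uuid

jobs = queue.Queue()  # pending work; a real setup would use a message broker
status = {}           # job_id -> "queued" or "done"
results = {}          # job_id -> parsed result


def parse(data):
    # Hypothetical placeholder for the time-intensive parsing method.
    return len(data.split())


def enqueue(data):
    """What a request handler would call: record the job and return at once."""
    job_id = uuid.uuid4().hex
    status[job_id] = "queued"
    jobs.put((job_id, data))
    return job_id


def worker():
    """Pull jobs off the queue until a (None, None) sentinel arrives."""
    while True:
        job_id, data = jobs.get()
        if job_id is None:
            jobs.task_done()
            break
        results[job_id] = parse(data)
        status[job_id] = "done"
        jobs.task_done()


if __name__ == "__main__":
    threading.Thread(target=worker, daemon=True).start()
    job_id = enqueue("some large blob of text")
    print("request returns immediately with", job_id)
    jobs.join()  # a web app would poll status[job_id] instead of blocking
    print(status[job_id], results[job_id])
```

The request handler only pays for `enqueue`, and the UI can poll the job's status later. With a queuing system such as Celery, the worker side would run as separate processes, possibly on other machines, which is where the parallelism comes from.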

cbare
  • Could you please let me know what you mean by `compute bound threads`? Also, did you mean that it does not add value to use threads for code that does computation or requires I/O? – Venkat Kotra Mar 16 '16 at 13:40
  • @Venkat, by "compute bound" I mean a process whose limiting factor is doing operations in the CPU, as opposed to I/O bound where the limiting factor is reading and writing to disk or network while CPU has cycles to spare. See http://stackoverflow.com/questions/868568/what-do-the-terms-cpu-bound-and-i-o-bound-mean. – cbare Mar 16 '16 at 16:58
  • If the goal is simply to not block the thread that has to respond to the client, is it reasonable to use the threading module? – user3245268 Apr 04 '19 at 18:00