I have a dashboard built on Plotly Dash. The dashboard updates in real time and involves a lot of processing of independent files. For example, there are five different time series in the dashboard, and they could feasibly update separately and in parallel because they are completely independent of one another.

I am hosting the dashboard locally on a Windows machine. Based on the commentary in the Plotly Dash forum, it sounds like the best way to get parallel processing is by using a job queue like Waitress or Celery.

What is the best tool to use to take advantage of parallel processing?

These are three options:

Cauder

1 Answer

TL;DR: I would recommend using Celery; here is a small example.

The tools you have listed are used for slightly different purposes:

  • Waitress is a WSGI server, i.e. it can be used to serve the Dash application, or more specifically the underlying Flask server

  • Threading is a standard-library module for building threaded programs in Python

  • Celery is a distributed task queue

That being said, all of the above tools do provide some functionality related to concurrent processing in Dash:

  • You can enable concurrent execution of callbacks via the configuration of your WSGI server (which could be Waitress, though gunicorn is a more popular choice)

  • From within a callback, you can spin off the heavy part of a calculation to separate thread(s) using the Threading library

  • You can use Celery to do the (async) heavy lifting
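As a rough illustration of the second option, here is a minimal sketch that processes five independent series in separate threads. It uses `concurrent.futures.ThreadPoolExecutor` (a standard-library wrapper built on top of the threading module) rather than raw `threading.Thread` objects. The names `SERIES_NAMES`, `load_series`, and `load_all_series` are hypothetical, and the `time.sleep` call only simulates slow file processing.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names standing in for the five independent time series.
SERIES_NAMES = ["series_a", "series_b", "series_c", "series_d", "series_e"]


def load_series(name):
    """Hypothetical stand-in for reading and processing one file."""
    time.sleep(0.1)  # simulates slow, I/O-bound work
    return [len(name), len(name) + 1]


def load_all_series(names):
    """Process every series in its own thread and collect results by name."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map preserves input order, so zip pairs each name with its result.
        return dict(zip(names, pool.map(load_series, names)))


if __name__ == "__main__":
    start = time.perf_counter()
    results = load_all_series(SERIES_NAMES)
    print(sorted(results), f"{time.perf_counter() - start:.2f}s")
```

Inside a Dash callback, the body would call something like `load_all_series` and build the figures from the returned dictionary. Threads mainly help when the work is I/O-bound (reading files, waiting on a database); for CPU-bound pure-Python work the GIL limits the gain, and `ProcessPoolExecutor` from the same module may be a better fit.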

The first option will speed up your app if you have many independent callbacks running (as they will run in parallel). If you instead have a few slow callbacks (i.e. their execution time is more than a few seconds), a better approach would be to do the heavy lifting asynchronously. While both options 2 and 3 enable async processing, Celery comes with a lot of functionality out of the box. Hence, for your use case, Celery would be my first choice. For reference, here is a small example of how to run an async job in Dash using Celery.

emher
  • Does Celery work on Windows? – Cauder Nov 20 '22 at 23:28
  • Recent versions of Celery do not provide official support for Windows, but there are some workarounds available, see e.g. [this thread](https://stackoverflow.com/questions/37255548/how-to-run-celery-on-windows). However, I would recommend using Linux. Either directly, or via WSL2, if you are bound to Windows. – emher Nov 21 '22 at 06:03
  • Is WSL2 like booting up a virtual machine? Unfortunately, my use case requires me to use Windows. – Cauder Nov 21 '22 at 12:03
  • It's a bit off-topic, so I would recommend treating that in a different question. But yes, Windows Subsystem for Linux (WSL) is similar to a VM in the sense that it enables the use of Linux tools on Windows, but without the overhead associated with a VM. – emher Nov 21 '22 at 18:26
  • Is the idea of Celery that I have a job queue? For example, right now if a callback fires while another is in progress, the running callback is canceled. If I use Celery, will they both run, just sequentially? Is that right? – Cauder Nov 22 '22 at 12:54
  • Yes, the callback will not do much work itself; it will just add a job to the queue (managed via Celery). Since the add operation itself is fast, you will typically avoid the "running callback canceled" issue. – emher Nov 23 '22 at 20:47
  • This answer led me to ask a new question, can you take a look at this one? https://stackoverflow.com/questions/74533185/can-i-setup-a-simple-job-queue-with-celery-on-a-plotly-dashboard – Cauder Nov 25 '22 at 13:15
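
The job-queue idea discussed in the comments (the callback only adds a job, and a worker does the actual work) can be sketched with the standard library alone. This is a stand-in for illustration, not Celery: it uses `queue.Queue` and a single worker thread, whereas a Celery setup would use a message broker and separate worker processes. The names `heavy_job`, `worker`, and `callback` are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()  # pending jobs, handed to the worker in FIFO order
results = {}          # job_id -> result, filled in by the worker


def heavy_job(n):
    """Hypothetical stand-in for the slow part of a callback."""
    return sum(i * i for i in range(n))


def worker():
    """Run queued jobs one at a time until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:
            jobs.task_done()
            break
        job_id, n = item
        results[job_id] = heavy_job(n)
        jobs.task_done()


def callback(job_id, n):
    """What a callback would do: enqueue the job and return immediately."""
    jobs.put((job_id, n))
    return f"job {job_id} queued"


if __name__ == "__main__":
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    print(callback("a", 10))
    print(callback("b", 100))
    jobs.join()     # wait until both jobs have been processed
    jobs.put(None)  # tell the worker to stop
    thread.join()
    print(results)
```

With one worker, the two jobs run one after the other, and neither cancels the other because the callback itself returns almost immediately. In Celery, whether queued jobs run sequentially or in parallel depends on how many workers there are and on their concurrency setting.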