I have an existing Dash app that uses dash-extensions. I have also installed dask (distributed) to make use of its futures. To set this up, I need to create a client, which I do in my results.py like so:

from dask.distributed import Client as DaskClient
dask_client = DaskClient()

My Dash app is structured like so:

dash/
   application.py
   index.py
   results.py
   pathfinder.py
   requirements.txt

For the futures to work (using the client), I needed to move my import of results inside my if __name__ == "__main__" block like so:

index.py:


if __name__ == "__main__":
    from application import app, server  # noqa: F401
    from dash import html
    from results import results

    inner_layout = html.Div(
        [
            results
        ],
        style={"padding": 20},
    )
    app.layout = html.Div(children=[inner_layout])
    app.run_server(debug=False)

Within my application.py the setup is like so:

from dash_extensions.enrich import DashProxy, MultiplexerTransform, ServersideOutputTransform
import dash_bootstrap_components as dbc

# external CSS stylesheets
external_stylesheets = [
    dbc.themes.LUX,  # dash-bootstrap theme
    "https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.2/css/all.min.css",  # font-awesome icons
]
meta_tags = [{"name": "viewport", "content": "width=device-width, initial-scale=1.0"}]
app = DashProxy(
    __name__,
    external_stylesheets=external_stylesheets,
    meta_tags=meta_tags,
    url_base_pathname="/analyzer/",
    title="Proj",
    suppress_callback_exceptions=True,
    transforms=[MultiplexerTransform(), ServersideOutputTransform()]
)
server = app.server

results.py:

import dash_bootstrap_components as dbc
from dash import dcc
from dash import html
from dash.dependencies import Input, Output, State
from application import app
from dash.exceptions import PreventUpdate
from pathfinder import find_all_paths
from dash_extensions.enrich import ServersideOutput
from dask.distributed import Client as DaskClient

dask_client = DaskClient()

futures = {}

results = html.Div(
    [
        dbc.Row(
            [
                html.Button(id="analysis-button", children='Take a nap'),
                html.Div(id="results"),
                dcc.Store(id="result-tuple"),
                dcc.Interval(id="poller", max_intervals=0)
            ],
            class_name="text-center",
        ),
    ]
)


@app.callback(
    Output("analysis-button", "disabled"),
    Output("analysis-button", "children"),
    ServersideOutput("result-tuple", "data"),
    Output("poller", "max_intervals"),
    Input("analysis-button", "n_clicks"),
    prevent_initial_call=True,
)
def update_output_data(n_clicks: int):  # n_clicks is a click count, not a bool
    find_all_paths_res = dask_client.submit(find_all_paths, 0, 3)
    futures[find_all_paths_res.key] = find_all_paths_res
    return True, "Napping...", find_all_paths_res.key, -1


@app.callback(
    [
        Output("analysis-button", "disabled"),
        Output("analysis-button", "children"),
        Output("results", "children"),
        Output("poller", "max_intervals"),
    ],
    [Input("poller", "n_intervals")],
    [State("result-tuple", "data")],
    prevent_initial_call=True,
)
def poll_result(n_intervals, data):
    if not data:
        raise PreventUpdate()
    future = futures[data]
    if 'finished' not in future.status:
        raise PreventUpdate()
    content_data = future.result()
    return (
        False,
        "Take a nap",
        content_data[0][1],
        0
    )

pathfinder.py:

import networkx as nx
import time


def find_all_paths(inp_source, inp_target):
    G = nx.complete_graph(4)  #  In real program this is a much larger graph with nodes and edges (~36000 rows)
    paths = []
    for path in nx.all_simple_paths(G, source=inp_source, target=inp_target):
        paths.append(path)
    time.sleep(2)
    return paths

requirements.txt:

Brotli==1.0.9
click==8.0.4
cloudpickle==2.0.0
dash==2.2.0
dash-bootstrap-components==1.0.3
dash-core-components==2.0.0
dash-extensions==0.0.71
dash-html-components==2.0.0
dash-table==5.0.0
dask==2022.2.1
distributed==2022.2.1
EditorConfig==0.12.3
Flask==2.0.3
Flask-Caching==1.10.1
Flask-Compress==1.11
fsspec==2022.2.0
gunicorn==20.1.0
HeapDict==1.0.1
itsdangerous==2.1.1
Jinja2==3.0.3
jsbeautifier==1.14.0
locket==0.2.1
MarkupSafe==2.1.0
more-itertools==8.12.0
msgpack==1.0.3
networkx==2.7.1
packaging==21.3
partd==1.2.0
plotly==5.6.0
psutil==5.9.0
pyparsing==3.0.7
PyYAML==6.0
six==1.16.0
sortedcontainers==2.4.0
tblib==1.7.0
tenacity==8.0.1
toolz==0.11.2
tornado==6.1
Werkzeug==2.0.3
zict==2.1.0

When I run from the cmd line:

$ python index.py

The code runs and it works fine. However if I try to run it with Gunicorn like so:

$ gunicorn index:server

I get the error message:

Traceback (most recent call last):
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/flask/app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/flask/app.py", line 1511, in full_dispatch_request
    self.try_trigger_before_first_request_functions()
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/flask/app.py", line 1563, in try_trigger_before_first_request_functions
    self.ensure_sync(func)()
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/dash_extensions/enrich.py", line 113, in _setup_server
    super()._setup_server()
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/dash/dash.py", line 1361, in _setup_server
    _validate.validate_layout(self.layout, self._layout_value())
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/dash_extensions/enrich.py", line 103, in _layout_value
    layout = transform.layout(layout, self._layout_is_function)
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/dash_extensions/enrich.py", line 215, in layout
    self.transform_layout(layout)
  File "/home/myuser/analyzer/env/lib/python3.8/site-packages/dash_extensions/enrich.py", line 668, in transform_layout
    target.children = _as_list(target.children) + proxies
AttributeError: 'NoneType' object has no attribute 'children'

This issue did not occur before I added the dask.distributed client, which makes me think that the code inside the if __name__ == "__main__" block only runs when index.py is used as the entry point to the program. But I am still unsure how to resolve the issue. Can someone give some insight into the logic behind it?

UPDATE: As suggested by @Michael Delgado, I have provided a working example of what my program is meant to achieve. One Dash callback submits the work and stores the dask.distributed future, and another callback polls that future for its result.

After some reading around, my understanding is that python index.py works because the if __name__ == '__main__' block is the entry point, whereas gunicorn only imports server (which is app.server) and never runs the code inside the if __name__ == '__main__' block. So my setup code is not actually running. But I am still unsure how to resolve this issue.

mp252
  • Does it work with ‘normal’ Dash, i.e. without dash-extensions? – emher Mar 10 '22 at 18:46
  • I need dash-extensions as I am using the multiplexer in my callbacks for example using the same output in two different callbacks. Also I am using the serversideoutput from dash-extensions as well. – mp252 Mar 10 '22 at 23:15
  • unfortunately, this is likely [too complex a question for stack overflow](https://meta.stackoverflow.com/questions/258521/how-to-ask-complex-questions). you're asking about the intersection of multiple different complex multiprocessing packages. at the very least you'd need to provide a complete [mre]. and when you say you're trying to "make use of futures" - what exactly do you have in mind? are you leveraging a distributed cluster as a backend server engine? or are you just using the dask scheduler as a second task manager within your server? – Michael Delgado Mar 11 '22 at 00:44
  • @MichaelDelgado I have updated with an example. I am just using the dask scheduler as a second task manager within my server. The reason for this is that I was originally using Celery, but Celery did not work with NetworkX graph types, and the graph was also too large for the broker to deal with. – mp252 Mar 11 '22 at 15:51
  • Yeah - I'm not as familiar with the web server side as I am with dask.distributed, but my hunch is this is not a good usage pattern. As I understand it, the whole point of using gunicorn is efficient multi-process task management. Trying to set up an end-run around this with dask tasks from one of the web server's processes seems like a nightmare to me. I'd try to separate those into different services as much as possible, being really careful to make sure that the scheduler and workers are going to persist. Others might have more info but I don't think this is within the scope of SO. – Michael Delgado Mar 11 '22 at 17:58
  • Frequent pattern for me rn: Wow this feature is awesome, but i wonder why it's not working, oh yeah probably dash_extensions :( – nmu Nov 11 '22 at 12:09

1 Answer


I had a similar issue when combining Dash, dash-extensions, Flask and gunicorn. I assume dask isn't part of the problem.

This should work for you:

from application import app, server  # noqa: F401
from dash import html
from results import results

inner_layout = html.Div(
    [
        results
    ],
    style={"padding": 20},
)
app.layout = html.Div(children=[inner_layout])

if __name__ == "__main__":
    app.run_server(debug=False)

When gunicorn loads your index.py, it imports it as a module, so the value of __name__ is not "__main__" but the module name, "index" in your case. The if condition is therefore not met, the line app.layout = html.Div(children=[inner_layout]) is never reached, app.layout remains None, and the error occurs.
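The difference between importing a file and running it can be reproduced with the standard library alone. The sketch below is only an illustration: demo_index.py and its layout variable are hypothetical stand-ins for index.py and app.layout, and the plain import is assumed to approximate what gunicorn does when given "module:attribute".

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for index.py: it records the __name__ it was loaded
# under and only assigns a layout inside the main guard.
MODULE_SOURCE = '''
seen_name = __name__
layout = None
if __name__ == "__main__":
    layout = "layout assigned"
'''

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp, "demo_index.py")
    module_path.write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        # Plain import, assumed to be roughly what "gunicorn demo_index:server" does.
        imported = importlib.import_module("demo_index")
        # Running the file as a script, like "python demo_index.py".
        executed = runpy.run_path(str(module_path), run_name="__main__")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_index", None)

print(imported.seen_name, imported.layout)        # demo_index None
print(executed["seen_name"], executed["layout"])  # __main__ layout assigned
```

When the module is imported, the guarded assignment is skipped and layout stays None, which mirrors app.layout being None under gunicorn.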

The fix is to move everything but the last line in front of the if.

The last line remains in the if, so that it is executed whenever the script is run directly.

When you run your script with gunicorn, gunicorn imports the module and serves the WSGI object server itself, so app.run_server() is not needed there. The if keeps Flask's development server from being started as well when the module is merely imported.
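To see why no run_server() call is needed under a WSGI server, here is a minimal sketch using only the standard library. The server function below is a hypothetical stand-in for app.server, and call_like_a_wsgi_server is an illustrative helper, not part of gunicorn or Flask. It calls the WSGI callable once per request, which is what a WSGI server does.

```python
from wsgiref.util import setup_testing_defaults


def server(environ, start_response):
    """Hypothetical minimal WSGI application standing in for app.server."""
    body = b"layout served"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_like_a_wsgi_server(application, path="/"):
    """Invoke a WSGI callable once, the way a WSGI server does for each request."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports back to the server.
        captured["status"] = status
        captured["headers"] = headers

    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    body = b"".join(application(environ, start_response))
    return captured["status"], body


status, body = call_like_a_wsgi_server(server)
print(status, body)  # 200 OK b'layout served'
```

The application object is never asked to start a server of its own. It only has to exist, fully configured, at import time, which is why the layout assignment must sit outside the if.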

Related: https://stackoverflow.com/a/26579510/4445260

ktul