
I'm a bit confused about how exactly I'm supposed to connect, from an external service, to a Dask cluster deployed via the Dask Helm chart. I deployed the Dask cluster as explained here

After a successful deployment, it shows me my pods and services as follows:

hub-574f85779c-sk7ct                                 1/1     Running   0          60m
jupyter-dask                                         1/1     Running   0          32m
proxy-68bcf94bd5-22kj5                               1/1     Running   0          60m
traefik-dhub-dask-gateway-6468d9cbff-9bclk           1/1     Running   0          60m
user-scheduler-c787cdb9f-924r2                       1/1     Running   0          60m
user-scheduler-c787cdb9f-l4jwv                       1/1     Running   0          60m

But notice that my services list shows traefik as a ClusterIP with no public IP associated with it.

proxy-public                            LoadBalancer   10.3.245.29    104.zzz.yyy.xxx   80:30447/TCP                 66m
traefik-dhub-dask-gateway               ClusterIP      10.3.241.129   <none>            80/TCP                       66m

I am able to connect to the Jupyter Notebook that is created as part of the Helm chart and, using code similar to the one below, use my cluster with no problems at all.

from dask_gateway import GatewayCluster

def inc(x):
    # trivial task used to exercise the cluster
    return x + 1

cluster = GatewayCluster()
client = cluster.get_client()
cluster.scale(2)
# then I can use this client to submit to the cluster
f = client.submit(inc, 7.2)

Now I want to connect from another application that runs outside my k8s cluster. Based on this documentation, I am supposed to pass an address and auth to the Gateway. I've been trying the following:

import os
os.environ["JUPYTERHUB_API_TOKEN"] = "f206000844a60da8b48e40f6c91c3f2axxxxxxxxx"
from dask_gateway import Gateway
gateway = Gateway(
    "http://104.zzz.yyy.xxx/services/dask-gateway",
    auth="jupyterhub"
)
gateway.list_clusters()

But it always returns a 401. What is the correct way of connecting to this cluster from outside?
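For reference, here is a minimal sketch of the same connection with the token passed explicitly instead of through the environment, which at least rules out the variable not being picked up. The helper names `gateway_address` and `connect` are hypothetical, made up for illustration. The `/services/dask-gateway` path is the one from the snippet above. Which kind of token the hub accepts depends on how the hub is configured, and this sketch does not settle that.

```python
import os


def gateway_address(hub_url, service_path="/services/dask-gateway"):
    """Join the hub's public URL and the gateway service path (hypothetical helper)."""
    return hub_url.rstrip("/") + "/" + service_path.strip("/")


def connect(hub_url, token=None):
    """Return a Gateway client that authenticates with a JupyterHub API token.

    Hypothetical helper: falls back to JUPYTERHUB_API_TOKEN when no token is given.
    """
    token = token or os.environ.get("JUPYTERHUB_API_TOKEN")
    if not token:
        raise ValueError("no JupyterHub API token supplied")

    # Imported here so the URL helper above works without dask-gateway installed.
    from dask_gateway import Gateway
    from dask_gateway.auth import JupyterHubAuth

    return Gateway(gateway_address(hub_url), auth=JupyterHubAuth(api_token=token))


# Example (needs network access and a valid token):
# gateway = connect("http://104.zzz.yyy.xxx", token="<your token>")
# gateway.list_clusters()
```

If the 401 persists with an explicitly passed token, that would suggest the problem lies with the token or the hub-side configuration rather than with how the client reads it.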

Also, these services are just plain open to the public internet. What are the best practices when it comes to securing them? (Please don't tell me to use Kerberos :( )

ClientResponseError: 401, message='Unauthorized', url=URL('http://104.zzz.yyy.xxx/services/dask-gateway/api/v1/clusters/')
maverick
  • I've also tried modifying the `traefik` service to be exposed as a `LoadBalancer` and then using that as `proxy_address="http://34.70.yyy.xxx"` but got the same `401` result – maverick Dec 01 '20 at 21:45
  • Did you try putting the gateway address like this: `"http://34.70.yyy.xxx", auth="jupyterhub"` instead of with the whole path at the end, `"http://104.zzz.yyy.xxx/services/dask-gateway", auth="jupyterhub"`? – Saikat Chakrabortty Dec 02 '20 at 05:13
  • @SaikatChakrabortty yes, if I don't include that then it returns a 404: `ValueError: {"status": 404, "message": "Not Found"}`. It's my understanding that Jupyter exposes this as a k8s service under `/services/dask-gateway` – maverick Dec 02 '20 at 17:58
  • Did you try reinstalling with this configuration? https://github.com/dask/dask-gateway/blob/master/resources/helm/dask-gateway/values.yaml#L47 Maybe add the API token there and install again? Ref: https://gateway.dask.org/install-kube.html#authenticating-with-jupyterhub – Saikat Chakrabortty Dec 03 '20 at 01:57

1 Answer


I posted this same problem as an issue on the dask-gateway GitHub repository and got the following response, which solved it:

https://github.com/dask/dask-gateway/issues/356#issuecomment-737504145

In summary, use this installation process.

maverick