Questions tagged [dask-kubernetes]

Questions about using dask-kubernetes to create and run dask distributed clusters

Dask Kubernetes deploys Dask workers on Kubernetes clusters using native Kubernetes APIs. It is designed to dynamically launch short-lived deployments of workers during the lifetime of a Python process.

Full Documentation

54 questions
6 votes, 2 answers

Dask Memory leakage issue with json and requests

This is a minimal test that reproduces a memory leak in a remote Dask Kubernetes cluster. def load_geojson(pid): import requests import io r =…
5 votes, 1 answer

Dask Gateway, set worker resources

I am trying to set the resources for workers as per the docs here, but on a setup that uses Dask Gateway. Specifically, I'd like to be able to follow the answer to this question, but using Dask Gateway. I haven't been able to find a reference to…
bill_e
3 votes, 0 answers

How to View Dask Dashboard in Dask Gateway when using a private IP address/VPC?

We deployed Dask Gateway on Kubernetes on Google Cloud Platform. We are currently using an internal TCP load balancer to expose the traefik proxy for security purposes. Our users are able to create a client connection to the cluster generated…
3 votes, 0 answers

How to pick proper number of threads, workers, processes for Dask when running in an ephemeral environment as single machine and cluster

Our company is currently leveraging prefect.io for data workflows (ELT, report generation, ML, etc). We have just started adding the ability to do parallel task execution, which is powered by Dask. Our flows are executed using ephemeral AWS Fargate…
braunk
3 votes, 1 answer

How do I get adaptive dask workers to run some code on startup?

I'm creating a Dask scheduler using dask-kubernetes and putting it into adaptive mode. from dask_kubernetes import KubeCluster cluster = KubeCluster() cluster.adapt(minimum=0, maximum=40) I need each worker to run some setup code when they are…
Jacob Tomlinson
2 votes, 1 answer

Is it possible to use system memory instead of GPU memory for processing Dask tasks?

We have been running Dask clusters on Kubernetes for some time. Up to now, we have been using CPUs for processing and, of course, system memory for storing our DataFrame of around 1.5 TB (per Dask cluster, split across 960 workers). Now we want to…
honor
2 votes, 1 answer

Transitioning from Local Dask to a Cluster …

I have a simple embarrassingly parallel program that I am successfully running locally on Dask. Yay! Now I want to move it to a cluster and crank up the problem size. In this case, I am using GCP. I have tried it two ways, GCPCluster() and…
adonoho
2 votes, 1 answer

Why does starting daskdev/dask into a Pod fail?

Why does kubectl run dask --image daskdev/dask fail?
# starting the container with docker to make sure it basically works
➜ ~ docker run --rm -it --entrypoint bash daskdev/dask:latest
(base) root@5b34ce038eb3:/# python
Python 3.8.0 (default, Nov 6…
Raffael
2 votes, 1 answer

How do you integrate GPU support with Dask Gateway?

We are currently using Dask Gateway with CPU-only workers. However, down the road when deep learning becomes more widely adopted, we want to transition into adding GPU support for the clusters created through Dask Gateway. I've checked the Dask…
Riley Hun
2 votes, 1 answer

How am I supposed to connect to a Dask-gateway deployed in Kubernetes from an outside service?

I'm a bit confused about how exactly I'm supposed to connect to a deployed Dask cluster created via the Dask Helm chart from an external service. I deployed a Dask cluster as explained here. After a successful deployment, it shows me my pods and services as…
maverick
2 votes, 0 answers

ValueError: Unknown fields ['image']

I am trying to deploy Dask Gateway integrated with JupyterHub, which is why I decided to give the DaskHub chart a try. After following the instructions on…
2 votes, 0 answers

dask-gateway (helm deployment): how to configure a custom authenticator?

I'm currently reading through the Dask Gateway installation docs. I notice that one can specify a custom authenticator, but there is no documentation on what its interface is or how to deploy it. I am running an OpenID server provided by Keycloak. I…
shaunc
2 votes, 0 answers

Worker time-out with Dask Kubernetes on GKE

I'm trying to get dask-kubernetes to work with my GKE account. The maddening thing is that it worked. But now it doesn't. I set up a cluster fine. The nodes get created fine as well. They run for 60 seconds and then time out with the following…
2 votes, 1 answer

Using a new Python environment for Dask workers

I am running my Dask server on an HPC system that provides a module with everything needed to run Dask, and I load that module in a Jupyter notebook. I would like to run some processing tasks using Dask together with modules that are not available in the base…
PUJA
2 votes, 1 answer

How to configure jupyterlab for dask_kubernetes?

I'm trying to configure the JupyterLab Dask extension so that the "New" cluster button will create a KubeCluster instead of the default LocalCluster. I tried editing ~/.config/dask/labextension.yml so that it has this content: kubernetes: …
user3599803