Questions tagged [dask-jobqueue]

19 questions
5 votes • 0 answers

Dask distributed KeyError

I am trying to learn Dask using a small example. Basically I read in a file and calculate row means. from dask_jobqueue import SLURMCluster cluster = SLURMCluster(cores=4, memory='24 GB') cluster.scale(4) from dask.distributed import Client client…
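A minimal end-to-end sketch of the setup this question describes (read a file, compute row means on a SLURMCluster). The CSV path and all-numeric columns are assumptions for illustration, and it only runs on a machine with SLURM available, so treat it as a deployment config fragment:

```python
# Sketch only: requires dask, dask-jobqueue, and a SLURM scheduler.
# The file name 'data.csv' and its all-numeric columns are assumptions.
from dask_jobqueue import SLURMCluster
from dask.distributed import Client
import dask.dataframe as dd

cluster = SLURMCluster(cores=4, memory='24 GB')  # one SLURM job = 4 cores, 24 GB
cluster.scale(4)                                 # request 4 such jobs
client = Client(cluster)                         # attach this session to them

df = dd.read_csv('data.csv')                     # lazy, partitioned read
row_means = df.mean(axis=1).compute()            # per-row mean, gathered locally
```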
4 votes • 1 answer

Is there a way of using dask jobqueue over ssh

Dask jobqueue seems to be a very nice solution for distributing jobs to PBS/Slurm-managed clusters. However, if I'm understanding its use correctly, you must create an instance of "PBSCluster/SLURMCluster" on the head/login node. Then you can on the same…
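One common pattern for this situation is to keep the cluster object on the login node and reach its scheduler from elsewhere through an SSH tunnel. The hostname, user, and the default scheduler port 8786 below are all placeholder assumptions:

```shell
# Assumption: a SLURMCluster (and its scheduler) is already running in a
# long-lived session on the login node, listening on the default port 8786.
# 'user' and 'login-node' are placeholders.
ssh -N -L 8786:localhost:8786 user@login-node &

# A local Python session can then connect through the tunnel:
#   from dask.distributed import Client
#   client = Client("tcp://localhost:8786")
```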
2 votes • 1 answer

Difference between dask node and compute node for slurm configuration

First off, apologies if I use confusing or incorrect terminology; I am still learning. I am trying to set up configuration for a Slurm-enabled adaptive cluster. The supercomputer and its Slurm configuration are documented here. Here…
pgierz • 674 • 3 • 7 • 14
1 vote • 1 answer

Does Dask LocalCluster Shutdown when kernel restarts

If I restart my Jupyter kernel, will any existing LocalCluster shut down, or will the dask worker processes keep running? I know that when I used a SLURM Cluster the processes keep running if I restart my kernel without calling cluster.close(), and I have to…
HashBr0wn • 387 • 1 • 11
1 vote • 0 answers

Logging in Dask

I am using a SLURM cluster and want to be able to add custom logs inside my task that should appear in the logs on the dashboard when inspecting a particular worker. Alternatively I would like to be able to extract the name of the worker…
HashBr0wn • 387 • 1 • 11
1 vote • 1 answer

Reconfigure Dask jobqueue on the fly

I have a jobqueue configuration for Slurm which looks something like: cluster = SLURMCluster(cores=20, processes=2, memory='62GB', walltime='12:00:00', …
Albatross • 955 • 1 • 7 • 13
1 vote • 1 answer

Dask Jobqueue - Why does using processes result in cancelled jobs?

Main issue: I'm using Dask Jobqueue on a Slurm supercomputer. My workload includes a mix of threaded (e.g. numpy) and Python workloads, so I think a balance of threads and processes would be best for my deployment (which is the default behaviour).…
Albatross • 955 • 1 • 7 • 13
1 vote • 1 answer

Dask jobqueue job killed due to permission

I'm trying to use Dask job-queue on our HPC system. And this is the code I'm using: from dask_jobqueue import SLURMCluster cluster = SLURMCluster(cores=2, memory='20GB', processes=1, log_directory='logs', …
Phoenix Mu • 648 • 7 • 12
0 votes • 0 answers

Worker log file gets mangled when log_directory is set

I used to have the worker logs written as: ./slurm-.out ... I wanted SLURMCluster to write the worker logs to a separate directory (as opposed to the current working dir), so I provided "log_directory" as an input argument as such. from…
michaelgbj • 290 • 1 • 10
0 votes • 0 answers

How DASK_jobqueue SLURMCluster can access local python module in parent directory

I'm using dask_jobqueue to establish a SLURMCluster. I'm trying to pass python files in the parent directory to the workers. I tried different ways including sys.path.append, setting PYTHONPATH in my .bashrc file, and setting PYTHONPATH in env_extra…
shambakey1 • 37 • 7
0 votes • 1 answer

Job, Worker, and Task in dask_jobqueue

I am using a SLURM cluster with Dask and don't quite understand the configuration part. The documentation talks of jobs and workers and even has a section on the difference: In dask-distributed, a Worker is a Python object and node in a dask…
HashBr0wn • 387 • 1 • 11
0 votes • 1 answer

How to change dask job_name to SGECluster

I am using dask_jobqueue.SGECluster() and when I submit jobs to the grid they are all listed as dask-worker. I want to have different names for each submitted job. Here is one example: futures = [] for i in range(1,10): res =…
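A hedged sketch: the cluster's `name` parameter is what replaces the default "dask-worker" job name in the queue, so distinct names generally mean distinct cluster objects. The resource values and name below are placeholders:

```python
# Sketch: requires dask-jobqueue and an SGE scheduler. 'name' feeds the
# batch job's name (otherwise every submitted job shows up as 'dask-worker').
from dask_jobqueue import SGECluster

cluster = SGECluster(cores=1, memory='4GB', name='my-analysis')
```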
0 votes • 2 answers

Dask workers get stuck in SLURM queue and won't start until the master hits the walltime

Lately, I've been trying to do some machine learning work with Dask on an HPC cluster which uses the SLURM scheduler. Importantly, on this cluster SLURM is configured to have a hard wall-time limit of 24h per job. Initially, I ran my code with a…
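One workaround often suggested for hard walltime limits, sketched here with placeholder numbers: keep each worker job shorter than the site limit and have workers retire themselves cleanly so adaptive scaling can replace them with fresh jobs. Exact option spellings vary across dask-jobqueue/distributed versions, so treat this as an assumption-laden config fragment:

```python
# Sketch: workers ask to shut down ('--lifetime') before SLURM would kill
# them at the 24 h limit; the stagger avoids all workers retiring at once.
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    cores=8, memory='24GB', walltime='23:30:00',
    worker_extra_args=['--lifetime', '23h', '--lifetime-stagger', '4m'],
)
cluster.adapt(minimum_jobs=1, maximum_jobs=10)  # replace retired workers
```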
0 votes • 1 answer

How to speed up launching workers when the number of workers is large?

Currently, I use dask_jobqueue to parallelize my code, and I have difficulty setting up a cluster quickly when the number of workers is large. When I scale up the number of workers (say more than 2000), it takes more than 15 mins for the cluster to…
Yuki • 1
0 votes • 1 answer

Dask: Would storage network speed cause a worker to die

I am running a process that writes large files across the storage network. I can run the process using a simple loop and I get no failures. I can run it using distributed and jobqueue during off-peak hours and no workers fail. However when I run the…
schierkolk • 29 • 4