I'm trying to use dask-jobqueue on our HPC system. This is the code I'm using:
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=2, memory='20GB', processes=1,
                       log_directory='logs',
                       death_timeout=6000, walltime='8:00:00',
                       shebang='#!/usr/bin/ bash')
cluster.scale(5)
from dask.distributed import Client
client = Client(cluster)
After executing the code, I can use squeue to check for submitted jobs,
and I can see 5 of them in the running (R) state. But the jobs are
killed after several seconds. In the .err files, I found this message:
slurmstepd-midway2-0354: error: execve(): /tmp/slurmd/job10469239/slurm_script: Permission denied
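In case it helps narrow things down, here is a minimal sketch of a check I could run on a node. It assumes the job script that dask-jobqueue generates starts with the shebang string passed to SLURMCluster, and that execve() reports "Permission denied" when the interpreter named on that line is not an executable file. The helper check_shebang is my own hypothetical name, not part of dask-jobqueue or Slurm.

```python
import os


def check_shebang(shebang):
    """Return (interpreter_path, is_executable_file) for a shebang line.

    Mirrors how a shebang is usually split: the text after '#!' up to the
    first whitespace is the interpreter path, and the rest is its argument.
    """
    if not shebang.startswith("#!"):
        raise ValueError("not a shebang line: %r" % shebang)
    parts = shebang[2:].strip().split(None, 1)
    if not parts:
        raise ValueError("shebang names no interpreter: %r" % shebang)
    interpreter = parts[0]
    # The interpreter must be a regular file with execute permission;
    # a directory such as '/usr/bin/' does not qualify.
    is_executable_file = os.path.isfile(interpreter) and os.access(interpreter, os.X_OK)
    return interpreter, is_executable_file


if __name__ == "__main__":
    # The shebang string from my SLURMCluster call above.
    print(check_shebang("#!/usr/bin/ bash"))
```

The generated script itself can be inspected with print(cluster.job_script()) to confirm what its first line actually is.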
I'm very new to Dask and am not sure what's wrong. Any ideas would be appreciated! Thanks!