I'm trying to run Chapel in multi-locale mode with Slurm. However, 'make check' fails. Could someone help me with that?

I used Chapel 1.23.0. Here are the actual commands I used:

cd chapel-1.23.0/
export CHPL_HOME=$PWD
source $CHPL_HOME/util/setchplenv.bash
export CHPL_COMM=gasnet
export CHPL_LAUNCHER=slurm-srun
export CHPL_TARGET_CPU=native
make && make check

Here are the error messages I got:

== Actual Test Output (raw, with verbose) ==
srun --job-name=CHPL-hello6-tas --quiet --nodes=4 --ntasks=4 --ntasks-per-node=1 --cpus-per-task=16 --exclusive --mem=0 --kill-on-bad-exit  /home/user1/.chpl/chapel-test-P4CwK/hello6-taskpar-dist_real -nl4 --printLocaleName=false -v
GASNet: Invalid number of nodes: -nl4
GASNet: Usage '/home/user1/.chpl/chapel-test-P4CwK/hello6-taskpar-dist_real <num_nodes> {program arguments}'
Dan Bonachea
dr.eru
  • To find out if the problem is specific to `make check`, could you check if the following command works? `chpl examples/hello6-taskpar-dist.chpl && ./hello6-taskpar-dist -nl 4` – mppf Jan 19 '21 at 12:57
  • I was able to fix the issue but, just FYI, I got the same error with `hello6-taskpar-dist`. Thank you for your response! – dr.eru Jan 22 '21 at 05:33

1 Answer

Assuming you're using the udp substrate with gasnet (`$CHPL_HOME/util/printchplenv` shows `CHPL_COMM_SUBSTRATE: udp`), slurm-srun doesn't work in that particular configuration. The udp substrate requires CHPL_LAUNCHER=amudprun. Following https://chapel-lang.org/docs/platforms/udp.html#using-the-udp-conduit-with-slurm, you should be able to do:

export CHPL_LAUNCHER=amudprun
export GASNET_SPAWNFN=C
export GASNET_CSPAWN_CMD="srun -N%N %C"

Note that you'll have to rerun the top-level make command after changing these settings.

This tells Chapel to use the amudprun launcher, and then tells amudprun how to spawn processes on this system (in this case using srun instead of the default, ssh).
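To make the `GASNET_CSPAWN_CMD` template a bit more concrete, here is a rough sketch of how the placeholders get filled in, assuming `%N` stands for the number of worker nodes and `%C` for the command each worker should run. The helper `expand_spawn_cmd` and the `./worker_cmd` value are hypothetical and only illustrate the substitution; this is not how amudprun is actually implemented.

```shell
# Illustration only: mimics filling in a spawn-command template.
# expand_spawn_cmd is a hypothetical helper, not part of Chapel or GASNet.
expand_spawn_cmd() {
  template=$1   # e.g. "srun -N%N %C"
  nodes=$2      # value substituted for %N
  worker=$3     # value substituted for %C (placeholder command here)

  # Escape characters that are special in a sed replacement string.
  worker_esc=$(printf '%s' "$worker" | sed 's/[&|\\]/\\&/g')

  # Replace every %N and %C in the template.
  printf '%s\n' "$template" | sed -e "s|%N|$nodes|g" -e "s|%C|$worker_esc|g"
}

# With 4 locales, the template from above expands to an srun call on 4 nodes.
expand_spawn_cmd "srun -N%N %C" 4 "./worker_cmd"
# prints: srun -N4 ./worker_cmd
```

So with `-nl 4`, the spawner would end up invoking something of the form `srun -N4 <worker command>` rather than reaching the nodes over ssh.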

Elliot
  • Is this something that the `chplenv` scripts or Makefiles could check for and issue an error against to prevent future users from potentially stepping into this hole? – Brad Jan 20 '21 at 01:21
  • Yeah, that's a possibility. I opened https://github.com/chapel-lang/chapel/issues/16971 to look into that. – Elliot Jan 20 '21 at 01:24
  • Thank you so much for your help. It worked! – dr.eru Jan 22 '21 at 05:25
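
Until a check like the one discussed in the comments exists, a small guard in your own setup script could catch this combination early. This is only a sketch: `check_launcher` is a hypothetical helper (not part of the `chplenv` scripts), and it takes the substrate and launcher as plain arguments, so it makes no assumption about how you obtain those values.

```shell
# Hypothetical helper: flags the udp + slurm-srun combination described above.
# Arguments are passed explicitly, so no Chapel installation is needed to run it.
check_launcher() {
  substrate=$1   # e.g. the CHPL_COMM_SUBSTRATE value reported by printchplenv
  launcher=$2    # e.g. the CHPL_LAUNCHER value

  if [ "$substrate" = "udp" ] && [ "$launcher" = "slurm-srun" ]; then
    echo "error: the udp substrate needs CHPL_LAUNCHER=amudprun, not slurm-srun"
    return 1
  fi
  echo "ok"
}

check_launcher udp slurm-srun || true   # reports the error, keeps the script going
check_launcher udp amudprun             # prints: ok
```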