
I want to run several instances of matlab without running a parfor loop. The structure of my code is the following:

if k == 1
% Set some parameters here
elseif k == 2
% Set some other parameters here 
...
elseif k == 10
%Set some other parameters here
end

Is there an efficient way of opening 10 instances of matlab where each instance will run for a given value of k?

I know that in a cluster with slurm I could use slurm arrays, i.e. I could add this to the beginning of the matlab code:

k = str2num(getenv('SLURM_ARRAY_TASK_ID'));

And then just do a batch submit. Is there anything similar that I could do on a normal computer?

phdstudent
  • 1
    What OS are you using on your normal PC? If it is Linux then here you go; https://www.mathworks.com/help/matlab/ref/matlablinux.html#d123e901472 and here is a link for Windows; https://www.mathworks.com/help//rtw/ug/building-models-from-the-dos-window-command-line.html#:~:text=About%20MATLAB%20Command%2DLine%20(Start%20Up)%20Arguments,-When%20you%20start&text=For%20a%20description%20of%20these,Command%20Prompt%2C%20type%3A%20matlab%20. – Prakhar Sharma Aug 07 '22 at 22:34
  • PC. Still not sure how to build the batch file that will run 10 instances of matlab, each one with a different value for k. – phdstudent Aug 08 '22 at 11:24
  • It is always possible to write a single Matlab script that can do anything. what exactly motivates you to use multiple instances of Matlab? You can't implement `str2num(getenv('SLURM_ARRAY_TASK_ID'));` on a PC. – Prakhar Sharma Aug 08 '22 at 11:36
  • 1
    Why is that not a `parfor k=1:10` loop? – Cris Luengo Aug 14 '22 at 21:49
  • 1
    Running multiple instances of MatLab will not be (more) efficient, but if you have the resources to run scripts in parallel (and probably, non-interactively) it may save you time. If you let each value of `k` write a script and then start MatLab with each script as the input, that should be fine. As @CrisLuengo says though, your question doesn't show why you *shouldn't* use a parfor instead. – alle_meije Aug 16 '22 at 11:24
  • So are you saying that you want to use the Parallel Computing Toolbox in a different way than using `parfor` (eg with tasks), or are you saying you don’t have access to this toolbox at all? – Cris Luengo Aug 16 '22 at 13:26
  • I'll refrain from answering before clarification on the availability of the parallel toolbox and the reason to avoid `parfor`. This sounds like the perfect job for [`spmd`](https://mathworks.com/help/parallel-computing/spmd.html), [Single Program Multiple Data](https://mathworks.com/help/parallel-computing/distribute-arrays-and-run-spmd.html), given you seem to want to run the same program with 10 different parameter sets. – Adriaan Aug 16 '22 at 14:11

3 Answers


In Linux, you could let a bash script write out MATLAB scripts which can then be executed in parallel. You can just use the ampersand (&) for that after each MATLAB call, but the GNU parallel software is better: you can then specify how many jobs will run in parallel.

This bash script

#!/bin/bash

# command line argument: how many scripts (jobs) in parallel?
if [[ ${1} == "" ]]; then
   echo "${0} needs a parameter: N == how many scripts are made (1,2,...,N)"
   exit 1
fi
N=${1}; 
echo "creating and running ${N} scripts ..."

# some constants
c_dir=$(pwd)
ml_ex=$(which matlab)

# create the scripts
for (( i=1; i <= ${N}; i++ )); do
cat << EOF > ${c_dir}/script${i}.m
a = ones (${i}) * $i
EOF
done

# list them, then pass this list to parallel
for f in ${c_dir}/script*.m; do
    echo "${ml_ex} < $f" 
done | parallel -j ${N};

# tidy up
rm -f ${c_dir}/script*.m

makes N MATLAB scripts (N is the command-line parameter) and executes them in MATLAB in parallel. Each script should show an M×M matrix filled with the number M (for M = 1, 2, ..., N). So the command runsN.sh 5 runs 5 copies of MATLAB at the same time.

Instead of ${ml_ex} in the script, ${ml_ex} -nodesktop -nosplash shows more clearly what happens. I have an alias to always use those options.

This may be worth trying if you have a number of time-consuming, not very resource-demanding, completely independent jobs. I have used it for image processing.
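The simpler ampersand approach mentioned at the top of this answer could look roughly like the sketch below. It is only an illustration: `MATLAB_CMD`, `launch_all`, `my_script` and the `out_*.log` file names are placeholders that are not part of the original script, and the `-batch` option assumes MATLAB R2019a or later (older releases would need something like `-nodesktop -nosplash -r "...; exit"` instead).

```shell
#!/bin/bash
# Sketch: start one background MATLAB job per value of k, then wait for all.
# MATLAB_CMD and my_script are hypothetical placeholders; adapt them to your setup.
MATLAB_CMD=${MATLAB_CMD:-matlab}

launch_all() {
    local n=$1
    local k
    for (( k = 1; k <= n; k++ )); do
        # -batch runs the statement non-interactively and then exits (R2019a+).
        # The trailing & puts the job in the background so the loop continues.
        "${MATLAB_CMD}" -batch "k = ${k}; my_script" > "out_${k}.log" 2>&1 &
    done
    wait   # block until every background job has finished
}
```

Calling `launch_all 10` would then start ten instances at once, each with its own value of `k`. Unlike GNU parallel, this does not limit how many run at the same time.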

alle_meije

If you use GNU parallel, you can get a setup similar to using Slurm on a cluster:

parallel -j 4 'export SLURM_ARRAY_TASK_ID={} ; matlab [...] my_script.m' ::: {1..10}
          ^    ^                                ^                             ^
          |    |                                |                             Bashism to express 1, 2, ..., 10
          |    |                                invoke Matlab with its args and the script
          |    create a SLURM_ARRAY_TASK_ID variable to fool the script
          run maximum 4 "jobs" at a time

By setting the SLURM_ARRAY_TASK_ID variable explicitly in the command launched by parallel, you can use the same Matlab script both on the cluster and on your local workstation.

GNU Parallel offers many options to manage, limit, throttle, or even dispatch "jobs".
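If GNU parallel is not installed, the same idea can be approximated in plain bash. The following is a rough sketch under several assumptions: `run_array`, `MATLAB_CMD`, `my_script` and the `task_*.log` names are made up for illustration, `wait -n` needs bash 4.3 or later, and `-batch` needs MATLAB R2019a or later.

```shell
#!/bin/bash
# Sketch: emulate a Slurm array locally without GNU parallel.
# MATLAB_CMD and my_script are hypothetical placeholders.
MATLAB_CMD=${MATLAB_CMD:-matlab}

run_array() {
    local first=$1 last=$2 max_jobs=$3
    local k
    for (( k = first; k <= last; k++ )); do
        # Throttle: while max_jobs jobs are running, wait for one to finish.
        while (( $(jobs -rp | wc -l) >= max_jobs )); do
            wait -n
        done
        # The variable is set only in the environment of this single job,
        # so getenv('SLURM_ARRAY_TASK_ID') in the script sees the value k.
        SLURM_ARRAY_TASK_ID=${k} "${MATLAB_CMD}" -batch "my_script" \
            > "task_${k}.log" 2>&1 &
    done
    wait   # wait for the remaining jobs
}
```

For example, `run_array 1 10 4` would correspond to the `parallel -j 4 ... ::: {1..10}` call above: ten tasks, at most four at a time.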

damienfrancois
  1. You can pass arguments to MATLAB functions from the command line, e.g.

function y = myfunc(k)
switch k
    case 1
....

from DOS:

 matlab /r "myfunc(2)"

  2. You can write for loops on the Windows command line

    for /l %x in (1, 1, 100) do echo %x

  3. You can run commands "in the background"

    START /B your_command

But I don't understand what you want with this. The overhead of running 10 MATLAB instances will absolutely undermine any "parallelization" that you think you are achieving. You are running 10 times the engine. Just observe how much RAM and processing power MATLAB takes when you open it, without running anything.

Note that MATLAB's parfor is great at parallelization and that many MATLAB functions, if allowed (i.e. if the cores are free), will use all available cores to compute their results, maximizing your resources. I'd be surprised if this is anywhere near fast when running on a single PC.

So the above answers how you do it, but the real answer is that you should not do this. It's the worst and slowest way to run multiple instances of a code in MATLAB. A for loop may be faster.

Ander Biguri