Essentially, this is a question about how to use a multi-core processor more efficiently.

I have an optimization script (written in MATLAB) that launches 20 instances of MATLAB to evaluate functions. Each instance saves its results as a .mat file, and the optimization script then loads these results and continues its work. To launch the 20 instances, I use MATLAB's built-in function "system" to call a batch file, which in turn opens the 20 MATLAB instances.

The code I'm using in the batch file is:

( start matlab -nosplash -nodesktop -minimize -r "Worker01;exit"
ping -n 5 127.0.0.1 >nul

start matlab -nosplash -nodesktop -minimize -r "Worker02;exit"
ping -n 5 127.0.0.1 >nul

...... % repeat the pattern

start matlab -nosplash -nodesktop -minimize -r "Worker19;exit"
ping -n 5 127.0.0.1 >nul

start matlab -nosplash -nodesktop -minimize -r "Worker20;exit"
ping -n 5 127.0.0.1 >nul ) | set /P "="

All the "start" commands are enclosed in parentheses, followed by the command

| set /P "="

because I want my optimization script to move on only after all 20 evaluations are done. I learned this technique from another question of mine, but I don't really understand what it does. If you could also explain this, I would appreciate it.

Anyway, this is a way to achieve parallel computing under MATLAB 2007, which has no built-in parallel computing feature. However, I found that running 20 instances at the same time is not efficient: once about 12 instances are open, CPU usage (a Xeon server CPU, 14 cores available) reaches 100%. My theory is that opening more instances than the CPU can handle makes the processor less efficient. So I think the best strategy would be:

  1. Start the first 12 instances.
  2. Start the next one on the list as soon as any running instance finishes. (Even though the workers are opened at roughly the same time and do the same job, they still tend to finish at different times.)

This would keep the computing power fully utilized (CPU usage at 100% all the time) without overloading the CPU.

How can I implement this strategy in a batch file? If it is hard to do in a batch file, could PowerShell do it?

Please show the actual code and explain it. I'm not a programmer, so I don't know much about coding.

Thanks.

Siyu Jiang
  • CPU usage % is not always indicative of efficiency or hardware utilization; depending on your task, memory and I/O may matter as well. Probably the best way to find out is to run a small (5-minute) task with different numbers of concurrent processes and see how fast they complete (insert `echo %time%` into the batch files and compare the deltas). As for the `()|set /p` trick, see this answer: https://stackoverflow.com/a/53455578/8522013 – Jack White Dec 09 '18 at 15:53
  • I would do it like this: Create 20 blank files `1.txt` through `20.txt` in folder `jobs`. Spawn 12 jobs with `start`. When a job finishes it should delete a file from folder `jobs` and create a file in folder `slots`. Check both folders once a second or so. If `jobs` has no files - your computations are finished - exit. For each file `slots` has, delete it and start a new job if any are left to do. See `for /?` for loops, `call /?` for subroutines, `if /?` to check file existence and `set /?` for incrementing a variable. If you have troubles with any of this post a more concrete question. – Jack White Dec 09 '18 at 16:28
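
The marker-file scheme from the comment above could be sketched in PowerShell roughly as follows. This is only an illustration under stated assumptions: the function name `Invoke-MarkerFileScheduler` and its parameters are hypothetical, and each worker is assumed to have been modified so that it deletes its own file in `jobs` and creates a file in `slots` when it finishes (that worker-side change is not shown).

```powershell
# Hypothetical sketch of the jobs/slots marker-file scheme.
# Assumption: every worker deletes jobs\<n>.txt and creates slots\<n>.txt
# when it is done; the workers must be changed to do that themselves.
function Invoke-MarkerFileScheduler {
    param(
        [string]$Root,                  # folder that will hold jobs and slots
        [int]$JobCount = 20,            # total number of workers
        [int]$Throttle = 12,            # how many may run at once
        [scriptblock]$StartJob,         # receives the job number; must not block
        [int]$PollMilliseconds = 1000   # how often to check the folders
    )
    $jobs  = Join-Path $Root 'jobs'
    $slots = Join-Path $Root 'slots'
    New-Item -ItemType Directory -Force -Path $jobs, $slots | Out-Null

    # one blank marker file per job that has not finished yet
    1..$JobCount | ForEach-Object {
        New-Item -ItemType File -Force -Path (Join-Path $jobs "$_.txt") | Out-Null
    }

    # start the first batch, up to the throttle
    $next = 1
    while ($next -le $JobCount -and $next -le $Throttle) {
        & $StartJob $next | Out-Null
        $next++
    }

    # poll until every job marker is gone
    while (@(Get-ChildItem -Path $jobs -File).Count -gt 0) {
        foreach ($slot in @(Get-ChildItem -Path $slots -File)) {
            # a worker finished: consume its slot file and reuse the slot
            Remove-Item -LiteralPath $slot.FullName
            if ($next -le $JobCount) {
                & $StartJob $next | Out-Null
                $next++
            }
        }
        Start-Sleep -Milliseconds $PollMilliseconds
    }
}

# Possible usage (not run here; the working folder is an assumed path):
# Invoke-MarkerFileScheduler -Root 'C:\work' -StartJob {
#     param($n)
#     $worker = "Worker{0:D2}" -f $n
#     Start-Process matlab -ArgumentList "-nosplash -nodesktop -minimize -r `"$worker;exit`""
# }
```

Because completion is signalled through files rather than process handles, this variant does not depend on whether the `matlab` command stays alive for the whole run; the cost is that every worker script has to write its own marker.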

1 Answer

Here is how I would approach this in PowerShell:

<#
keep a queue of all jobs to be executed

keep a list of running jobs

number of running jobs cannot exceed the throttle value
#>

$throttle = 12

$queue = New-Object System.Collections.Queue
$running = New-Object System.Collections.Generic.List[System.Diagnostics.Process]

# generate x number of queue commands
# runs from 1 to x
1..20 | ForEach-Object {
    # the variable $_ contains the current number

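    $program = "matlab"
    # zero-pad the number so the names match Worker01 ... Worker20 from the question
    $worker = "Worker{0:D2}" -f $_
    # do not call this variable $args: that name is an automatic variable in PowerShell
    $arguments = "-nosplash -nodesktop -minimize -r `"$worker;exit`""

    # arguments will be
    # -nosplash -nodesktop -minimize -r "Worker01;exit"
    # -nosplash -nodesktop -minimize -r "Worker02;exit"
    # etc

    # save it
    $queue.Enqueue(@($program, $arguments))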
}

# begin executing jobs
while($queue.Count) {
    # remove jobs that are done
    $running.Where({ $_.HasExited }) |
        ForEach-Object { [void]$running.Remove($_) }

    if($running.Count -ge $throttle) {
        # busy, so wait
        Start-Sleep -Milliseconds 50
    }
    else {
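        # ready for new job
        # note: HasExited tracks only the process started here; if the matlab
        # command is just a launcher that exits at once, the throttle will not
        # hold (MATLAB's -wait startup option may help, if your version has it)
        $cmd = $queue.Dequeue()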
        [void]$running.Add([System.Diagnostics.Process]::Start($cmd[0], $cmd[1]))
    }
}

# wait for rest to be done
while($running.Where({ !$_.HasExited }).Count) {
    Start-Sleep -Milliseconds 50
}
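
The value 12 for `$throttle` is taken from the question. As one of the comments suggests, it may be worth timing a short task at several concurrency levels before settling on it. A minimal sketch of such a measurement follows; the helper name `Measure-ConcurrentRun` and the `Benchmark` script in the usage comment are hypothetical, and the timing is only meaningful if the launched command stays alive until the work is done.

```powershell
# Hypothetical helper: launch $Count copies of a command at once and
# return the wall-clock seconds until the last one has exited.
function Measure-ConcurrentRun {
    param(
        [string]$FilePath,    # program to start, e.g. "matlab"
        [string]$Arguments,   # its command-line arguments
        [int]$Count           # how many copies to run at the same time
    )
    $timer = [System.Diagnostics.Stopwatch]::StartNew()
    $procs = foreach ($i in 1..$Count) {
        [System.Diagnostics.Process]::Start($FilePath, $Arguments)
    }
    # block until every copy has finished
    foreach ($p in $procs) { $p.WaitForExit() }
    $timer.Stop()
    return $timer.Elapsed.TotalSeconds
}

# Possible usage (not run here). "Benchmark" is an assumed short MATLAB
# script, not something from the question:
# foreach ($n in 8, 12, 14) {
#     $secs = Measure-ConcurrentRun "matlab" '-nosplash -nodesktop -minimize -r "Benchmark;exit"' $n
#     "{0} processes: {1:N1} s" -f $n, $secs
# }
```

Dividing each elapsed time by the number of processes gives a rough cost per job, and the concurrency level with the lowest cost is a reasonable candidate for `$throttle`.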
Palansen