
I'm trying to run code on 12 parallel workers with parfor in MATLAB R2014b. This should be possible, since the machine I'm using has 12 cores.

However, 10-15 minutes after the code starts running, the symbol at the bottom left of the MATLAB window, next to the word "Busy" (usually a blue or green rectangle when parfor is working properly), turns grey with a yellow triangle. If I hover the mouse over it, I get the message "Parallel pool shut down due to error". The MATLAB command window does not report any error, and the code keeps running, I guess on just one worker.

Any idea of the possible causes of the message?

Shayan Modiri
TEX
  • Never seen that error myself. Can you post the code so I can take a look at it? – Adriaan Aug 21 '15 at 13:42
  • Unfortunately I can't, it's too long. – TEX Aug 21 '15 at 13:46
  • Under Editor-> breakpoints you can select 'dbstop on error'. Not sure if this will work if you do not get an explicit error, but worth a try I guess. – Adriaan Aug 21 '15 at 13:47
  • Have you tried using fewer workers? I have no idea whether this will work. It was just a thought. – Nicky Mattsson Aug 21 '15 at 15:01
  • With 12 workers you are likely to run out of memory. Check your temp directory; sometimes workers write dump files there when they crash. – Daniel Aug 21 '15 at 17:05
  • I had the same issue today on a dual Xeon (28 physical cores) with 256GB memory running a GA optimization, super helpful message right? – daaxix Feb 05 '17 at 20:35

1 Answer


There are several possible reasons for this error when using the Parallel Computing Toolbox in MATLAB.

I would try these one by one to track down the cause:

  1. The problem might be caused by a memory limit. Since you have 12 cores, MATLAB tries to start 12 workers, which requires copying some of the variables 12 times in memory. Start with 2 or 3 workers and see whether the problem goes away. Call this before the parfor to open a pool of poolsize workers:

    parpool('local', poolsize);
    

    For more details, see this link from MathWorks.

  2. Try running your code with a regular for loop to see whether it still produces an error. In my experience, the error usually appears in the last loop iterations, so run the loop in reverse order using fliplr.

    % Replace:
    parfor iLoop = 1 : 100
        % "What you do in the loop"
    end
    
    % With:
    for iLoop = fliplr(1 : 100)
        % "What you do in the loop"
    end
    

    Please note that this replacement is for debugging only; you can switch back to parfor once you have found the problem in your code.

  3. Make sure to pre-define all the variables used in your parfor. It reduces the memory overhead.

  4. Avoid using any explicitly defined global variables in the parfor. Global variables are not allowed in the parfor body or in any functions called while the parfor is running.
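
The checks above can be combined into one small test harness. This is only a sketch under assumptions: `work`, `nIter`, `poolSize`, `results`, and `errMsgs` are hypothetical names, and the anonymous function stands in for your real loop body. It opens a deliberately small pool, preallocates the outputs, catches errors per iteration, and then inspects the pool object to see whether it survived the loop.

```matlab
% Hypothetical stand-in for the real loop body; replace with your own work.
work = @(k) sum(rand(100, 1)) + k;

nIter    = 100;   % assumed iteration count
poolSize = 3;     % assumed starting size, well below the 12 available cores

% Close any pool that is already open so the new size takes effect.
delete(gcp('nocreate'));
pool = parpool('local', poolSize);

results = zeros(1, nIter);   % preallocate the sliced output
errMsgs = cell(1, nIter);    % one slot per iteration for caught messages

parfor iLoop = 1 : nIter
    try
        % feval keeps parfor from treating the handle as a sliced variable.
        results(iLoop) = feval(work, iLoop);
    catch err
        % Record the message so a failing iteration can be identified later.
        errMsgs{iLoop} = err.message;
        results(iLoop) = NaN;
    end
end

% A pool that lost workers (for example to an out-of-memory crash) may
% report Connected as false or fewer workers than were requested.
fprintf('Connected: %d, workers: %d of %d\n', ...
    double(pool.Connected), pool.NumWorkers, poolSize);

% Indices of iterations whose body threw an error.
failed = find(~cellfun(@isempty, errMsgs));
```

If the loop completes cleanly at a small `poolSize`, raising it step by step should show roughly where the workers start to die. Note that try/catch only captures errors thrown inside the loop body; a worker killed by the operating system will not produce a caught error, which is why the pool is inspected separately.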

Shayan Modiri