I have a MATLAB program that I want to run in parallel so that it runs faster. However, when I do that parallel workers seem not to be able to access global variables created beforehand. Here is what my code looks like:

createData  % a .m file that creates a global variable (Var)
parfor k = i:j
   processData()  % a function that is dependent on some global variables
end

However, I get the error message "Undefined function or variable 'Var'". I've already added the declaration global Var inside the function processData(), but this doesn't work either. Is there any way of making global variables visible within the parallel loop?

This is not the same question as the one here: I declare the global variables outside of the parfor loop and only want to read them within the loop, without modifying or updating their values across the workers of the parallel loop.

Adriaan
Abdallah Atef
  • I think this question is different from the link you provided @Adriaan. I've declared the global variable outside the parfor loop, and what I want to do is access the variable within the loop, not modify it or update its value across workers. – Abdallah Atef Nov 20 '18 at 22:21
  • Sorry, rereading the 2nd answer, read the first sentence: "`GLOBAL` data is hard to use inside `PARFOR` because each worker is a separate MATLAB process, and **global variables are not synchronised from the client (or any other process) to the workers**." emphasis mine. Ergo: the author of the toolbox tells you that setting a `global` outside a `parfor` (or in the body of the loop) will not work. All that will work is setting up a function *inside* the `parfor` setting the `global`. Not ideal, but all you've got. `global` is generally bad anyways, might as well avoid it completely. – Adriaan Nov 20 '18 at 22:31
  • Thanks @Adriaan. I'll try to reformulate with more detail so that I can get more focused answer specific to my situation. – Abdallah Atef Nov 20 '18 at 22:45

2 Answers

The simplest advice is: don't use global for the myriad reasons already described/linked here. Ideally, you would restructure your code like so:

Var = createData(); % returns 'Var' rather than creating a global 'Var'
parfor idx = ...
    % simply use 'Var' inside the parfor loop.
    out(idx) = processData(Var, ...);
end

Note that parfor is smart enough to send Var to each worker exactly once for the above loop. However, if you have several parfor loops, it sends Var to the workers again for each of them. In that case, I would suggest using parallel.pool.Constant. How you use it depends on the cost of creating Var compared to its size. If Var is small but expensive to create, you're best off creating it only once at the client and sending it to the workers, like this:

cVar = parallel.pool.Constant(Var);

If it is large, but relatively quick to construct, you could consider getting the workers each to construct their own copy independently, like this:

cVar = parallel.pool.Constant(@createData); % invokes 'createData' on each worker
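
To illustrate how the constant is then read inside the loops, here is a minimal sketch. It assumes a function file named demoPoolConstant.m (a hypothetical name), and the bodies of createData and processData are made-up stand-ins for the asker's real functions. The data is reached through the Value property of the constant.

```matlab
function [out1, out2] = demoPoolConstant()
    % Build the data once on the client (stand-in for the real createData).
    Var  = createData();
    % Wrap it so that it is transferred to each worker only once.
    cVar = parallel.pool.Constant(Var);

    n    = 8;
    out1 = zeros(1, n);
    out2 = zeros(1, n);

    % First loop: workers read the data through cVar.Value.
    parfor idx = 1:n
        out1(idx) = processData(cVar.Value, idx);
    end

    % Second loop: the same constant is reused, so Var is not sent again.
    parfor idx = 1:n
        out2(idx) = processData(cVar.Value, 2 * idx);
    end
end

function Var = createData()
    % Hypothetical stand-in: returns the data instead of setting a global.
    Var = magic(4);
end

function y = processData(Var, k)
    % Hypothetical stand-in: any computation that only reads Var.
    y = sum(Var(:)) * k;
end
```

In this sketch the client still holds the ordinary variable Var, so serial code on the client can keep passing Var to processData directly while the parfor loops use cVar.Value.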
Edric
  • One more question @Edric: is it possible for the `parallel.pool.Constant` to be accessible by the client? The situation I have is that the function `processData()` is called several times in the model. Some of those calls are within a normal `for` loop (the unparallelizable part of the model). The function is also called within a `parfor` loop. So I want that constant to be available to both the workers and the client. Thanks in advance – Abdallah Atef Nov 21 '18 at 22:03
  • Unfortunately, right now, the value held by the `parallel.pool.Constant` doesn't exist at the client. – Edric Nov 22 '18 at 06:23
Citing the author of the parallel toolbox:

GLOBAL data is hard to use inside PARFOR because each worker is a separate MATLAB process, and **global variables are not synchronised from the client (or any other process) to the workers**.

Emphasis mine. So the only way to get a global variable on a worker (which is a bad idea for reasons mentioned in the linked post) is to write a function which sets up the global variables, run that on each worker, then run your own, global-dependent function.
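
For completeness, the workaround just described could look roughly like the sketch below. It is only an illustration under assumptions: the file name demoWorkerGlobals.m and the function setupGlobals are hypothetical, processData is a made-up stand-in, and parfevalOnAll is one possible way of running the setup function on every worker. The global is set on the workers only (not on the client) and would have to be set up again if the pool is restarted.

```matlab
function out = demoWorkerGlobals()
    pool = gcp();                                % start or reuse a parallel pool

    % Run the setup function once on every worker; 0 = no outputs requested.
    f = parfevalOnAll(pool, @setupGlobals, 0);
    wait(f);                                     % block until all workers are done

    n   = 6;
    out = zeros(1, n);
    parfor idx = 1:n
        % processData reads the worker-local global set up above.
        out(idx) = processData(idx);
    end
end

function setupGlobals()
    % Hypothetical setup: creates the global in the calling worker's workspace.
    global Var
    Var = magic(4);
end

function y = processData(idx)
    % Hypothetical stand-in for the global-dependent function.
    global Var
    y = sum(Var(:)) * idx;
end
```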

Citing another comment of mine to illustrate why this is a bad idea:

One of the pitfalls in terms of good practice is that you can suddenly overwrite a variable which is used inside a function in other functions. Therefore it can be difficult to keep track of changes, and going back and forth between functions might cause unexpected behaviour because of that. This happens especially often if you call your global variables things like h, a, etc. (this of course makes for bad reading even when the variable is not global).

And finally, here is an article outlining most of the reasons why using global variables is generally a bad idea.

Bottom line: what you want is not possible, and it is generally considered bad practice.

Adriaan