
I am trying to create a function that stores a very large variable for use each time the function is called. I have a function myfun(x,y) where y is very large, so calls are slow because MATLAB is pass-by-value. However, I only need to supply y once during execution of the program, creating a closure that is then handed off to another function to call repeatedly:

Y = create_big_matrix();
newfun = @(x) myfun(x,Y);
some_other_fun(newfun); % This calls newfun several times

I assume that each time newfun is called, it copies the stored value of Y to myfun. This seems very inefficient. Is there a better way to implement newfun so that Y is only copied once, when newfun is created (and maybe again when it's passed to some_other_fun)?

David Pfau
  • aside from [COW](http://stackoverflow.com/a/7233424), you could wrap the data inside a handle-class object (which has reference semantics). – Amro Nov 20 '13 at 05:36
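A minimal sketch of the handle-class idea from the comment above (the class name and property are illustrative, not from the original):

```matlab
% BigData.m - wraps the large matrix in a handle object.
% Handle classes have reference semantics, so passing a BigData
% instance around never duplicates the underlying matrix.
classdef BigData < handle
    properties
        Y  % the large matrix
    end
    methods
        function obj = BigData(Y)
            obj.Y = Y;
        end
    end
end
```

With this, `data = BigData(create_big_matrix())` is created once, and a closure such as `newfun = @(x) myfun(x, data)` captures only the small handle; `myfun` then reads `data.Y` through the reference.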

2 Answers


MATLAB has copy-on-write mechanisms that prevent a copy of Y when myfun(x,Y) is called, unless it modifies the input. I do not think you need to worry about performance issues in this case, but I would have to see the code for myfun and run tests to verify.

Quoting this post on a MathWorks blog:

Some users think that because MATLAB behaves as if data are passed by value (as opposed to by reference), that MATLAB always makes copies of the inputs when calling a function. This is not necessarily true.

The article goes on to describe that MATLAB has limited abilities to recognize when a variable is modified by a function and avoids copies when possible. See also this SO answer summarizing the "under the hood" operations during copy-on-write as described in an old newsreader post.

To investigate whether a copy is taking place, use format debug to check the data pointer address pr for Y in the calling function, and again inside myfun. If it is the same, no copy is taking place. Set a breakpoint to step through and examine this pointer.
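A sketch of that check (here `myfun` is a hypothetical stand-in that only reads its input, so no copy should be triggered; `format debug` is an undocumented feature):

```matlab
% Hypothetical myfun, saved as myfun.m, that only reads Y:
%   function out = myfun(x, Y)
%       format debug, Y       % prints Y's header, including its pr address
%       out = x + Y(1);
%   end

format debug            % display variable headers, including the pr pointer
Y = rand(5000);         % stand-in for create_big_matrix()
Y                       % note the pr address here in the caller
newfun = @(x) myfun(x, Y);
newfun(1);              % if the pr printed inside myfun matches, no copy was made
```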

chappjc
  • The nature of this behavior has also changed over the years, so it is version-dependent. – horchler Nov 19 '13 at 23:55
  • @horchler True. COW has been around in some form [since at least 2000](http://www.mathworks.com/matlabcentral/newsreader/view_thread/21145#51439), but I'm assuming it has only gotten better over the years. Are you referring to any specific idiosyncrasies of different MATLAB versions? I think the best approach is to keep your code natural, keeping in mind situations in which MATLAB optimizations would be likely to fail. – chappjc Nov 19 '13 at 23:59
  • I'm talking under the hood, mex, and that the JIT has also changed how some stuff is handled (your first link points this out). The JIT enhancements make it easier to write efficient code that doesn't make copies that are never used, for example. – horchler Nov 20 '13 at 00:09
  • @horchler Just to clarify, the COW mechanism does not appear to be directly related to the JIT compiler feature, which is as you point out unambiguously related to loop acceleration and inline data processing (as in the second part of the linked page). COW seems to be a basic feature of the MATLAB memory manager, at least with R2013b (test with [this procedure](http://www.mathworks.com/matlabcentral/answers/152) using `feature jit off` and `format debug`). Correct me if I'm wrong. Perhaps I just haven't seen a reference linking JIT and COW. :) – chappjc Nov 20 '13 at 00:24
  • MathWorks often refers to this as JIT/accelerator, but I believe those are really two separate things (you can turn each on or off using `feature jit off` and `feature accel off`). The way I see it, the JIT part has to do with detecting "hot" spots in the code at runtime and dynamically compiling them into an intermediate byte code (usually faster), that way it is immediately executed the next time it is encountered without the usual cost of interpreted languages (think statements inside a loop). – Amro Nov 20 '13 at 05:33
  • ... On the other hand, the accelerator is more of a general optimizer that detects certain cases (like copy-on-write, in-place operations, etc..) and generates optimized code accordingly. But then again, I could be totally wrong here :) See this: http://www.mathworks.com/matlabcentral/newsreader/view_thread/277832 – Amro Nov 20 '13 at 05:33
  • @Amro Thanks for the further info and the link - I haven't seen that particular newsreader post. Nevertheless, I still see lazy copy behavior even with `feature accel off`. :) Perhaps COW has been folded into the standard memory manager? Anyway, I wish there were more thorough documentation of how the acceleration features and JIT relate to observable behavior. – chappjc Nov 20 '13 at 06:00

Since Y is not a parameter of the function handle, it is only passed to the function handle once, namely when the handle is created - including whatever copy-on-write optimizations there might be. So the function_handle gets its own version of Y right at the time of its definition. During execution of the function_handle, Y is not passed to it at all. You can see this with an easy example:

>> r = rand();
% at this point the function_handle gets a lazy copy of r, stored in its own workspace
>> f = @() r; 
>> r
r =
    0.6423
% what happens to the original r, doesn't matter to the function-handle:
>> clear r
>> f()
ans =
    0.6423

An actual copy should only be made if the original Y gets modified / deleted (whether it is really copied on delete might be another question).

You can check whether or not Y really gets copied into the function_handle's workspace using the `functions` function, with the above example (only with a different value for r):

>> format debug
>> r
r =

Structure address = c601538 
m = 1
n = 1
pr = 4c834a60 
pi = 0
    0.4464
>> fcn = functions(f);
>> fcn.workspace{1}.r
ans =

Structure address = c6010c0 
m = 1
n = 1
pr = 4c834a60 
pi = 0
    0.4464

You can see that, as discussed in chappjc's answer, the parameter is only lazily copied at this time - so no full copy is created.
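Along the same lines, one can sketch what happens when the original variable is modified after the handle is created (again relying on the undocumented `format debug`; addresses will differ on your machine):

```matlab
format debug
r = rand();
f = @() r;
fcn = functions(f);
fcn.workspace{1}.r    % pr matches r's pr: the captured copy is still lazy
r = r + 1;            % modifying the original breaks the sharing
fcn = functions(f);
fcn.workspace{1}.r    % value unchanged; the pr addresses now differ
```

This is consistent with the answer above: the handle captures the value at definition time, and a real copy materializes only once the shared data is written to.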

sebastian