Simple question, but I am asking just to make sure I am not overlooking an obvious solution which can be much more efficient.

Suppose one has a large data buffer, say a very large list, that needs to be updated, and wants to pass it to a function that does the updating, as in

a = Table[0,{10}]
a = update[a]

Since I can't use pass by reference (in a CDF, one can't change the Attributes of a function to anything, such as HoldFirst), I am forced to make a copy of the list inside the function in order to update it, and then return the copy.

My question, other than using 'global variables' which is not good, is there a more efficient way to do this?

P.S. About a year ago, I asked about copy by reference; here is a link to my Mathgroup question. (Thanks to Leonid's answer there, by the way; it was useful.)

But my question here is a little different, since now I can NOT use HoldFirst. Are there any other alternatives that I am not seeing to avoid this extra copying of data all the time? It seems to slow down the program when the size becomes too large.

(can't use SetAttributes and its friends, not allowed in CDF).

I'll show the basic example first, then show how I would do it if I could use the HoldFirst.

Example

update[a_List] := Module[{copyOfa = a}, copyOfa[[1]] = 5; copyOfa]
a = Table[0, {10}];
a = update[a]

----> {5, 0, 0, 0, 0, 0, 0, 0, 0, 0}

If I could use the HoldFirst, I would write

update[a_] := Module[{}, a[[1]] = 5; a]
Attributes[update] = {HoldFirst};

a = Table[0, {10}];
a = update[a]

----> {5, 0, 0, 0, 0, 0, 0, 0, 0, 0}

Much more efficient, since no copying is done. Pass by reference.

I could use a global variable, as in

a = Table[0, {10}];
updateMya[] := Module[{}, a[[1]] = 5]
updateMya[];
a
----> {5, 0, 0, 0, 0, 0, 0, 0, 0, 0}

But this is of course bad programming, even if it is very fast.

Since I have large data buffers, and I'd like to modularize my Mathematica code, I need to create functions to which I pass large data to process, but at the same time I want to keep it 'efficient'.

Any other options one can see to do this?

sorry if this was asked before here, hard to search SO.

thanks,

addition 1

Unevaluated is easy to use, but I am no longer able to use the type checking I had to make sure that a list is being passed. For example

update[a_List] := Module[{}, a[[1]] = 5; a]
a = Table[0, {10}];
a = update[Unevaluated[a]]

The call now does not 'bind' to the definition, since 'a' now does not have head List.

So, I lose some of the robustness I had in the code. But using Unevaluated does work in CDF, and changing the code to use it was easy. I just had to remove the extra 'type checking' that I had there to make it work.

Verbeia
Nasser
  • Regarding the addition: if you store a list (say) in a variable, *and* want to keep it unevaluated (or, pass that variable to Hold*-attribute carrying function), type-checking becomes harder. I described the problem in some detail here: http://www.mathprogramming-intro.org/book/node408.html. You can also check out this Mathgroup thread: http://groups.google.com/group/comp.soft-sys.math.mathematica/browse_thread/thread/c062579f93e093a6, for a discussion of a similar situation. For the case at hand, a workaround would be `update[a_]/;Head[a]===List:=...`. – Leonid Shifrin Sep 12 '11 at 12:42
  • I just have to mention that the linked discussion in the book is partly misleading as you don't need `Evaluate` inside `Head` (since a new sub-evaluation is induced for that). I should find some time to correct it. – Leonid Shifrin Sep 12 '11 at 13:02
  • @Leonid, thanks for the nice solution to be able to still use the 'type checking' even when Unevaluated is being used. – Nasser Sep 12 '11 at 14:03
  • This solution isn't perfect either, both because it is (sometimes much) less efficient than pattern-matching against `_List` (and similar), and because it does evaluate the argument in `Head`. This does not matter in this case, but may matter if the supplied argument is code with side effects. For example, for `i=0;update[i++]`, I arguably should not increment `i` in the process of type-checking (so the value of `i` after we execute this code should still be `0`). A safer solution can be worked out, but will be more complicated and will also induce even larger overhead. – Leonid Shifrin Sep 12 '11 at 16:06

2 Answers

16

The function Unevaluated has pretty much the same effect as (temporarily) setting the Attribute HoldFirst so you could do something like

update[a_] := Module[{}, a[[1]] = 5; a]
a = Table[0, {10}];
a = update[Unevaluated[a]]

Edit

Concerning addition 1: you could add type checking by doing something like

Clear[update];
update[a_] := Module[{}, a[[1]] = 5; a] /; Head[a] == List

Then

a = Table[0, {10}];
update[Unevaluated[a]]

works as before but

b = f[1,2,3];
update[Unevaluated[b]]

just returns the last statement in unevaluated form.
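One possible variant, sketched here as an assumption rather than as part of the original answer, restricts the pattern to bare symbols and only then tests the symbol's value. An argument that is not a symbol (such as `i++`) then fails the pattern match before any test runs, so it is not evaluated during type checking. The name `updateSafe` and the pattern variable `sym` are hypothetical.

```mathematica
(* Hypothetical variant: match only a bare symbol, then check that it holds a list. *)
Clear[updateSafe];
updateSafe[sym_Symbol] := Module[{}, sym[[1]] = 5; sym] /; ListQ[sym]

a = Table[0, {10}];
updateSafe[Unevaluated[a]]    (* matches: a is a symbol whose value is a list *)

i = 0;
updateSafe[Unevaluated[i++]]  (* no match: i++ is not a symbol, so the call is returned unevaluated *)
i                             (* still 0 *)
```

The trade-off is that this only accepts variables passed by name; a literal list or a compound expression will not match the definition.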

Heike
  • +1. Strictly speaking, `Unevaluated` (as well as things like `Sequence` or `Evaluate`) are not true functions. Rather, they are special heads (forms), that affect the main evaluation sequence. In particular, they AFAIK have no built-in rules, and you will find it hard to exactly replicate their behavior with some user-defined code - they are wired in deeper than other built-ins. – Leonid Shifrin Sep 11 '11 at 09:49
  • Thank you for the answer. But I find that when I use it, I am no longer able to use the 'type checking' I had before. Please see addition 1. – Nasser Sep 12 '11 at 08:19
  • I'd rather use `===`, as in my comment above. This does not matter much since anything but explicit `True` is considered as `False` by `Condition` / pattern-matcher, so more a stylistic comment. – Leonid Shifrin Sep 12 '11 at 12:55
12

Alternatively, and if CDF allows that, you can use a pure function with a Hold*-attribute, like so:

update = Function[a, a[[1]] = 5; a, HoldFirst]

Then, you use it as usual:

In[1408]:= 
a=Table[0,{10}];
update[a];
a

Out[1410]= {5,0,0,0,0,0,0,0,0,0}

EDIT

Just for completeness, here is another way, which is less elegant, but which I found myself using from time to time, especially when you have several parameters and want to hold more than one (but such that HoldFirst or HoldRest are not good enough, such as first and third, for example): just wrap your parameter in Hold, and document it in the function's signature, like this:

updateHeld[Hold[sym_], value_] := (sym[[1]] = value; sym)

You use it as:

In[1420]:= a=Table[0,{10}];
updateHeld[Hold[a],10];
a

Out[1422]= {10,0,0,0,0,0,0,0,0,0}

EDIT 2

If your main concern is encapsulation, you can also use Module to create a persistent local variable and methods to access and modify it, like so:

Module[{a},
   updateA[partIndices__, value_] := a[[partIndices]] = value;
   setA[value_] := a = value;
   getA[] := a
]

It is still (almost) a global variable from the structural point of view, but there is no danger of name collisions with other variables, and it is easier to track where it is changed, since you can only do it by using the mutator methods above (but not directly). You use it as:

In[1444]:= 
setA[Table[0,{10}]];
updateA[1,5];
getA[]

Out[1446]= {5,0,0,0,0,0,0,0,0,0}

This is like making a simplistic JavaBean in Java - a container for mutable data (a way to encapsulate the state). You will have a slight overhead due to extra method invocations (w.r.t. Hold-attribute or Unevaluated - based methods), and in many cases you don't need it, but in some cases you may want to encapsulate the state like that - it may make your (stateful) code easier to test. Personally, I've done this a few times for UI-programming and in the code related to interfacing with a database.

In the same spirit, you can also share some variables between functions by defining those functions inside the Module scope - in this case, you may not need getter and setter methods, and such global functions with shared state are closures. You may find a much more detailed discussion of this in my third post in this MathGroup thread.
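A minimal sketch of such closures, assuming nothing beyond Module and ordinary definitions: two functions defined inside one Module share its local variable, and no getter or setter is exposed. The names `buffer`, `bumpFirst` and `bufferTotal` are hypothetical, chosen only for illustration.

```mathematica
(* Hypothetical names: buffer, bumpFirst, bufferTotal. *)
(* Both functions close over the same Module-local variable buffer. *)
Module[{buffer = Table[0, {10}]},
   bumpFirst[] := (buffer[[1]] += 1; buffer[[1]]);  (* modifies the shared list in place *)
   bufferTotal[] := Total[buffer]                   (* reads the same shared list *)
];

bumpFirst[];
bumpFirst[];
bufferTotal[]   (* 2 *)
```

The list is modified in place through the renamed local symbol, so nothing is copied into or out of the functions, and code outside the Module has no direct name for the buffer.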

Leonid Shifrin