
Possible Duplicate:
Performance difference between functions and pattern matching in Mathematica

I often find a heavy use of pure functions in a lot of the answers posted here, and often those solutions are much faster than using named patterns etc. Why is this so? Why are pure functions faster than others? Does it have to do with the mma interpreter having to do less work?

  • related http://stackoverflow.com/questions/4187822/performance-difference-between-functions-and-pattern-matching-in-mathematica/4190348#4190348 – Dr. belisarius Jun 13 '11 at 16:46
  • Damn! It's happening again. The Mma tag gets confused with "maths" – Dr. belisarius Jun 13 '11 at 16:51
  • where's @Leonid when you need him... – acl Jun 13 '11 at 17:15
  • Sorry I came in a little late :) – Leonid Shifrin Jun 13 '11 at 17:49
  • @belisarius: All of the close votes are for "exact duplicate" - I'm guessing to the link that you provided. – Simon Jun 13 '11 at 22:11
  • @Simon I've been struggling all day with close votes on my questions, cast by people who thought I was posting a math question! I don't think this one is a dup, just _related_ – Dr. belisarius Jun 13 '11 at 22:58
  • @belisarius: Yeah, I noticed. I just posted an answer on your `Abs'[x]` question. – Simon Jun 13 '11 at 22:59
  • @all: I couldn't find the dupe, which is why I asked this. My apologies for the trouble. –  Jun 14 '11 at 16:47
  • @d00b: Your question really needs clarifying with concrete examples but I believe the information you want is that built-in functions like `Sqrt` are interpreted directly whereas user defined functions like `f[x_]:=Sqrt[x]` must be resolved by looking up their corresponding `DownValues` in a global hash table (hashed on the symbol name `f`) and then applying the resulting rewrite rule. The latter is a much more complicated form of program evaluation and, consequently, is a lot slower. – J D Nov 26 '11 at 16:45
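The rule lookup that the last comment describes can be inspected directly with DownValues; a minimal illustration (symbol name f is just an example):

In[1]:= f[x_] := Sqrt[x];

In[2]:= DownValues[f]
Out[2]= {HoldPattern[f[x_]] :> Sqrt[x]}

Every call f[...] must be matched against this list of rewrite rules, whereas a built-in like Sqrt[...] is dispatched internally.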

4 Answers


First, let us consider some sample benchmarks:

In[100]:= f[x_]:=x^2;

In[102]:= Do[#^2&[i],{i,300000}]//Timing
Out[102]= {0.406,Null}

In[103]:= Do[f[i],{i,300000}]//Timing
Out[103]= {0.484,Null}

In[104]:= Do[Function[x,x^2][i],{i,300000}]//Timing
Out[104]= {0.578,Null}

Pure functions are often (much) faster, for two reasons. First, anonymous pure functions (those defined with slots, # and &) do not need to resolve name conflicts for variable names, so they are somewhat faster than pattern-defined functions, where some name-conflict resolution takes place. But you can see that pure functions with named variables are actually slower, not faster, than pattern-defined ones. I can speculate that this is because they must also resolve possible conflicts inside their body, while rule-based definitions ignore such conflicts. In any case, the speed differences are on the order of 10-20%.
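The name-conflict resolution mentioned above can be observed directly: when substituting an argument into a pure function with named variables could capture a variable, Mathematica renames the inner variable, which is work that slot-based functions never need to do. A small illustration:

In[1]:= Function[x, Function[y, x + y]][y]
Out[1]= Function[y$, y + y$]

The inner y was renamed to y$ to avoid capture. A slot-based function like #1 + #2 & has no variable names to protect, so no such bookkeeping is required.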

Another, and much more dramatic, difference shows up when these functions are used inside Map, Scan, Table, etc., because those built-ins auto-compile the mapped function on large numerical (packed) lists. While pure functions can often be compiled, pattern-defined ones fundamentally cannot, so this speed gain is inaccessible to them. For example:

In[117]:= ff[x_] := Sqrt[x];

In[116]:= Map[Sqrt[#] &, N@Range[100000]]; // Timing

Out[116]= {0.015, Null}

In[114]:= Map[ff, N@Range[100000]]; // Timing

Out[114]= {0.094, Null}
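The list-length thresholds at which Map, Table, etc. switch to auto-compilation can be inspected, and changed, via SystemOptions; a sketch, with the exact option names and default values possibly varying between versions:

In[1]:= SystemOptions["CompileOptions"]
(* returns the sub-options, including length thresholds such as "MapCompileLength" *)

In[2]:= SetSystemOptions["CompileOptions" -> "MapCompileLength" -> 50];
(* now Map auto-compiles its function for packed lists longer than 50 elements *)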
Leonid Shifrin
  • those limits (for autocompilation) may be found by SystemOptions["CompileOptions"]. They may be set by eg SetSystemOptions["CompileOptions"->"CompileReportExternal"->True]. this particular example is interesting, but irritating if you use Histogram a lot :) – acl Jun 13 '11 at 18:00
  • Good answer but I don't understand why you say "pattern-defined ones fundamentally can not"? – J D Nov 26 '11 at 15:50
  • Oh, and it might be worth pointing out that "pure function" has a well known meaning and this isn't it. :-) – J D Nov 26 '11 at 15:57
  • @Jon For pattern-defined functions, I meant - within Mathematica. You can probably write a compiler that would somehow compile Mathematica pattern-based functions into some target machine, but reproducing *all* of the semantics of Mathematica pattern-matcher seems to be a pretty hard problem (you probably know much more about this than I do). For the pure functions, I again meant what is understood by "pure function" in Mathematica. – Leonid Shifrin Nov 26 '11 at 16:14
  • @LeonidShifrin: Yes, there is a huge body of research on pattern match compilation mostly focussed on ML-style patterns but also lots of good stuff on unification. Interesting to think what I'd optimize in MMA if I had the chance... – J D Nov 29 '11 at 01:10

Pure functions have several advantages:
  • Results can be cached.
  • Computation can be safely parallelized.
  • In some cases, the result can be calculated at compile time (CTFE) and the function is never executed at all.
  • Since the outer scope isn't modified, you don't need to pass all arguments by copy.

So, if the compiler is able to manage the optimizations related to these functions, your program will be faster, whatever the language.
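This answer is language-agnostic; in Mathematica specifically, the "results can be cached" point corresponds to the standard memoization idiom, where a function stores each computed value as a new definition for itself (safe precisely because the function has no side effects). A minimal sketch:

In[1]:= fib[0] = 0; fib[1] = 1;

In[2]:= fib[n_] := fib[n] = fib[n - 1] + fib[n - 2];

In[3]:= fib[30]
Out[3]= 832040

Each call adds a cached DownValue, so re-evaluating fib[30] becomes a constant-time rule lookup instead of a recursive computation.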

deadalnix
  • mathematica does not have a compiler, or at least not of the type you have in mind :) (this is a question about pure functions in mma, not in general, unless I am mistaken) – acl Jun 13 '11 at 16:46
  • The mma interpreter can do several of those things, like not passing arguments by copy, but I see no drawback to answering the general case. – deadalnix Jun 13 '11 at 16:46
  • @acl Mathematica will compile the function in `Map`, depending on the size of the list being mapped across. (It will do the same for many other functions also, see `SystemOptions["CompileOptions"]` to get a feel for the various limits.) – Brett Champion Jun 13 '11 at 16:59
  • @Brett thanks. I've spent quite a bit of time playing with whatever internals I can access so I know that (and have decreased the limit in my init.m). maybe I am missing something, but I got the impression this answer referred to compilation as it is used in C or most Lisps etc. In short, I thought the answer isn't answering the question but a different question, regarding the difference between functional and imperative programming (hence references to safe parallelization etc) – acl Jun 13 '11 at 17:06

Actually, pattern matching seems typically faster than Function[{u}, ...] constructions, and as fast as #&-type constructions (ignoring the possibility of compiling, which has become more exciting in mma 8).

To see this define a function to time short pieces of code:

SetAttributes[timeshort, HoldAll];
timeshort[expr_] := Module[{t = Timing[expr;][[1]], tries = 1},
    While[t < 1.,
        tries *= 2;
        t = Timing[Do[expr, {tries}];][[1]]];
    Return[t/tries]]

then try this:

ClearAll[f]; f[x_] := Cos[x]
Trace[f[5.]]
f[5] // timeshort

ClearAll[f]; f = Function[x, Cos[x]]
Trace[f[5.]]
f[5] // timeshort

ClearAll[f]; f = Cos[#] &
Trace[f[5.]]
f[5] // timeshort

which give

{f[5.],Cos[5.],0.283662}
8.45641\[Times]10^-7
Function[x,Cos[x]]
{{f,Function[x,Cos[x]]},Function[x,Cos[x]][5.],Cos[5.],0.283662}
1.51906\[Times]10^-6
Cos[#1]&
{{f,Cos[#1]&},(Cos[#1]&)[5.],Cos[5.],0.283662}
8.04602\[Times]10^-7

That is, pattern matching and #& are faster than Function. I have no idea why.

EDIT: Guess I should have checked the question belisarius suggested earlier... See here for essentially the same answer as I gave, and read the comments too for some interesting discussion.

acl

Yes. It means the evaluator never has to copy much data, because a pure function can't change it. Purely functional list processing can involve a certain amount of copying, but functional algorithms are usually arranged to be efficient. Once the code gets compiled down, imperative code will (almost always) be just as quick, but for interpreted Mathematica bytecode, pure code is usually quick.

Nicholas Wilson
  • Your answer does not seem related (or only vaguely) with the tag. Please see http://www.wolfram.com/mathematica/ – Dr. belisarius Jun 13 '11 at 16:53
  • in addition to what @belisarius said, part of it is not even true when it comes to mathematica. eg run this and observe a pure function change its argument, in real time and before your very eyes :): lst = {}; Dynamic[lst] Do[Function[{u, i}, AppendTo[u, i], HoldAll][lst, i]; Pause[1];, {i, 3}] &[lst] – acl Jun 13 '11 at 16:56
  • Well, I'm not the greatest Mathematica internals guru, but I do use it quite often. I assumed the OP was thinking more of list processing algorithms, the 'build a Table, process, reduce, extract' type thing, rather than just how fast calls are to pure functions. – Nicholas Wilson Jun 13 '11 at 16:58
  • @acl, that's not a pure function. It's just written using the explicit full form. **Edit**: checked the Mathematica docs; sorry, it does indeed for some reason use the term 'pure function' to describe anonymous functions. I guess I technically can't correct you if you're using the official strange terminology, even if it is 'wrong'. – Nicholas Wilson Jun 13 '11 at 17:01
  • I know, it explicitly has side effects :) my point was that the question seems to be about the #&-type construction in mathematica, not pure functions as in side-effect-free functions, which is what you seem to have taken it to be. I don't disagree with what you wrote at all, if that is what the question is about – acl Jun 13 '11 at 17:11