11

There are two obvious ways to structure a linked list in Mathematica, "left":

{1, {2, {3, {4, {5, {6, {7, {}}}}}}}}

And "right":

{{{{{{{{}, 7}, 6}, 5}, 4}, 3}, 2}, 1}

These can be made with:

toLeftLL = Fold[{#2, #} &, {}, Reverse@#] & ;

toRightLL = Fold[List, {}, Reverse@#] & ;

If I use these, and do a simple ReplaceRepeated to walk through the linked list, I get drastically different Timing results:

r = Range[15000];
left = toLeftLL@r;
right = toRightLL@r;

Timing[i = 0; left //. {head_, tail_} :> (i++; tail); i]
Timing[i = 0; right //. {tail_, head_} :> (i++; tail); i]

(* Out[6]= {0.016, 15000} *)

(* Out[7]= {5.437, 15000} *)

Why?

Mr.Wizard
  • 24,179
  • 5
  • 44
  • 125
  • I guess it can be faster because of tail call optimization. – Andrey Apr 27 '11 at 00:29
  • Check this one: http://stackoverflow.com/questions/4481301/tail-call-optimization-in-mathematica – Andrey Apr 27 '11 at 00:33
  • @Mr. Wizard: Could you break down the RHS of your `RuleDelayed`. Although I _think_ I sort of see how it walks through the list, it's not entirely clear. Also, if I replace `tail` in the RHS with `tail-tail+tail`, I get an error: `$RecursionLimit::reclim: Recursion depth of 256 exceeded. >>` and need to abort. Why doesn't mma figure out that `tail-tail+tail=tail` and return the same result as before? – abcd Apr 27 '11 at 01:44
  • @yoda, for a "left" list, `{head_, tail_} :> (i++; tail)` increments `i` and returns the rest of the linked list, without the first element (head), e.g. `{2, {3, {4, {5, {6, {7, {}}}}}}}` if used on the first list in my question. I increment `i` simply to prove that this replacement took place 15,000 times in each case. The pattern `head_` was used only for clarity and could be replaced with `_` just as well. Since `tail` is a list structure, and arithmetic operations thread through such trees, you are doing up to 14,999 operations rather than one with each `+` or `-`. – Mr.Wizard Apr 27 '11 at 02:09
  • aah `//.`!! I wasn't careful in noticing it and was trying to wrap my head around how the walk-through is done with `/.` That didn't make much sense! Now that I see it, it's clear! Thanks for the explanation on the second part of the comment. – abcd Apr 27 '11 at 04:01
  • @Mr. Wizard: Just bumped you up to 4k :) – abcd Apr 27 '11 at 19:46
  • @yoda oh, it was you! Thank you! :-) – Mr.Wizard Apr 27 '11 at 20:07
  • Interesting behavior. The stuff you guys come up with keeps me entertained. :) – telefunkenvf14 Jun 20 '11 at 21:11

1 Answers1

8

ReplaceRepeated uses SameQ to determine when to stop applying the rule.

When SameQ compares two lists, it checks lengths, and if the same, then applies SameQ to elements from the first to the last. In the case of left the first element is an integer, so it is easy to detect distinct lists, while for right list the first element is the deeply nested expression, so it needs to traverse it. This is the reason for the slowness.

In[25]:= AbsoluteTiming[
 Do[Extract[right, ConstantArray[1, k]] === 
   Extract[right, ConstantArray[1, k + 1]], {k, 0, 15000 - 1}]]

Out[25]= {11.7091708, Null}

Now compare this with:

In[31]:= Timing[i = 0; right //. {tail_, head_} :> (i++; tail); i]

Out[31]= {5.351, 15000}


EDIT In response to Mr.Wizard's question of options to speed this up. One should write a custom same testings. ReplaceRepeated does not provide such an option, so we should use FixedPoint and ReplaceAll:
In[61]:= Timing[i = 0; 
 FixedPoint[(# /. {tail_, _} :> (i++; tail)) &, right, 
  SameTest -> 
   Function[
    If[ListQ[#1] && ListQ[#2] && 
      Length[#1] == 
       Length[#2], (#1 === {} && #2 === {}) || (Last[#1] === 
        Last[#2]), #1 === #2]]]; i]

Out[61]= {0.343, 15000}


EDIT2: Faster yet:
In[162]:= Timing[i = 0; 
 NestWhile[Function[# /. {tail_, head_} :> (i++; tail)], right, 
  Function[# =!= {}]]; i]

Out[162]= {0.124, 15000}
Sasha
  • 5,935
  • 1
  • 25
  • 33
  • Why do you think that `SameQ` is involved here? Turning on tracing of `SameQ` does not show any calls to it: `On[SameQ];`. – Alexey Popkov Apr 27 '11 at 03:04
  • 2
    `On[SameQ]` will only shows evaluations of `SameQ` symbol. `ReplaceRepeated` does not call the evaluator for efficiency when determining that `ReplaceRepeated` should terminate. – Sasha Apr 27 '11 at 03:09
  • That's what I call an author-itative answer – Dr. belisarius Apr 27 '11 at 03:40
  • @Alexey Sacha perhaps is not _thinking_ : http://www.linkedin.com/pub/oleksandr-pavlyk/4/403/39a – Dr. belisarius Apr 27 '11 at 03:41
  • @Sasha Is it possible for a user to define functions which use such functions as `SameQ` without calling the evaluator? – Alexey Popkov Apr 27 '11 at 03:52
  • Interesting! Is there a way to speed this up? A different pattern perhaps? – Mr.Wizard Apr 27 '11 at 03:52
  • As I consider this, I realize it is quite far reaching. Is there a way to affect the evaluation order of SameQ (or its internal counterpart)? I may have to revise some of my programs in light of this. – Mr.Wizard Apr 27 '11 at 04:04
  • 1
    @Alexey No, all user provided code is executed through the evaluator. This is how the language interpreter works. – Sasha Apr 27 '11 at 04:12
  • 1
    @Mr.Wizard I have given one possibility to speed up the code in my edit to the answer. – Sasha Apr 27 '11 at 04:13
  • @Sasha Does the evaluator also use `SameQ` to determine whether the evaluation should be finished because the expression does not change anymore? – Alexey Popkov Apr 27 '11 at 04:20
  • @Alexey Evaluator stops when it sees that no transformations apply. It does not need to use the `SameQ`. When a rule has been applied, the entire result gets evaluated until no more rules can be applied. I think this is very nicely written in [Principles of Evaluation](http://reference.wolfram.com/mathematica/tutorial/PrinciplesOfEvaluation.html) – Sasha Apr 27 '11 at 04:32
  • @Sasha As I understand in such cases as for example `a + b` when `a` and `b` are undefined *Mathematica* has some rules to be applied to this expression and infinite evaluation does no occur because it also checks whether applying the rules changes the expression. [Here](http://reference.wolfram.com/mathematica/tutorial/ControllingInfiniteEvaluation.html) is stated: "*Mathematica* may think that the evaluation is finished because the expression did not change." How *Mathematica* determines that the expression did not change? – Alexey Popkov Apr 27 '11 at 04:50
  • @Sasha I just realized that the statement ["*Mathematica* may think that the evaluation is finished because the expression did not change."](http://reference.wolfram.com/mathematica/tutorial/ControllingInfiniteEvaluation.html) is probably about *pattern rule* expression, not about the result of the evaluation. It is just an ambiguous statement. – Alexey Popkov Apr 27 '11 at 05:06
  • 1
    You can get the same performance as the "left" case if you use standard evaluation instead of "ReplaceRepeated": `loop@{t_,h_}:=(i++;loop@t);Block[{$RecursionLimit=Infinity},Timing[i=0;loop@right;i]]` – WReach Apr 27 '11 at 18:04
  • @WReach Indeed, which is why I was quite surprised to see the massive slowdown with `ReplaceRepeated`; I had not considered the traversal path of comparison. – Mr.Wizard Apr 27 '11 at 20:21