1

Consider the following two execution orders:

a ++ (b ++ c)

and

(a ++ b) ++ c

Why is the first execution order faster than the second?

I'm new to Haskell, hoping for a detailed explanation, thanks!

Will Ness
  • 70,110
  • 9
  • 98
  • 181
Tri
  • 113
  • 7

1 Answers1

3

When fully evaluating the result of x ++ y, you must pay:

  1. The cost of fully evaluating x.
  2. The cost of fully evaluating y.
  3. The cost of traversing x once while executing the (++) operation.

Now let's compare a ++ (b ++ c) and (a ++ b) ++ c. Let's write the cost of evaluating a as CA (and similarly CB, CC), and the cost of traversing a as TA (and similarly TB, TC). Then the cost of fully evaluating a ++ (b ++ c) is:

  1. The cost of a, CA.
  2. The cost of b ++ c, which is
    1. CB
    2. CC
    3. One traversal of b, TB.
  3. TA

This is a grand total of CA+CB+CC+TA+TB. Now, for (a ++ b) ++ c:

  1. The cost of a ++ b, which is
    1. CA
    2. CB
    3. TA
  2. CC
  3. One traversal of a ++ b, which is TA+TB.

This is a grand total of CA+CB+CC+2*TA+TB. Compared to the other order, there is an extra traversal of a, for an extra cost of TA, so this order is more expensive.

I leave it to the reader to try longer chains and start figuring out the pattern. In short, the bad association gives a quadratic amount of traversal as it redoes all the traversals it has already done, plus one more, at each invocation of (++), while the good association traverses each list at most once.

Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380