Concatenation order and performance under Lazy Evaluation in Haskell

Question

Consider the following two execution orders：

a ++ (b ++ c)

and

(a ++ b) ++ c

Why is the first execution order faster than the second?

I'm new to Haskell, hoping for a detailed explanation, thanks!

How do you know it's faster? – Fyodor Soikin Mar 03 '20 at 02:27 — Fyodor Soikin, Mar 03 '20 at 02:27

Daniel Wagner · Accepted Answer · 2020-03-03T04:00:48.810

When fully evaluating the result of x ++ y, you must pay:

The cost of fully evaluating x.
The cost of fully evaluating y.
The cost of traversing x once while executing the (++) operation.

Now let's compare a ++ (b ++ c) and (a ++ b) ++ c. Let's write the cost of evaluating a as CA (and similarly CB, CC), and the cost of traversing a as TA (and similarly TB, TC). Then the cost of fully evaluating a ++ (b ++ c) is:

The cost of a, CA.
The cost of b ++ c, which is
1. CB
2. CC
3. One traversal of b, TB.
TA

This is a grand total of CA+CB+CC+TA+TB. Now, for (a ++ b) ++ c:

The cost of a ++ b, which is
1. CA
2. CB
3. TA
CC
One traversal of a ++ b, which is TA+TB.

This is a grand total of CA+CB+CC+2*TA+TB. Compared to the other order, there is an extra traversal of a, for an extra cost of TA, so this order is more expensive.

I leave it to the reader to try longer chains and start figuring out the pattern. In short, the bad association gives a quadratic amount of traversal as it redoes all the traversals it has already done, plus one more, at each invocation of (++), while the good association traverses each list at most once.

So that’s probably the reason why `++` is `infixr` and not `infixl`! — MikaelF, Mar 03 '20 at 03:56

Concatenation order and performance under Lazy Evaluation in Haskell

1 Answers1