Consider the following two execution orders:
a ++ (b ++ c)
and
(a ++ b) ++ c
Why is the first execution order faster than the second?
I'm new to Haskell, hoping for a detailed explanation, thanks!
Consider the following two execution orders:
a ++ (b ++ c)
and
(a ++ b) ++ c
Why is the first execution order faster than the second?
I'm new to Haskell, hoping for a detailed explanation, thanks!
When fully evaluating the result of x ++ y
, you must pay:
x
.y
.x
once while executing the (++)
operation.Now let's compare a ++ (b ++ c)
and (a ++ b) ++ c
. Let's write the cost of evaluating a
as CA (and similarly CB, CC), and the cost of traversing a
as TA (and similarly TB, TC). Then the cost of fully evaluating a ++ (b ++ c)
is:
a
, CA.b ++ c
, which is
b
, TB.This is a grand total of CA+CB+CC+TA+TB. Now, for (a ++ b) ++ c
:
a ++ b
, which is
a ++ b
, which is TA+TB.This is a grand total of CA+CB+CC+2*TA+TB. Compared to the other order, there is an extra traversal of a
, for an extra cost of TA, so this order is more expensive.
I leave it to the reader to try longer chains and start figuring out the pattern. In short, the bad association gives a quadratic amount of traversal as it redoes all the traversals it has already done, plus one more, at each invocation of (++)
, while the good association traverses each list at most once.