It comes down to how lists and ++
is implemented. You can think of lists being implemented like
data List a = Empty | Cons a (List a)
Just replace []
with Empty
and :
with Cons
. This is a very simple definition of a singly linked list in Haskell. Singly linked lists have a concatenation time of O(n)
, with n
being the length of the first list. To understand why, recall that for linked lists you hold a reference to the head or first element, and in order to perform any operation you have to walk down the list, checking each value to see if it has a successor.
So for every list concatenation, the compiler has to walk down the entire length of the first list. If you have the lists a
, b
, c
, and d
with the lengths n1
, n2
, n3
, and n4
respectively, then for the expression
((a ++ b) ++ c) ++ d
It first walks down a
to construct a ++ b
, then stores this result as x
, taking n1
steps since a
has n1
elements. You're left with
(x ++ c) ++ d
Now the compiler walks down x
to construct x ++ c
, then stores this result as y
in n1 + n2
steps (it has to walk down elements of a
and b
this time). you're left with
y ++ d
Now y
is walked down to perform the concatenation, taking n1 + n2 + n3
steps, for a total of n1 + (n1 + n2) + (n1 + n2 + n3) = 3n1 + 2n2 + n3
steps.
For the expression
a ++ (b ++ (c ++ d))
The compiler starts at the inner parentheses, construction c ++ d -> x
in n3
steps, resulting in
a ++ (b ++ x)
Then b ++ x -> y
in n2
steps, resulting in
a ++ y
Which is finally collapsed in n1
steps, with a total number of steps as n3 + n2 + n1
, which is definitely fewer than 3n1 + 2n2 + n3
.