Benefits of differential lists with lazy evaluation

Question

I struggle to understend why ++ is considered O(n) while differential lists are considered "O(1)".

In case of ++ let's assume it's defined as:

(++) :: [a] -> [a] -> [a]
(a:as) ++ b = a:(as ++ b)
[]     ++ b = b

Now if we need to get an access first element in a ++ b we can do it in O(1) (assuming that a can be made HNF in 1 step), similarly the second etc. It changes with appending multiple lists setting to Ω(1)/O(m), where m is number of unevaluated appendings. Accessing last element can be done with Θ(n + m), where n is length of list, unless I missed something. If we have differential list we also have access to first element in Θ(m) while last element is in Θ(n + m).

What do I miss?

True, `(xs ++ ys) !! n` is "O of n", but you are recreating the elements corresponding to `xs` as you traverse the list. `xs !! n` is also "O of n" but you are just just following pointers which is a much less expensive operation. — ErikR, Feb 01 '14 at 23:54
http://stackoverflow.com/a/13879693/849891 http://www.reddit.com/r/haskell/comments/1w5duf/demystifying_dlist/ — Will Ness, Feb 02 '14 at 11:00
possible duplicate of [Why are difference lists more efficient than regular concatenation?](http://stackoverflow.com/questions/13879260/why-are-difference-lists-more-efficient-than-regular-concatenation) — Maciej Piechotka, Feb 02 '14 at 11:08
@WillNess: I started wondering after reading the second link (but at the time images were misconfigured so message was diluted). I voted to close as duplicate for the second one. — Maciej Piechotka, Feb 02 '14 at 11:09

not my job · Answer 1 · 2014-02-02T09:44:03.940

Performance in theory

The O(1) refers to the fact that append for DLists is just (.) which takes one reduction, wheras (++) is O(n).

Worst case

++ has quadratic performance when you use it to repeatedly add to the end of an existing string, because each time you add another list you iterate through the existing list, so

"Existing long ...... answer" ++ "newbit"

traverses "Existing long ....... answer" each time you append a new bit.

On the other hand,

("Existing long ..... answer" ++ ) . ("newbit"++)

is only going to actually traverse "Existing long ...... answer" once, when the function chain is applied to [] to convert to a list.

Experience says

Years ago when I was a young Haskeller, I wrote a program that was searching for a counterexample to a conjecture, so was outputting data to disk constantly until I stopped it, except that once I took off the testing brakes, it output precisely nothing because of my left-associative tail recursive build-up of a string, and I realised my program was insufficiently lazy - it couldn't output anything until it had appended the final string, but there was no final string! I rolled my own DList (this was in the millenium preceding the one in which the DList library was written), and lo my program ran beautifully and happily churned out reams and reams of non-counterexamples on the server for days until we gave up on the project.

If you mess with large enough examples, you can see the performance difference, but it doesn't matter for small finite output. It certainly taught me the benefits of laziness.

Toy example

Silly example to prove my point:

plenty f = f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f.f

alot f = plenty f.plenty f.plenty f

Let's do the two sorts of appending, first the DList way

compose f = f . ("..and some more.."++)
append xs = xs ++ "..and some more.."

insufficiently_lazy = alot append []
sufficiently_lazy = alot compose id []

gives:

ghci> head $ sufficiently_lazy
'.'
(0.02 secs, 0 bytes)
ghci> head $ insufficiently_lazy
'.'
(0.02 secs, 518652 bytes)

and

ghci> insufficiently_lazy
    -- (much output skipped)
..and some more....and some more....and some more.."
(0.73 secs, 61171508 bytes)

ghci> sufficiently_lazy
    -- (much output skipped) 
..and some more....and some more....and some more.."
(0.31 secs, 4673640 bytes).    
    -- less than a tenth the space and half the time

so it's faster in practice as well as in theory.

score 2 · Answer 2 · answered Feb 02 '14 at 01:14

DLists are often most useful if you're repeatedly appending list fragments. To wit,

foldl1 (++) [a,b,c,d,e] == (((a ++ b) ++ c) ++ d) ++ e

is really bad while

foldr1 (++) [a,b,c,d,e] == a ++ (b ++ (c ++ (d ++ e)))

still is n steps away from the nth position. Unfortunately, you often build strings by traversing a structure and appending to the end of the accumulating string, so the left fold scenario isn't uncommon. For this reason, DLists are most useful in situations where you're repeatedly building up a string such as the Blaze/ByteString Builder libraries.

Ok. But how does the DLists perform better? Both with DList and left fold we need to wait for the final results to arrive and both require thunk/function evaluation for every append. — Maciej Piechotka, Feb 02 '14 at 01:39

score 2 · Answer 3 · edited May 23 '17 at 11:45

[After further thinking and reading other answers I believe I know what went wrong - but I don't think either explained it fully so I'm adding my own.]

Assume you had the lists a1:a2:[] b1:b2:[] and c1:c2:[]. Now you append them (a ++ b) ++ c. That gives:

(a1:a2:[] ++ b1:b2:[]) ++ c1:c2:[]

Now to take a head you need to take O(m) steps where m is number of appendings. This gives thunks as follows:

 a1:((a2:[] ++ b1:b2:[]) ++ c1:c2:[])

To give next element you need to perform m or m-1 steps (I assumed it to be free in my reasoning). So after 2m or 2m-1 steps the view is as follows:

 a1:a2(([] ++ b1:b2:[]) ++ c1:c2:[])

And so on. In worst case it gives m*n time to traverse the list as the traversal of the thunks is done each time.

EDIT - it looks like the answer to duplicate have even better pictures.

Benefits of differential lists with lazy evaluation

3 Answers3

Performance in theory

Worst case

Experience says

Toy example