
I can't sleep! :)

I've written a small program that builds a doubly linked list in Haskell. The language property that made it possible was lazy evaluation (see the code below). My question is: can I do the same in a pure functional language with eager evaluation, or not? In any case, what properties must an eager functional language have to be able to build such a structure (impurity?)?

data DLList a = DLNull |
    DLNode { prev :: DLList a
           , x :: a
           , next :: DLList a
           }
  deriving (Show)

walkDLList :: (DLList a -> DLList a) -> DLList a -> [a]
walkDLList _ DLNull = []
walkDLList f n@(DLNode _ x _)  = x : walkDLList f (f n)

-- Returns first and last items.
makeDLList :: [a] -> (DLList a, DLList a)
makeDLList xs = step DLNull xs
  where
    step prev [] = (DLNull, prev)
                         -- Here I use laziness. 'next' is not built yet, it's a thunk.
    step prev (x : xs) = let this = DLNode prev x next 
                             (next, last) = step this xs
                         in (this, last)

testList :: [Int] -> IO ()
testList l = let
    (first, last) = makeDLList l
    byNext = walkDLList next first
    byPrev = walkDLList prev last
  in do
    putStrLn $ "Testing: " ++ show l
    print byNext
    print byPrev

main :: IO ()
main = do
  testList []
  testList [1, 2, 3, 4]
demi
  • check out [*this* code](http://rosettacode.org/wiki/Doubly-Linked_List_%28traversal%29#Haskell). :) It returns just the head node, and is exceptionally simple. – Will Ness Feb 14 '12 at 22:51

3 Answers


A doubly-linked list can be implemented in a purely functional way in an eager language as a zipper on a singly-linked list. See, for example, Rosetta Code > Doubly-linked list > OCaml > Functional.
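To make the idea concrete, here is a minimal sketch of such a zipper in Haskell (the names `Zipper`, `fwd`, and `bwd` are illustrative, not taken from the linked OCaml code). Note that nothing here relies on laziness, so it would work unchanged under eager evaluation:

```haskell
-- A list zipper: the focused element, the elements before it
-- (stored in reverse order), and the elements after it.
data Zipper a = Zipper [a] a [a]
  deriving (Show, Eq)

fromList :: [a] -> Maybe (Zipper a)
fromList []       = Nothing
fromList (x : xs) = Just (Zipper [] x xs)

-- Move the focus one step in either direction;
-- Nothing signals the end of the "list".
fwd, bwd :: Zipper a -> Maybe (Zipper a)
fwd (Zipper ls x (r : rs)) = Just (Zipper (x : ls) r rs)
fwd _                      = Nothing
bwd (Zipper (l : ls) x rs) = Just (Zipper ls l (x : rs))
bwd _                      = Nothing
```

Every move is plain data shuffling on two singly-linked lists, and `bwd` after `fwd` restores the original zipper, which is exactly the bidirectional traversal a doubly-linked list provides.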

Dan Burton
  • I think the point of a doubly-*linked* list is that it does **not** allocate new storage on traversal, unlike the code you linked to, Dan. There, `prev (next alist)` is not the same storage as `alist`, unlike the usual C implementation, say, and also the Haskell code from the question above, where after `let (alist, zlist) = makeDLList xs` we have `prev (next alist) == alist` and `next (prev zlist) == zlist`, where `==` means *"the same actual storage"*, or in Lisp parlance *"`eq`-equivalent"* (for non-empty `xs`, of course). – Will Ness Feb 14 '12 at 20:14
  • @WillNess regarding "not allocat[ing] new storage on traversal", that is an implementation detail, and comes at a cost. You cannot "insert" into an immutable doubly-linked list (in the style given by OP) without making a copy of the entire list. Well, you can usually get away with less copying due to laziness, but the point is the new doubly-linked list cannot share any nodes with the old doubly-linked list. The immutable zipper implementation allows more sharing, though at the cost of more storage required for traversal (but not *much* more, again due to the ability to share). – Dan Burton Feb 14 '12 at 22:38
  • @WillNess n.b. Haskell makes no guarantees about `eq`-ness. GHC's parallel garbage collector takes advantage of this, and can actually *remove* sharing that you might expect to be there. This is rare, and barely affects performance, though, and due to immutable data, it will never affect your results. – Dan Burton Feb 14 '12 at 22:41
  • I'd expect that if the same name is used twice, (a pointer to) the same value would be used in each place: in `let {this = DLNode prev x next ; (next, last) = step this xs}`, `this` is `this`, right? Isn't that what *laziness* means? One name, one (pointer to an) entity, no? You're quite right about the whole-list copy on inserts. Maybe a zipper is better in some situations, and a DL-list in others. Still, they're different. – Will Ness Feb 14 '12 at 23:29
  • Also, if GHC removed sharing between two uses of the same variable, how would that impact the various corecursive definitions which rely on it, e.g. `fibs = 0 : scanl (+) 1 fibs`? Would they even work, or suffer a drastic loss of efficiency? – Will Ness Feb 15 '12 at 08:01
  • @WillNess fib # *n* is typically shared by fib # *n+1* and fib # *n+2*. Suppose the garbage collector removed this sharing, making two copies of the thunk for fib # *n*. There would be minimal loss of efficiency, since the two copies of fib # *n* would probably still share fib # *n-1* and fib # *n-2*. The only loss in efficiency is that *that particular node* would be calculated twice, and the cost per node of a fib calculation is fairly trivial. The same name used twice always represents the same *value*, but perhaps surprisingly, does not *have* to be the same pointer (though it usually is). – Dan Burton Feb 15 '12 at 17:59
  • You suggest only some cells in a fib list might get "unshared", once in a long while. But there is a difference between a pointer to a node's value cell and a pointer to the node itself. The *latter* can't be unshared, I suspect, or else the whole list would get split after that point, and all the nodes after it would have to be calculated twice, leading potentially to the terrible inefficiency of the naive doubly-recursive version. Same here: `this` denotes the node itself, so it **must** get shared, i.e. **be the same**. Without that, any code we write becomes a house of cards. – Will Ness Feb 16 '12 at 09:39
  • Consider this, e.g.: `fixc f = f (fixc f)` vs. `fixs f = x where x = f x`. The first is *"copying"*, the second is *"sharing"*. This is essential, e.g., here: `fibs = fix ((0:) . (1:) . next) where next (a:t@(b:_)) = (a+b) : next t`. With the "sharing" `fix` it runs in linear time as expected, but with the "copying" one it runs in quadratic time. – Will Ness Feb 16 '12 at 09:48

As long as a language has something like closures, lambdas, etc., you can always simulate laziness. You could rewrite that code even in Java (without mutating variables, etc.); you just need to wrap every "lazy" operation in something like

interface Thunk<A> {
   A eval();  
}

Of course this would look terrible, but it is possible.
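To get actual lazy evaluation (evaluate at most once) rather than call-by-name, the thunk must also cache its result. As a sketch of the machinery required, here it is written in Haskell with an explicit mutable cell (the names `Thunk`, `delay`, and `force` are mine, not a library API):

```haskell
import Data.IORef

-- An explicit thunk holding either the pending computation
-- or its cached result.
newtype Thunk a = Thunk (IORef (Either (IO a) a))

-- Suspend a computation without running it.
delay :: IO a -> IO (Thunk a)
delay act = Thunk <$> newIORef (Left act)

-- Demand the value: run the computation on first use,
-- return the cached result on every later use.
force :: Thunk a -> IO a
force (Thunk ref) = do
  state <- readIORef ref
  case state of
    Right v  -> return v                 -- cached: no recomputation
    Left act -> do
      v <- act                           -- first demand: run once
      writeIORef ref (Right v)           -- cache for later forces
      return v
```

An eager language would need the same ingredients: a way to suspend a computation (a closure) and something updatable to record that it has already been run.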

Landei
  • You need something updatable to get the effect of lazy evaluation (evaluate at most once), otherwise you'll be emulating call-by-name. – augustss Feb 14 '12 at 15:39
  • So, augustss, your point is that closures and lambdas alone are not enough? – demi Feb 14 '12 at 16:48
  • 2
    I think he just refers to that the result of `eval` needs to be cached, so the thunk is not recomputed should `eval` be run multiple times. – danr Feb 14 '12 at 17:22

In the non-backtracking subset of Prolog, which can be seen as an explicitly set-once, eager, pure functional language, you can build doubly-linked lists easily. It's referential transparency that makes this hard in Haskell: it forbids Prolog's explicit setting of named, not-yet-set logical variables, and instead forces Haskell to achieve the same effect in the warped way of "tying the knot". I think.

Plus, there really isn't much difference between Haskell's guarded recursion under lazy evaluation and Prolog's open-ended lists built in tail-recursion-modulo-cons fashion. IMO. Here, for instance, is an example of lazy lists in Prolog, where memoized shared storage is used as a universal access mediator, so the results of previous calculations can be cached.

Come to think of it, you can use C in a restricted manner as an eager pure functional language, if you never reset any variable or any pointer once it is set. You still have null pointers, just as Prolog has unbound variables, so it too is explicitly set-once. And of course you can build doubly-linked lists with it.

So the only question that remains is, do you admit such set-once languages as pure?

Will Ness