26

Why does cons work in this context with lazy-seq, but conj doesn't?

This works:

(defn compound-interest [p i]
   (cons p (lazy-seq (compound-interest (* p (+ 1 i)) i))))

This doesn't (it gives a stack overflow exception):

(defn compound-interest2 [p i]
   (conj (lazy-seq (compound-interest2 (* p (+ 1 i)) i)) p))
Ryan M
  • 18,333
  • 31
  • 67
  • 74
caleb
  • 2,687
  • 30
  • 25
  • 1
    For others who come here: my answer below is very detailed (perhaps too detailed) and may be confusing. Let me repeat the main point here: the semantics of `conj` require it to vary its behavior based on the collection type. That requires using a (polymorphic) method call on the collection object. A `LazySeq` handles that method call by delegating to its internal value, which requires realizing that internal value. In contrast, the semantics of cons don't require it to call any methods on the collection; it just has to store it in one field of a `Cons` object. – Alex D Sep 12 '12 at 18:26
  • The last three paragraphs of the answer are particularly helpful. Some people might want to read them first. – Mars Feb 15 '14 at 20:08

1 Answers1

47

(conj collection item) adds item to collection. To do that, it needs to realize collection. (I'll explain why below.) So the recursive call happens immediately, rather than being deferred.

(cons item collection) creates a sequence which begins with item, followed by everything in collection. Significantly, it doesn't need to realize collection. So the recursive call will be deferred (because of using lazy-seq) until somebody tries to get the tail of the resulting sequence.

I'll explain how this works internally:

cons actually returns a clojure.lang.Cons object, which is what lazy sequences are made of. conj returns the same type of collection which you pass it (whether that is a list, vector, or whatever else). conj does this using a polymorphic Java method call on the collection itself. (See line 524 of clojure/src/jvm/clojure/lang/RT.java.)

What happens when that Java method call happens on the clojure.lang.LazySeq object which is returned by lazy-seq? (How Cons and LazySeq objects work together to form lazy sequences will become clearer below.) Look at line 98 of clojure/src/jvm/clojure/lang/LazySeq.java. Notice it calls a method called seq. This is what realizes the value of the LazySeq (jump to line 55 for the details).

So you could say that conj needs to know exactly what kind of collection you passed it, but cons doesn't. cons just requires that the "collection" argument is an ISeq.

Note that Cons objects in Clojure are different from "cons cells" in other Lisps -- in most Lisps, a "cons" is just an object which holds 2 pointers to other arbitrary objects. So you can use cons cells to build trees, and so on. A Clojure Cons takes an arbitrary Object as head, and an ISeq as tail. Since Cons itself implements ISeq, you can build sequences out of Cons objects, but they can just as well point to vectors, or lists, etc. (Note that a "list" in Clojure is a special type (PersistentList), and is not built from Cons objects.) clojure.lang.LazySeq also implements ISeq, so it can be used as the tail ("cdr" in Lisps) of a Cons. A LazySeq holds a reference to some code which evaluates to an ISeq of some kind, but it doesn't actually evaluate that code until required, and after it does evaluate the code, it caches the returned ISeq and delegates to it.

...is this all starting to make sense? Do you get the idea of how lazy sequences work? Basically, you start with a LazySeq. When the LazySeq is realized, it evaluates to a Cons, which points to another LazySeq. When that one is realized... you get the idea. So you get a chain of LazySeq objects, each holding (and delegating to) a Cons.

About the difference between "conses" and "lists" in Clojure, "lists" (PersistentList objects) contain a cached "length" field, so they can respond to count in O(1) time. This wouldn't work in other Lisps, because in most Lisps, "lists" are mutable. But in Clojure they are immutable, so caching the length works.

Cons objects in Clojure don't have a cached length -- if they did, how could they be used to implement lazy (and even infinite) sequences? If you try to take the count of a Cons, it just calls count on its tail, and then increments the result by 1.

Salam
  • 3
  • 3
Alex D
  • 29,755
  • 7
  • 80
  • 126
  • Do you mean realizing item :) – Ivan Koblik Sep 12 '12 at 14:22
  • @Ivan: No -- the order of arguments for `cons` and `conj` are opposite. – Alex D Sep 12 '12 at 14:25
  • Yeap sorry I thought that he wrote `(conj p (lazy-seq (compound-interest2 (* p (+ 1 i)) i)))`. – Ivan Koblik Sep 12 '12 at 14:35
  • 1
    So because `conj` works for all seqs (not just lists), it makes the blanket assumption that the collection must be realized (which would be needed if it were, say, a vector). In theory, if the collection were a list though, `conj` could potentially work the same as `cons` right? (Or am I missing something?) – caleb Sep 12 '12 at 14:48
  • 1
    If `conj` was specialized to work only on lists, you couldn't use it on `LazySeq`s anyways, so the whole question would become irrelevant. Basically what it comes down to is: the semantics of `conj` require it to vary its behavior based on the collection type. That requires using a (polymorphic) method call on the collection object. `LazySeq` handles that method call by delegating to its internal value, which requires realizing that internal value. In contrast, the semantics of `cons` don't require it to call any methods on the collection; it just has to store it in one field of a `Cons`. – Alex D Sep 12 '12 at 18:23
  • 2
    Great answer. Among other things, it also provides an answer to another question: Why does Clojure have Cons, if it's already got PersistentLists, which have a length field missing from Cons'es? – Mars Nov 18 '13 at 16:38
  • @Mars isn't it because the missing length field allows for lazy evaluation? Just from what I read in the answer. – Rob Grant Mar 04 '14 at 10:16
  • "... so they can respond to (count) in O(1) time."---are you sure? – 象嘉道 Dec 17 '15 at 21:44
  • @xando I haven't read the Clojure platform source code for a while. It may have changed in the meantime. Last time I read it, this is how it worked. – Alex D Dec 19 '15 at 04:20