What scope should calls to lazy-seq have?

Question

I'm writing a lazy implementation of the Recamán's Sequence, and ran into some confusion regarding where calls to lazy-seq should happen.

This first version I came up with this morning was:

(defn lazy-recamans-sequence []
  (let [f (fn rec [n seen last-s]
            (let [back (- last-s n)
                  new-s (if (and (pos? back) (not (seen back)))
                          back
                          (+ last-s n))]
              (lazy-seq ; Here
                (cons new-s (rec (inc n) (conj seen new-s) new-s)))))]
    (f 0 #{} 0)))

Then I realized that my placement of lazy-seq was kind of arbitrary, and that it could be placed higher to wrap more of the computations:

(defn lazy-recamans-sequence2 []
  (let [f (fn rec [n seen last-s]
            (lazy-seq ; Here
              (let [back (- last-s n)
                    new-s (if (and (pos? back) (not (seen back)))
                            back
                            (+ last-s n))]
                  (cons new-s (rec (inc n) (conj seen new-s) new-s)))))]
    (f 0 #{} 0)))

Then I looked back on a review that someone gave me last night:

(defn recaman []
  (letfn [(tail [previous n seen]
            (let [nx (if (and (> previous n) (not (seen (- previous n))))
                       (- previous n)
                       (+ previous n))]
              ; Here, inside "cons"
              (cons nx (lazy-seq (tail nx (inc n) (conj seen nx))))))]
    (tail 0 0 #{})))

And they have theirs inside of the call to cons!

Thinking this over, it seems like it wouldn't make a difference. With a broader scope (like the second version), more code is inside the explicit function that's passed to LazySeq. With a narrower scope however, the function itself may be smaller, but since the passed function involves a recursive call, it will be executing the same code anyways.

They seem to preform nearly identically and give the same answers. Is there any reason to prefer placing lazy-seq in one place over another? Is this simply a stylistic choice, or can this have actual repercussions?

You might take a look at my answer: https://stackoverflow.com/a/44102122/597473 - in the section titled 'Is there any practical difference with this'. — Piotrek Bzdyl, Aug 14 '18 at 16:17
@PiotrekBzdyl Mm, that last bit makes sense. So, if each element is cheap to calculate and/or you don't care if one element is unnecessarily calculated, there isn't really a difference? But for more expensive calculations, as much as possible should be wrapped to delay computations until absolutely necessary? — Carcigenicate, Aug 14 '18 at 16:36
I would say make it fully lazy by default unless you need (i doubt it) this semantic: the first element of the produced lazy seq must be realized at the same time the lazy seq is created. Otherwise the difference will be one extra call to calculate the first element which should be cheap. — Piotrek Bzdyl, Aug 14 '18 at 16:58
@PiotrekBzdyl Good call, thanks. If you want to make an answer for specifically this question, I'd be happy to accept it. If not, when I get back from a hike I can write something up, or wait and see if someone can write a good description of the differences. — Carcigenicate, Aug 14 '18 at 17:08
I think the answer I have mentioned already covers the differences but feel free to provide your own answer. — Piotrek Bzdyl, Aug 14 '18 at 17:10

score 3 · Accepted Answer · answered Aug 14 '18 at 21:27

In the first two examples the lazy-seq wraps the cons call. This means that when you generate call the function you return a lazy sequence immediately without calculating the first item of the sequence.

In the first example the let expression is still outside of lazy-seq so the value of the first item is calculated immediately but the returned sequence is still lazy and not realized.

The second example is similar to the first. The lazy-seq wraps the cons cell and also the let block. This means that the function will return immediatetly and the value of the first item is calculated only when the caller starts to consume the lazy sequence.

In the third example the value of the first item in the list is calculated immediately and only the tail of the returned sequence is lazy.

Is there any reason to prefer placing lazy-seq in one place over another?

It depends on what you want to achieve. Do you want to return a sequence immediately without calculating any values? In this case make the scope of lazy-seq as broad as possible. Otherwise try to restrict the scope of lazy-seq to calculate only the tail part of the sequence.

I would argue that the second example is the one that's different, not the third. In example #1 you claim we "return a lazy sequence immediately without calculating the first item of the sequence", but in fact the first item is calculated before returning the lazy-seq. It's just stored in the closure of the lazy-seq instead of being stored in a cons cell. The first example is equivalent to the third in terms of how much work is done when. — amalloy, Aug 15 '18 at 22:25

score 0 · Answer 2 · answered Aug 14 '18 at 22:12

When I was first learning Clojure, I was a bit confused by the many possible choices of lazy-seq constructs, the lack of clarity in terms of which construct to choose, and the somewhat vague explanation for how lazy-seq creates laziness in the first place (it is implemented as a Java class of ~240 lines).

To reduce repetition and keep things as simple as possible, I created the lazy-cons macro. It is used like so:

(defn lazy-countdown [n]
  (when (<= 0 n)
    (lazy-cons n (lazy-countdown (dec n)))))

(deftest t-all
  (is= (lazy-countdown  5) [5 4 3 2 1 0] )
  (is= (lazy-countdown  1) [1 0] )
  (is= (lazy-countdown  0) [0] )
  (is= (lazy-countdown -1) nil ))

This version does realize the initial value n immediately.

I never worry about chunking (typically batches of 32) or trying to precisely control the number of elements realized in a lazy sequence. IMHO, if you need fine-grained control such as this, it is better to use an explicit loop than to make assumptions on the timing of realizations in a lazy sequence.

What scope should calls to lazy-seq have?

2 Answers2