3

I am just learning Clojure and, as usual when lerning new programming languages, one of the first things I tried is implementing the Sieve of Eratosthenes.

I came up with the following solution:

(defn primes
 "Calculate all primes up to the given number"
 [n]
 (loop
   [
    result []
    numbers (range 2 (inc n))
    ]
   (if (empty? numbers)
     result
     (let [[next & rest] numbers]
       (recur (conj result next) (filter (fn [n] (not= 0 (mod n next))) rest)))
     )
   )
 )

It works fine and quite fast for small numbers but for large inputs a StackOverflowError is raised with a suspiciously short stacktrace, eg.:

(primes 100000)
Execution error (StackOverflowError) at (REPL:1).
null
(pst)
StackOverflowError 
    clojure.lang.LazySeq.sval (LazySeq.java:42)
    clojure.lang.LazySeq.seq (LazySeq.java:51)    
    clojure.lang.RT.seq (RT.java:531)
    clojure.core/seq--5387 (core.clj:137)    
    clojure.core/filter/fn--5878 (core.clj:2809)
    clojure.lang.LazySeq.sval (LazySeq.java:42)
    clojure.lang.LazySeq.seq (LazySeq.java:51)
    clojure.lang.RT.seq (RT.java:531)
    clojure.core/seq--5387 (core.clj:137)
    clojure.core/filter/fn--5878 (core.clj:2809)
    clojure.lang.LazySeq.sval (LazySeq.java:42)
    clojure.lang.LazySeq.seq (LazySeq.java:51)
=> nil

I was under the impression that recur implements tail recursion if it is evaluated last in a loop form and my first question is if this is really the case here. My second question is why the stack trace is so short for a StackOverflowError. I also have problems interpreting the stacktrace, ie. what line corresponds to what form.

I am only interested in better or more Clojure-like solutions if they provide insights for these questions, since otherwise I would like to find them by myself. Thank you!

Mark Schäfer
  • 925
  • 1
  • 12
  • 23
  • 1
    The thing is filter is lazy, meaning that it doesn't call the filter function until the first element is needed. It stacks the filter function calls, meaning you would get 100000 recursive calls, blowing the stack. So, what you need is the eager filter, namely changing filter to filterv. Should do the trick – leetwinski Feb 20 '20 at 20:53

2 Answers2

4

Slightly modified, with comments to describe what is happening on each line, this is your function:

(defn primes
  "Calculate all primes up to the given number"
  [n]
  ;; `loop` is not lazy, it runs until it produces a result:
  (loop [result []
         ;; a lazy sequence implemented by clojure.lang.LongRange:
         numbers (range 2 (inc n))]
    (if (not (nil? (seq numbers)))
      result
      (let [current (first numbers)
            remaining (rest numbers)]
        (recur
         ;; `conj` on a vector returns a vector (non-lazy):
         (conj result current)
         ;; `filter` on a lazy sequence returns a new lazy sequence:
         (filter (fn [n] (not= 0 (mod n next)))
                 remaining))))))

The key is that filter at the end.

Most lazy sequence operations such as filter work by wrapping one lazy sequence in another. On each iteration of the loop, filter adds another layer of lazy sequence, like this:

(filter (fn [n] (not= 0 (mod n 5)))       ; returns a LazySeq
  (filter (fn [n] (not= 0 (mod n 4)))     ; returns a LazySeq
    (filter (fn [n] (not= 0 (mod n 3)))   ; returns a LazySeq
      (filter (fn [n] (not= 0 (mod n 2))) ; returns a LazySeq
        remaining))))

The LazySeq objects stack up, each one holding a reference to the previous.

With most lazy sequences, the wrapping doesn't matter because they automatically "unwrap" as soon as you request a value. That happens in LazySeq.seq.

This is one case where it does matter, because your loop builds up so many layers of lazy sequence objects that the nested calls to LazySeq.seq and .sval overflow the maximum stack size allowed by the JVM. That's what you see in the stacktrace.

(This also has memory implications, since a reference to the start of the sequence prevents any of the others from being garbage-collected, what Clojure programmers call "holding on to the head" of the sequence.)

The more general issue with this function is mixing lazy (filter) and non-lazy (loop) operations. That's often a source of problems, so Clojure programmers learn to avoid it out of habit.

As Alan suggests, you can avoid the problem by using only non-lazy operations, such as filterv instead of filter, which forces the lazy sequence into a vector.

Almost any style of lazy evaluation can exhibit some variation of this problem. I described it in Clojure don'ts: concat. For another example see foldr versus foldl in Haskell.

Even without laziness, deeply-nested object trees can cause a StackOverflow, for examples in Java I found xstream#88 or circe#1074.

Stuart Sierra
  • 10,837
  • 2
  • 29
  • 35
  • Thank you very much for your detailed explanation, Stuart. I already read your Clojure dont's post as Alan linked it in his answer. I think I understand the problem quite well now. As a future expert Clojure developer I will at least avoid this particular problem ;-) – Mark Schäfer Feb 21 '20 at 06:58
2

Here is a version that works:

(ns tst.demo.core
  (:use tupelo.core tupelo.test))

(defn primes
  "Calculate all primes up to the given number"
  [n]
  (loop [result  []
         numbers (range 2 (inc n))]
    (if (empty? numbers)
      result
      (let [[new-prime & candidate-primes] numbers]
        (recur
          (conj result new-prime)
          (filterv (fn [n] (not= 0 (mod n new-prime)))
            candidate-primes))) )))

(dotest
  (spyx (primes 99999))
  )

with result:

-------------------------------
   Clojure 1.10.1    Java 13
-------------------------------

Testing tst.demo.core
(primes 99999) => [2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 
67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 
167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 
269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 
379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 
487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 
...<snip>...
99401 99409 99431 99439 99469 99487 99497 99523 99527 99529 99551 99559 
99563 99571 99577 99581 99607 99611 99623 99643 99661 99667 99679 99689 
99707 99709 99713 99719 99721 99733 99761 99767 99787 99793 99809 99817 
99823 99829 99833 99839 99859 99871 99877 99881 99901 99907 99923 99929 
99961 99971 99989 99991]

I renames your variables a bit to make them clearer. If you look closely, you'll see the only substantive difference is the change from the lazy filter to the eager filterv.

Before this change, it worked for a N of 9999 but failed for 99999. I'm not sure about the implementation of the lazy filter function, but that is clearly the problem.


Strange (& unpredictable) problems like this reinforce my dislike of excessive laziness in Clojure code. It appears you have crashed into a variant of the Clojure Don'ts: Concat problem. In this instance, you code looks like:

(filter ...
  (filter ...
    (filter ...
      (filter ...
         ...<many, many more>... ))))

Lazy sequences are implemented as nested function calls. Since the last loop that finds prime 99991 is dependent on the first call that finds prime 2, the earlier lazy sequences (and their nested function calls on the stack) cannot be released and you eventually blow the stack.

On my computer, a simple recursive implementation of factorial(N) blows up around N=4400. The above found 9592 primes, so it the specific cause seems to be a bit more complex than 1 stack frame per prime.

Possibly N=32 chunking could play a part.


In order to avoid bugs due to unnecessary laziness, you may be interested in replacing concat with glue, and replacing for with forv. You can also see the full API docs.

Alan Thompson
  • 29,276
  • 6
  • 41
  • 48
  • Thank you very much for your detailed explanation, Alan. Together with Stuarts answer and his Clojure dont's post, I think I understand the problem quite well now. – Mark Schäfer Feb 21 '20 at 07:02