While stress-testing some Clojure code at work, I noticed that it runs out of heap space when iterating over large data sets. I eventually traced the issue back to the combination of Clojure's doseq function and its implementation of lazy sequences.
This is the minimal code snippet that crashes Clojure by exhausting available heap space:
(doseq [e (take 1000000000 (iterate inc 1))] (identity e))
The documentation for doseq clearly states that it doesn't retain the head of the lazy sequence, so I would expect the memory complexity of the above code to be close to O(1). Is there something I'm missing? What's the Clojure-idiomatic way of iterating over extremely large lazy sequences, if doseq isn't up to the job?
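For comparison, here is a minimal sketch of the same iteration written with loop/recur, which never builds a sequence and so should run in constant memory. The name visit-range is a hypothetical helper, not anything from my real code, and it assumes each integer is only needed once and in order, which may not hold for an arbitrary lazy sequence.

```clojure
;; Hypothetical helper (not part of the original snippet):
;; calls f on every integer from 1 to n inclusive, using loop/recur
;; so that no lazy sequence is ever created.
(defn visit-range
  [n f]
  (loop [i 1]
    (when (<= i n)
      (f i)              ; side effect only; the result is discarded
      (recur (inc i)))))

(comment
  ;; Same workload as the doseq snippet above, without any lazy seq.
  (visit-range 1000000000 identity))
```

This only sidesteps the problem for simple counting loops; it doesn't explain why the doseq version runs out of memory.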