139

What are the differences among Streams, Views (SeqView), and Iterators in scala? This is my understanding:

  • They are all lazy lists.
  • Streams cache the values.
  • Iterators can only be used once? You can't go back to the beginning and evaluate the value again?
  • View's values are not cached but you can evaluate them again and again?

So if I want to save heap space, should I use iterators (if I won't traverse the list again) or views? Thanks.

Leif Wickland
  • 3,693
  • 26
  • 43
JWC
  • 1,745
  • 2
  • 12
  • 14
  • 2
    http://stackoverflow.com/questions/4798043/scala-what-is-the-difference-between-the-methods-iterator-and-view -------------- http://stackoverflow.com/questions/3361478/scala-what-are-views-for-collections-and-when-would-you-want-to-use-them – Gene T Mar 05 '11 at 04:40

1 Answers1

184

First, they are all non-strict. That has a particular mathematical meaning related to functions, but, basically, means they are computed on-demand instead of in advance.

Stream is a lazy list indeed. In fact, in Scala, a Stream is a List whose tail is a lazy val. Once computed, a value stays computed and is reused. Or, as you say, the values are cached.

An Iterator can only be used once because it is a traversal pointer into a collection, and not a collection in itself. What makes it special in Scala is the fact that you can apply transformation such as map and filter and simply get a new Iterator which will only apply these transformations when you ask for the next element.

Scala used to provide iterators which could be reset, but that is very hard to support in a general manner, and they didn't make version 2.8.0.

Views are meant to be viewed much like a database view. It is a series of transformation which one applies to a collection to produce a "virtual" collection. As you said, all transformations are re-applied each time you need to fetch elements from it.

Both Iterator and views have excellent memory characteristics. Stream is nice, but, in Scala, its main benefit is writing infinite sequences (particularly sequences recursively defined). One can avoid keeping all of the Stream in memory, though, by making sure you don't keep a reference to its head (for example, by using def instead of val to define the Stream).

Because of the penalties incurred by views, one should usually force it after applying the transformations, or keep it as a view if only few elements are expected to ever be fetched, compared to the total size of the view.

Daniel C. Sobral
  • 295,120
  • 86
  • 501
  • 681
  • 11
    `Iterator` is also pretty handy for probing the infinite, and I generally prefer them over streams where possible. The real benefit in streams is that previously accessed values are cached, which is a serious boon when trying to implement something like the fibonacci sequence - which is defined in terms of previous values. – Kevin Wright Mar 01 '11 at 20:46
  • 6
    Fibonacci is a less than perfect example as it only needs the last 2 previous values, and keeping the whole stream is a waste. The Ackermann function is probably the canonical example. – Jürgen Strobel Sep 20 '12 at 21:56
  • 5
    @JürgenStrobel Ackermann would result in lousy performance, since indexed access of streams is O(n). But I agree wrt fibonacci. – Daniel C. Sobral Sep 22 '12 at 21:48
  • 10
    Oh right. This makes Stream a poor choice for any caching approach. – Jürgen Strobel Sep 24 '12 at 07:29
  • 1
    Could you please explain more about "(for example, by using def instead of val to define the Stream)." this part? – sasha.sochka Apr 19 '14 at 21:09
  • 1
    scala> val x = {println("x"); 1} x x: Int = 1 scala> x res17: Int = 1 scala> x res18: Int = 1 scala> def x = {println("x"); 1} x: Int scala> x x res19: Int = 1 scala> x x res20: Int = 1 So def does all work again, while val computes once – dveim Jun 28 '14 at 14:39
  • 8
    This answer is super clear, it should be part of the documentation... oh, actually it is! Thanks Daniel http://docs.scala-lang.org/tutorials/FAQ/stream-view-iterator.html – Svend Jul 25 '14 at 12:57
  • Computing primes would be a better application for Stream because 1. all previous values (up to the square root) are required to compute next value (efficiently) and 2. access is most efficient starting from the beginning (since lower values are more common divisors to eliminate candidate primes). – combinatorist Jul 19 '19 at 03:02
  • How about memory footprint? e.g. when you do a series of `map` on view/iterator/stream, do they all create temporary collections in the middle? – SwiftMango Sep 06 '19 at 16:54
  • @texasbruce I mention that towards the end. None of them create temporary collections. Streams will keep whatever you have a reference for, so if you do `val x = Stream.from(1).map(...).take(10)`, only the final result will be kept. If, on the other hand, you have `val x = Stream.from(1); val y = x.map(...).take(10)`, then the first and the last will be kept. – Daniel C. Sobral Sep 17 '19 at 17:43