I am comfortable with streams, but I admit I am puzzled by this behavior:
import collection.immutable.Stream
object StreamForceTest extends App {
println("Computing fibs")
val fibs: Stream[BigInt] = BigInt(0) #:: BigInt(1) #::
fibs.zip(fibs.tail).map((x: (BigInt, BigInt)) => {
println("Adding " + x._1 + " and " + x._2);
x._1 + x._2
})
println("Taking first 5 elements")
val fibs5 = fibs.take(5)
println("Computing length of that prefix")
println("fibs5.length = " + fibs5.length)
}
with output
Computing fibs
Taking first 5 elements
Computing length of that prefix
Adding 0 and 1
Adding 1 and 1
Adding 1 and 2
fibs5.length = 5
Why should take(5)
not force the stream's values to be computed,
while length
does do so? Offhand neither one needs to actually
look at the values, but I would have thought that take
was more
likely to do it than length
. Inspecting the source code on github,
we find these definitions for take
(including an illuminating
comment):
override def take(n: Int): Stream[A] = (
// Note that the n == 1 condition appears redundant but is not.
// It prevents "tail" from being referenced (and its head being evaluated)
// when obtaining the last element of the result. Such are the challenges
// of working with a lazy-but-not-really sequence.
if (n <= 0 || isEmpty) Stream.empty
else if (n == 1) cons(head, Stream.empty)
else cons(head, tail take n-1)
)
and length
:
override def length: Int = {
var len = 0
var left = this
while (!left.isEmpty) {
len += 1
left = left.tail
}
len
}
The definition of head
and tail
is obtained from the specific
subclass (Empty
and Cons
). (Of course Empty
is an object, not a
class, and its definitions of head
and tail
just throw
exceptions.) There are subtleties, but they seem to concern making
sure that the tail
of a Cons
is evaluated lazily; the head
definition is straight out of lecture 0 on Scala constructors.
Note that length
doesn't go near head
, but it's the one that
does the forcing.
All this is part of a general puzzlement about how close Scala streams are to Haskell lists. I thought Haskell treated head and tail symmetrically (I'm not a serious Haskell hacker), and Scala forced head evaluation in more circumstances. I'm trying to figure out exactly what those circumstances are.