hasDefiniteSize and knownSize

Question

I am going through the List methods in Scala.

val mylist = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 3, 10)

I am quite confused by hasDefiniteSize and knownSize.

For List, hasDefiniteSize returns true and knownSize returns -1.

What is the exact theory behind these methods?

For what kind of list? Could you provide the example? – Duelist Apr 26 '20 at 15:46 — Duelist, Apr 26 '20 at 15:46
val mylist = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 3, 10) – Avenger Apr 26 '20 at 15:53 — Avenger, Apr 26 '20 at 15:53

stefanobaghino · Answer 1 · 2021-02-19T08:43:06.103

This method is defined by a superclass of List which is common with possibly endless collections (like Streams, LazyLists and Iterators).

For more details, I believe the documentation puts it best.

Here is the one for hasDefiniteSize in version 2.13.1:

Tests whether this collection is known to have a finite size. All strict collections are known to have finite size. For a non-strict collection such as Stream, the predicate returns true if all elements have been computed. It returns false if the stream is not yet evaluated to the end. Non-empty Iterators usually return false even if they were created from a collection with a known finite size.

Note: many collection methods will not work on collections of infinite sizes. The typical failure mode is an infinite loop. These methods always attempt a traversal without checking first that hasDefiniteSize returns true. However, checking hasDefiniteSize can provide an assurance that size is well-defined and non-termination is not a concern.

Note that hasDefiniteSize is deprecated with the following message:

(Since version 2.13.0) Check .knownSize instead of .hasDefiniteSize for more actionable information (see scaladoc for details)

The documentation for knownSize further states:

The number of elements in this collection, if it can be cheaply computed, -1 otherwise. Cheaply usually means: Not requiring a collection traversal.

List is an implementation of a linked list, which is why List(1, 2, 3).hasDefiniteSize returns true (the collection is not boundless) but List(1, 2, 3).knownSize returns -1 (computing the collection size requires traversing the whole list).

score 3 · Answer 2 · answered Apr 26 '20 at 16:34

Some collections know their size

Vector(1,2,3).knownSize   // 3

and some do not

List(1,2,3).knownSize     // -1

If a collection knows its size then some operations can be optimised, for example, consider how Iterable#sizeCompare uses knownSize to possibly return early

  def sizeCompare(that: Iterable[_]): Int = {
    val thatKnownSize = that.knownSize

    if (thatKnownSize >= 0) this sizeCompare thatKnownSize
    else {
      val thisKnownSize = this.knownSize

      if (thisKnownSize >= 0) {
        val res = that sizeCompare thisKnownSize
        // can't just invert the result, because `-Int.MinValue == Int.MinValue`
        if (res == Int.MinValue) 1 else -res
      } else {
        val thisIt = this.iterator
        val thatIt = that.iterator
        while (thisIt.hasNext && thatIt.hasNext) {
          thisIt.next()
          thatIt.next()
        }
        java.lang.Boolean.compare(thisIt.hasNext, thatIt.hasNext)
      }
    }
  }

See related question Difference between size and sizeIs

hasDefiniteSize and knownSize

2 Answers2