5

I am trying to understand the features of Spliterator and came across these 2 methods estimatedSize and getExactSizeIfKnown I could figure out what is estimatedSize but not sure exactly what doesgetExactSizeIfKnowndo. Can someone please give an example explaining the difference between the two.

EDIT: I tried the following example in which both of them are the same. In which cases would they be different?

public static void main(String[] args) {
        List<Integer> l = new ArrayList<>();
        l.add(1);
        l.add(2);
        l.add(3);
        Spliterator<Integer> s= (Spliterator<Integer>) l.spliterator();
    Spliterator<Integer> s1=s.trySplit();
    while(s.tryAdvance(n -> {System.out.print(n+" ");System.out.println("estimateSize "+s.estimateSize()+" getexactsizeifknown "+s.getExactSizeIfKnown());})); 
ghostrider
  • 2,046
  • 3
  • 23
  • 46

1 Answers1

5

The estimateSize method:

Returns an estimate of the number of elements that would be encountered by a forEachRemaining(java.util.function.Consumer<? super T>) traversal, or returns Long.MAX_VALUE if infinite, unknown, or too expensive to compute.

If this Spliterator is SIZED and has not yet been partially traversed or split, or this Spliterator is SUBSIZED and has not yet been partially traversed, this estimate must be an accurate count of elements that would be encountered by a complete traversal. Otherwise, this estimate may be arbitrarily inaccurate, but must decrease as specified across invocations of trySplit().

API Note:

Even an inexact estimate is often useful and inexpensive to compute. For example, a sub-spliterator of an approximately balanced binary tree may return a value that estimates the number of elements to be half of that of its parent; if the root Spliterator does not maintain an accurate count, it could estimate size to be the power of two corresponding to its maximum depth.

And the getExactSizeIfKnown method is a:

Convenience method that returns estimateSize() if this Spliterator is SIZED, else -1.

Implementation Requirements:

The default implementation returns the result of estimateSize() if the Spliterator reports a characteristic of SIZED, and -1 otherwise.

Both of those methods reference SIZED, which is a:

Characteristic value signifying that the value returned from estimateSize() prior to traversal or splitting represents a finite size that, in the absence of structural source modification, represents an exact count of the number of elements that would be encountered by a complete traversal.

API Note:

Most Spliterators for Collections, that cover all elements of a Collection report this characteristic. Sub-spliterators, such as those for HashSet, that cover a sub-set of elements and approximate their reported size do not.

Based on all of this, the two methods will only ever return different values if the Spliterator does not have the SIZED characteristic.


In your example, the source of the Spliterator is an ArrayList. If we take a look at the documentation of ArrayList.spliterator():

Creates a late-binding and fail-fast Spliterator over the elements in this list.

The Spliterator reports Spliterator.SIZED, Spliterator.SUBSIZED, and Spliterator.ORDERED. Overriding implementations should document the reporting of additional characteristic values.

Due to the SUBSIZED characteristic, a Spliterator created from an ArrayList—including those resulting from trySplit—will never have estimateSize and getExactSizeIfKnown return different values.

Slaw
  • 37,820
  • 8
  • 53
  • 80
  • Id say a bit different, `ArrayList` is easy :) there are more complicated and interesting examples. – Eugene Mar 30 '19 at 08:41
  • 1
    There may be more interesting examples, but the OP used `ArrayList` :) And even in the more complicated examples, the contract of `getExactSizeIfKnown` is extremely constrained. Either it's `SIZED`, and thus returns `estimateSize`, or it's not, and thus returns `-1`. That means no matter what, there's only one case when they'll return different values. – Slaw Mar 30 '19 at 08:50
  • @shaw thanks for the answer. Can you give one example where it returns -1? – ghostrider Mar 30 '19 at 09:55
  • 2
    @ghostrider The spliterators returned by `HashSet` _after_ a call to `trySplit`. In other words, the sub-spliterators of `HashSet` return `-1`. At least in Java 12. – Slaw Mar 30 '19 at 10:18
  • 2
    @ghostrider staying at your question’s example, `Spliterator s = l.stream().filter(x -> true).spliterator();` does not know its exact size for obvious reasons, but provides an estimate. Slaw’s example also applies from Java 8 to Java 12; as elaborated in [this answer](https://stackoverflow.com/a/44802784/2711488) (in the second half), when splitting the hash map’s internal array using ranges, it depends on the actual distribution of the elements (according to their hash codes), how many elements are in each part, so the size becomes an estimate. – Holger Apr 01 '19 at 16:53