Why does BufferedReader::lines() create a Stream through an Iterator instead of a Spliterator?

Question

I was surprised to see that the BufferedReader lines() method creates a Stream<T> from an implementation of the Iterator<T> interface instead of a Spliterator<T>. A Spliterator<T> has several advantages, even without parallelism. For instance, Brian Goetz states in his answer to the question Iterator versus Stream of Java 8:

Spliterator has fundamentally lower per-element access costs than Iterator, even sequentially.

So why does BufferedReader::lines() create a Stream<T> through an Iterator<T> instead of a Spliterator<T>?

People use what they are familiar with. There is no reason why JRE developers should be different. — Holger, May 27 '16 at 13:19
IMHO a JRE developer that is developing a new feature closely related with new JDK8 API and specifically the `Stream` should be aware of it. — Miguel Gamboa, May 27 '16 at 14:28
That’s what’s going through my mind as well, especially, when it comes to the new Java 9 methods as nowadays, the advantages of the `Spliterator` should become widely known. Still, I try my best not to judge too fast. We don’t know about the workload the particular developer had to handle the day (s)he implemented such a method… — Holger, May 27 '16 at 14:47

score 4 · Accepted Answer · answered May 27 '16 at 13:53

There is no technical reason to implement it using an Iterator. The statement from Brian Goetz still holds. While I doubt that you will notice a performance difference in this specific case, a Spliterator-based implementation would be much simpler: all it needs is a tryAdvance method that invokes readLine(), whereas the Iterator implementation has to maintain state to remember whether hasNext() has already been called and with which result.
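To illustrate how little such an implementation needs, here is a minimal sketch of a tryAdvance-based variant. It is not the JDK's code: the class name LineSpliteratorSketch and the static helper lines(BufferedReader) are hypothetical, and the choice of characteristics (ORDERED | NONNULL) and the wrapping of IOException in UncheckedIOException are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class LineSpliteratorSketch {

    // Hypothetical helper (not the JDK implementation): builds a sequential
    // Stream of lines directly on top of a Spliterator.
    public static Stream<String> lines(BufferedReader reader) {
        Spliterator<String> sp = new Spliterators.AbstractSpliterator<String>(
                Long.MAX_VALUE,                              // size is unknown up front
                Spliterator.ORDERED | Spliterator.NONNULL) { // assumed characteristics
            @Override
            public boolean tryAdvance(Consumer<? super String> action) {
                try {
                    String line = reader.readLine();
                    if (line == null) {
                        return false;        // end of input, no element delivered
                    }
                    action.accept(line);     // deliver exactly one element
                    return true;
                } catch (IOException e) {
                    // Stream operations cannot throw checked exceptions
                    throw new UncheckedIOException(e);
                }
            }
        };
        return StreamSupport.stream(sp, false);
    }

    public static void main(String[] args) {
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc"));
        List<String> result = lines(reader).collect(Collectors.toList());
        System.out.println(result); // prints [a, b, c]
    }
}
```

Note that there is no look-ahead field and no "has hasNext() been called" flag: each call to tryAdvance either delivers one line or reports the end of input.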

So the actual reason is the same reason why lots of developers here do it: Iterators are familiar, so developers quickly implement one, knowing that they can wrap it (I did the same in my earlier answers). In the case of JRE development, there might also be a historical reason, e.g. that it was implemented before Spliterator was introduced and only refactored afterwards.

Note that there are worse offenders, like String.chars(), which could be implemented as a fast, lightweight, array-based spliterator with perfect parallel support. Instead, in Java 8 you get a PrimitiveIterator.OfInt-based implementation, which is more complicated, wastes performance and intrinsically has poor parallel support (the underlying implementation has to buffer the data).
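As a rough illustration of what an array-based spliterator looks like, consider the following sketch. The names CharsSpliteratorSketch, CharArraySpliterator and chars(String) are hypothetical. Because user code cannot reach the String's internal array, the sketch copies the characters with toCharArray(), a cost that a JDK-internal implementation would not need to pay.

```java
import java.util.Spliterator;
import java.util.function.IntConsumer;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

public class CharsSpliteratorSketch {

    // Hypothetical array-based spliterator over the range [index, fence) of a char[].
    static final class CharArraySpliterator implements Spliterator.OfInt {
        private final char[] array;
        private int index;        // next position to read
        private final int fence;  // one past the last position

        CharArraySpliterator(char[] array, int index, int fence) {
            this.array = array;
            this.index = index;
            this.fence = fence;
        }

        @Override
        public Spliterator.OfInt trySplit() {
            int lo = index;
            int mid = (lo + fence) >>> 1;
            if (lo >= mid) {
                return null;      // too small to split further
            }
            index = mid;          // this spliterator keeps the second half
            return new CharArraySpliterator(array, lo, mid);
        }

        @Override
        public boolean tryAdvance(IntConsumer action) {
            if (index < fence) {
                action.accept(array[index++]);
                return true;
            }
            return false;
        }

        @Override
        public long estimateSize() {
            return fence - index; // exact, which enables SIZED and SUBSIZED
        }

        @Override
        public int characteristics() {
            // IMMUTABLE is safe here only because the array is a private copy
            return ORDERED | SIZED | SUBSIZED | IMMUTABLE;
        }
    }

    // Hypothetical replacement for String.chars(); copies the characters first.
    public static IntStream chars(String s) {
        char[] copy = s.toCharArray();
        return StreamSupport.intStream(
                new CharArraySpliterator(copy, 0, copy.length), false);
    }

    public static void main(String[] args) {
        System.out.println(chars("abc").sum());                    // prints 294
        System.out.println(chars("hello world").parallel().count()); // prints 11
    }
}
```

Since the size is known exactly and trySplit() halves the range without buffering, the stream can be split evenly for parallel processing, which is the property the Iterator-based version lacks.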

Thankfully, String.chars() will be fixed in Java 9, which does not imply that every involved developer got the message. I just looked at Matcher.results(), introduced in Java 9, and it also uses the Iterator detour (in contrast to Scanner.findAll, to name a positive counter-example). Of course, this all might change before release.

But unnecessary Iterator detours in Stream-producing methods are unlikely to disappear soon. In some cases, it is not even worth spending the time to rewrite the methods once they are implemented the way they are…
