Spliterator vs Stream.Builder

Question

I read some questions about how to create a finite Stream (Finite generated Stream in Java - how to create one?, How do streams stop?).

The answers suggested implementing a Spliterator. The Spliterator would implement the logic of how to provide the next element, and which one (tryAdvance). But there are other abstract methods, such as trySplit and estimateSize(), which I would have to implement as well.

The JavaDoc of Spliterator says:

An object for traversing and partitioning elements of a source. The source of elements covered by a Spliterator could be, for example, an array, a Collection, an IO channel, or a generator function. ... The Spliterator API was designed to support efficient parallel traversal in addition to sequential traversal, by supporting decomposition as well as single-element iteration. ...

On the other hand I could implement the logic how to advance to the next element around a Stream.Builder and bypass a Spliterator. On every advance I would call accept or add and at the end build. So it looks quite simple.

What does the JavaDoc say?

A mutable builder for a Stream. This allows the creation of a Stream by generating elements individually and adding them to the Builder (without the copying overhead that comes from using an ArrayList as a temporary buffer.)

Using StreamSupport.stream I can use a Spliterator to obtain a Stream. And also a Builder will provide a Stream.

When should / could I use a Stream.Builder?
Only if a Spliterator wouldn't be more efficient (for instance because the source cannot be partitioned and its size cannot be estimated)?

I just want to add that while I've written a few Spliterators on my own, it's never a fun story (it might be for Holger, who is an ace at this); but if you really need advanced features, Spliterator is the way. – Eugene, Sep 14 '18 at 18:51

score 7 · Accepted Answer · answered Sep 14 '18 at 17:23


Note that you can extend Spliterators.AbstractSpliterator. Then, there is only tryAdvance to implement.

So the complexity of implementing a Spliterator is not higher.
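As a rough illustration of that point, here is a minimal sketch of a finite stream backed by a `Spliterators.AbstractSpliterator` subclass. The class names `CountdownDemo` and `CountdownSpliterator` and the countdown logic are made up for this example; only `tryAdvance` is implemented, and the size estimate is left as unknown (`Long.MAX_VALUE`).

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class CountdownDemo {

    // Hypothetical example source: yields start, start - 1, ..., 1 and then stops.
    static final class CountdownSpliterator extends Spliterators.AbstractSpliterator<Integer> {
        private int next;

        CountdownSpliterator(int start) {
            // Long.MAX_VALUE signals an unknown size; the elements are ordered and non-null.
            super(Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL);
            this.next = start;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Integer> action) {
            if (next <= 0) {
                return false; // no more elements, so the stream ends here
            }
            action.accept(next--);
            return true;
        }
    }

    // Wraps the spliterator in a sequential stream.
    static Stream<Integer> countdown(int start) {
        return StreamSupport.stream(new CountdownSpliterator(start), false);
    }

    public static void main(String[] args) {
        System.out.println(countdown(3).collect(Collectors.toList())); // prints [3, 2, 1]
    }
}
```

The inherited `trySplit` of `AbstractSpliterator` is used as is here, so nothing beyond `tryAdvance` has to be written.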

The fundamental difference is that a Spliterator’s tryAdvance method is only invoked when a new element is needed. In contrast, the Stream.Builder has a storage which will be filled with all stream elements, before you can acquire a Stream.

So a Spliterator is the first choice for all kinds of lazy evaluations, as well as when you have an existing storage you want to traverse, to avoid copying the data.
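To make the difference in timing concrete, the following sketch counts how much work each approach does when the stream only needs three elements. The class `LazinessDemo`, the constant `TOTAL` and both helper methods are assumptions invented for this illustration, not part of any library.

```java
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class LazinessDemo {
    static final int TOTAL = 1000; // hypothetical number of available elements

    // Spliterator-backed: 'calls' records how often tryAdvance is invoked.
    static List<Integer> firstThreeLazily(AtomicInteger calls) {
        Spliterator<Integer> source =
                new Spliterators.AbstractSpliterator<Integer>(TOTAL, Spliterator.ORDERED) {
                    private int next = 0;

                    @Override
                    public boolean tryAdvance(Consumer<? super Integer> action) {
                        calls.incrementAndGet();
                        if (next >= TOTAL) {
                            return false;
                        }
                        action.accept(next++);
                        return true;
                    }
                };
        return StreamSupport.stream(source, false).limit(3).collect(Collectors.toList());
    }

    // Builder-backed: every element has to be added before build() can be called.
    static List<Integer> firstThreeEagerly(AtomicInteger adds) {
        Stream.Builder<Integer> builder = Stream.builder();
        for (int i = 0; i < TOTAL; i++) {
            builder.add(i);
            adds.incrementAndGet();
        }
        return builder.build().limit(3).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        AtomicInteger adds = new AtomicInteger();
        System.out.println(firstThreeLazily(calls) + " after " + calls + " tryAdvance calls");
        System.out.println(firstThreeEagerly(adds) + " after " + adds + " add calls");
    }
}
```

Both variants yield the same three elements, but the builder variant has produced all `TOTAL` elements up front, whereas the spliterator variant is only asked for as many as the short-circuiting `limit` requires.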

The builder is the first choice when the creation of the elements is non-uniform, so you can’t express the creation of an element on demand. Think of situations where you would otherwise use Stream.of(…), but it turns out to be too inflexible.

E.g. you have Stream.of(a, b, c, d, e), but now it turns out, c and d are optional. So the solution is

Stream.Builder<MyType> builder = Stream.builder();
builder.add(a).add(b);
if(someCondition) builder.add(c).add(d);
builder.add(e).build()
   /* stream operations */
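A self-contained variant of the snippet above might look like the following sketch. The class `OptionalElementsDemo`, the method `elements`, the flag `includeOptional` and the string values stand in for `MyType`, `someCondition` and `a` to `e`; they are placeholders chosen for this example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class OptionalElementsDemo {

    // Builds the stream a, b, [c, d,] e; 'includeOptional' plays the role of someCondition.
    static List<String> elements(boolean includeOptional) {
        Stream.Builder<String> builder = Stream.builder();
        builder.add("a").add("b");
        if (includeOptional) {
            builder.add("c").add("d");
        }
        // add returns the builder itself, so build() can be chained directly.
        return builder.add("e").build().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(elements(true));  // prints [a, b, c, d, e]
        System.out.println(elements(false)); // prints [a, b, e]
    }
}
```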

Other use cases are this answer, where a Consumer was needed to query an existing spliterator and push the value back to a Stream afterwards, or this answer, where a structure without random access (a class hierarchy) should be streamed in the opposite order.


Holger



I think lazy evaluation is a very distinctive and important point. The example for `Stream.Builder` where `c` and `d` are optional is (if I got you right) similar to this [question](https://stackoverflow.com/questions/52324648/remove-duplicates-from-a-large-unsorted-array-and-maintain-the-order). I was about to use a `Stream.Builder` to provide a solution but wasn't sure if it's appropriate. – LuCio Sep 14 '18 at 17:37
I'm dumb and I'm very new to Java, and I'm seeing the Stream.builder() and build() functions. Please help me understand! – nitinsridar Sep 01 '19 at 16:14
@nitinsridar what exactly is your question? – Holger Sep 02 '19 at 07:43
What is a builder? I'm seeing it in Elasticsearch and Apache Kafka, but I'm not getting what a builder is actually doing. @Holger – nitinsridar Sep 02 '19 at 08:32
@nitinsridar method [`Stream.builder()`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/Stream.html#builder()), which returns an instance of [class `Stream.Builder`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/Stream.Builder.html). – Holger Sep 02 '19 at 08:35

score 3 · Answer 2 · answered Sep 14 '18 at 17:00


On the other hand I could implement the logic how to advance to the next element around a Stream.Builder and bypass a Spliterator. On every advance I would call accept or add and at the end build. So it looks quite simple.

Yes and no. It is simple, but I don't think you understand the usage model:

A stream builder has a lifecycle, which starts in a building phase, during which elements can be added, and then transitions to a built phase, after which elements may not be added. The built phase begins when the build() method is called, which creates an ordered Stream whose elements are the elements that were added to the stream builder, in the order they were added.

(Javadocs)

In particular no, you would not invoke a Stream.Builder's accept or add method on any stream advance. You need to provide all the objects for the stream in advance. Then you build() to get a stream that will provide all the objects you previously added. This is analogous to adding all the objects to a List, and then invoking that List's stream() method.

If that serves your purposes and you can in fact do it efficiently then great! But if you need to generate elements on an as-needed basis, whether with or without limit, then Stream.Builder cannot help you. Spliterator can.
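The two-phase lifecycle described above can be observed directly. In this sketch (the class `BuilderLifecycleDemo` and its method are invented for the illustration), adding an element after `build()` is rejected with an `IllegalStateException`, which is the behavior the `Stream.Builder` Javadoc documents for a builder that has already transitioned to the built state.

```java
import java.util.stream.Stream;

final class BuilderLifecycleDemo {

    // Returns true if adding to an already built builder is rejected.
    static boolean addAfterBuildFails() {
        Stream.Builder<String> builder = Stream.builder();
        builder.add("x");                        // building phase: adding is allowed
        Stream<String> stream = builder.build(); // transition to the built phase
        try {
            builder.add("y");                    // built phase: no further elements
            return false;
        } catch (IllegalStateException expected) {
            return true;
        } finally {
            stream.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("add after build rejected: " + addAfterBuildFails());
    }
}
```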


John Bollinger


_you would not invoke a Stream.Builder's accept or add method on any stream advance_ - I think you got me wrong. I said I could build the logic around a `Stream.Builder.accept`, not a `Stream` advance. I meant that I could thus add every element to a `Stream.Builder`, similar to how a `Spliterator.tryAdvance` provides every element. – LuCio Sep 14 '18 at 17:23

@LuCio `tryAdvance` *potentially* provides every element, but might not get invoked for every element, as the stream might not need them. But you have to invoke `Stream.Builder.accept` or `add` for every element before you can create the stream, without knowing whether all of them are needed. Calling a method for every element and implementing a method that potentially provides every element are fundamentally different code structures. Usually, there’s a natural choice. – Holger Sep 14 '18 at 17:27
@Holger Yes. This is an important difference which I just read from your answer. – LuCio Sep 14 '18 at 17:41
@LuCio, what you wrote was "On every advance I would call `accept` or `add` and at the end `build`." And yes, of course you can build a stream using `Stream.Builder.accept` and ultimately `Stream.Builder.build`. The question is timing. As I already said, if you can feed all the elements to a `Stream.Builder` up front, before you start processing the stream, then that's fine. But you cannot invoke `accept` or `add` in concert with advancing through the stream. If that's not what you meant to say anyway, then well and good. – John Bollinger Sep 14 '18 at 18:53
@JohnBollinger Thx for clarification. – LuCio Sep 14 '18 at 22:09

score 0 · Answer 3 · answered Oct 04 '21 at 15:35

The name Stream.Builder is a misnomer, as streams can't really be built. Things that can be built are value objects: DTOs, arrays, collections. So it might help to think of Stream.Builder as a buffer instead, e.g.:

buffer.add(a)
buffer.add(b)
buffer.stream()

This shows how similar it is to an ArrayList:

list.add(a)
list.add(b)
list.stream()

On the other hand, Spliterator is the basis of a stream and allows for efficient navigation over data sets (improved version of the Iterator).

So the answer is they should not be compared. Comparing Stream.Builder to Spliterator is the same as comparing ArrayList to Spliterator.
