148

How can I check if a Stream is empty and throw an exception if it is, as a non-terminal operation?

Basically, I'm looking for something equivalent to the code below, but without materializing the stream in-between. In particular, the check should not occur before the stream is actually consumed by a terminal operation.

public Stream<Thing> getFilteredThings() {
    Stream<Thing> stream = getThings().stream()
                .filter(Thing::isFoo)
                .filter(Thing::isBar);
    return nonEmptyStream(stream, () -> {
        throw new RuntimeException("No foo bar things available");
    });
}

private static <T> Stream<T> nonEmptyStream(Stream<T> stream, Supplier<T> defaultValue) {
    List<T> list = stream.collect(Collectors.toList());
    if (list.isEmpty()) list.add(defaultValue.get());
    return list.stream();
}
Stuart Marks
Cephalopod
  • 31
    You can't have your cake and eat it too--and quite literally so, in this context. You have to *consume* the stream to find out if it's empty. That's the point of Stream's semantics (laziness). – Marko Topolnik Oct 30 '14 at 09:21
  • It will be consumed eventually, at this point the check should occur – Cephalopod Oct 30 '14 at 09:22
  • 14
    To check that the stream is not empty you have to attempt to consume at least one element. At that point the stream has lost its "virginity" and cannot be consumed again from the start. – Marko Topolnik Oct 30 '14 at 09:27
  • @MarkoTopolnik just because it's lazy doesn't mean it couldn't in principle buffer and reemit the element that was peeked on. See vavr stream – Coderino Javarino Jan 24 '22 at 10:49

8 Answers

105

This may be sufficient in many cases:

stream.findAny().isPresent()
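
As a minimal sketch of how that check behaves (the class name `FindAnyCheck` and the helper `hasAnyElement` are made up for illustration), note that `findAny()` is a short-circuiting terminal operation, so the stream passed in cannot be used again afterwards:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class FindAnyCheck {
    // Returns true if the stream yields at least one element.
    // findAny() is a terminal operation: the stream is consumed by this call.
    static boolean hasAnyElement(Stream<?> stream) {
        return stream.findAny().isPresent();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "world");
        // prints true
        System.out.println(hasAnyElement(words.stream()));
        // prints false, because no word starts with "x"
        System.out.println(hasAnyElement(words.stream().filter(w -> w.startsWith("x"))));
    }
}
```

To process the elements after the check, a fresh stream has to be created from the source.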
kenglxn
  • 11
    Simple and crisp solution. This code will consume the stream, so we will have to make another stream if we want to iterate when it is not empty. – Harish Mar 05 '21 at 20:28
  • 3
    If you happen to have a last `filter()` operation before needing to `count()`, you can replace the sequence `...filter(expr).findAny().isPresent();` with `...anyMatch(expr)` – juanmf May 04 '22 at 01:14
  • Nice idea, @juanmf. In the simple case without filter one may even do `stream.anyMatch(x -> true)` if one doesn’t find it too cryptic. – Ole V.V. Nov 11 '22 at 10:07
43

The other answers and comments are correct in that to examine the contents of a stream, one must add a terminal operation, thereby "consuming" the stream. However, one can do this and turn the result back into a stream, without buffering up the entire contents of the stream. Here are a couple of examples:

static <T> Stream<T> throwIfEmpty(Stream<T> stream) {
    Iterator<T> iterator = stream.iterator();
    if (iterator.hasNext()) {
        return StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, 0), false);
    } else {
        throw new NoSuchElementException("empty stream");
    }
}

static <T> Stream<T> defaultIfEmpty(Stream<T> stream, Supplier<T> supplier) {
    Iterator<T> iterator = stream.iterator();
    if (iterator.hasNext()) {
        return StreamSupport.stream(Spliterators.spliteratorUnknownSize(iterator, 0), false);
    } else {
        return Stream.of(supplier.get());
    }
}

Basically turn the stream into an Iterator in order to call hasNext() on it, and if true, turn the Iterator back into a Stream. This is inefficient in that all subsequent operations on the stream will go through the Iterator's hasNext() and next() methods, which also implies that the stream is effectively processed sequentially (even if it's later turned parallel). However, this does allow you to test the stream without buffering up all of its elements.

There is probably a way to do this using a Spliterator instead of an Iterator. This potentially allows the returned stream to have the same characteristics as the input stream, including running in parallel.
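
One thing worth keeping in mind with the Iterator approach is when the check happens. The sketch below (the class name `ThrowIfEmptyDemo` is hypothetical; the method body is the same as `throwIfEmpty` above) suggests that `hasNext()` pulls from the source right away, so the exception is thrown when `throwIfEmpty` is called rather than when the returned stream is consumed:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class ThrowIfEmptyDemo {
    // Same technique as throwIfEmpty above: peek via the Iterator,
    // then wrap the Iterator back into a sequential Stream.
    static <T> Stream<T> throwIfEmpty(Stream<T> stream) {
        Iterator<T> iterator = stream.iterator();
        if (iterator.hasNext()) {
            return StreamSupport.stream(
                    Spliterators.spliteratorUnknownSize(iterator, 0), false);
        }
        throw new NoSuchElementException("empty stream");
    }

    public static void main(String[] args) {
        // Non-empty input: all elements, including the peeked one, pass through.
        throwIfEmpty(Stream.of("a", "b")).forEach(System.out::println);

        try {
            // The filter removes everything, so hasNext() is false immediately.
            throwIfEmpty(Stream.of("a").filter(String::isEmpty));
        } catch (NoSuchElementException e) {
            System.out.println("thrown before the returned stream was consumed");
        }
    }
}
```

If the check must be deferred until a terminal operation runs, as the question asks, a Spliterator-based wrapper is probably the better fit.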

Stuart Marks
  • 1
    I don’t think that there is a maintainable solution that would support efficient parallel processing as it’s hard to support splitting, however having `estimatedSize` and `characteristics` might even improve single-threaded performance. It just happened that I wrote the `Spliterator` solution while you were posting the `Iterator` solution… – Holger Oct 30 '14 at 17:42
  • 3
    You can ask the stream for a Spliterator, call tryAdvance(lambda) where your lambda captures anything passed to it, and then return a Spliterator that delegates almost everything to the underlying Spliterator, except that it glues the first element back onto the first chunk (and fixes up the result of estimateSize). – Brian Goetz Oct 30 '14 at 20:55
  • 1
    @BrianGoetz Yes, that was my thought, I just haven't yet bothered to go through the leg work of handling all those details. – Stuart Marks Oct 30 '14 at 21:19
  • 3
    @Brian Goetz: That’s what I meant with “too complicated”. Calling `tryAdvance` before the `Stream` does it turns the lazy nature of the `Stream` into a “partially lazy” stream. It also implies that searching for the first element is not a parallel operation anymore as you have to split first and do `tryAdvance` on the split parts concurrently to do a real parallel operation, as far as I understood. If the sole terminal operation is `findAny` or similar that would destroy the entire `parallel()` request. – Holger Oct 31 '14 at 10:41
  • 2
    So for full parallel support you must not call `tryAdvance` before the stream does and have to wrap every split part into a proxy and gather the “hasAny” information of all concurrent operations on your own and ensure that the last concurrent operation throws the desired exception if the stream was empty. Lots of stuff… – Holger Oct 31 '14 at 10:45
  • Nice. In my case it was fine to process the elements of the `Iterator` (in a `do`-`while`) without converting back to `Stream` in the non-empty case. – Ole V.V. Nov 11 '22 at 10:13
31

If you can live with limited parallel capabilities, the following solution will work:

private static <T> Stream<T> nonEmptyStream(
    Stream<T> stream, Supplier<RuntimeException> e) {

    Spliterator<T> it=stream.spliterator();
    return StreamSupport.stream(new Spliterator<T>() {
        boolean seen;
        public boolean tryAdvance(Consumer<? super T> action) {
            boolean r=it.tryAdvance(action);
            if(!seen && !r) throw e.get();
            seen=true;
            return r;
        }
        public Spliterator<T> trySplit() { return null; }
        public long estimateSize() { return it.estimateSize(); }
        public int characteristics() { return it.characteristics(); }
    }, false);
}

Here is some example code using it:

List<String> l=Arrays.asList("hello", "world");
nonEmptyStream(l.stream(), ()->new RuntimeException("No strings available"))
  .forEach(System.out::println);
nonEmptyStream(l.stream().filter(s->s.startsWith("x")),
               ()->new RuntimeException("No strings available"))
  .forEach(System.out::println);

The problem with (efficient) parallel execution is that supporting splitting of the Spliterator requires a thread-safe way to track whether any of the fragments has seen a value. Then the last of the fragments executing tryAdvance has to realize that it is the last one (and that it also couldn’t advance) and throw the appropriate exception. So I didn’t add support for splitting here.
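
A small sketch to illustrate the timing of the check (the class name `DeferredCheckDemo` is hypothetical; `nonEmptyStream` is the method shown above, made package-private here): wrapping an empty stream should not throw by itself, and the exception should only surface once a terminal operation actually traverses the returned stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class DeferredCheckDemo {
    static <T> Stream<T> nonEmptyStream(Stream<T> stream, Supplier<RuntimeException> e) {
        Spliterator<T> it = stream.spliterator();
        return StreamSupport.stream(new Spliterator<T>() {
            boolean seen;
            public boolean tryAdvance(Consumer<? super T> action) {
                boolean r = it.tryAdvance(action);
                // Throw only if the very first advance finds nothing.
                if (!seen && !r) throw e.get();
                seen = true;
                return r;
            }
            public Spliterator<T> trySplit() { return null; }
            public long estimateSize() { return it.estimateSize(); }
            public int characteristics() { return it.characteristics(); }
        }, false);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "world");
        // Building the guarded stream does not pull any elements yet.
        Stream<String> guarded = nonEmptyStream(
                words.stream().filter(s -> s.startsWith("x")),
                () -> new RuntimeException("No strings available"));
        System.out.println("stream created, nothing thrown so far");
        try {
            guarded.forEach(System.out::println); // the check runs here
        } catch (RuntimeException ex) {
            System.out.println("thrown during forEach: " + ex.getMessage());
        }
    }
}
```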

Holger
19

You must perform a terminal operation on the Stream in order for any of the filters to be applied. Therefore you can't know if it will be empty until you consume it.

The best you can do is terminate the Stream with a findAny() terminal operation, which stops as soon as it finds an element; if there are none, it has to iterate over the entire input list to find that out.

This would only help you if the input list has many elements, and one of the first few passes the filters, since only a small subset of the list would have to be consumed before you know the Stream is not empty.

Of course you'll still have to create a new Stream in order to produce the output list.
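
A minimal sketch of that check-then-recreate idea, assuming the caller can supply a factory that builds a fresh, equivalent stream on every call (the class `CheckThenRecreate`, the helper `requireNonEmpty`, and its parameters are hypothetical names, not part of any library):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class CheckThenRecreate {
    // streamFactory is assumed to return a new stream over the same data each time.
    static <T> Stream<T> requireNonEmpty(Supplier<Stream<T>> streamFactory, String message) {
        // First stream: used only for the short-circuiting emptiness check.
        if (!streamFactory.get().findAny().isPresent()) {
            throw new RuntimeException(message);
        }
        // Second stream: handed to the caller for the real processing.
        return streamFactory.get();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("hello", "world");
        requireNonEmpty(() -> words.stream().filter(w -> w.startsWith("h")),
                "No matching words")
            .forEach(System.out::println);
    }
}
```

The trade-off is that the check runs eagerly, and the filters are evaluated twice for the elements leading up to the first match.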

Eran
9

I think it should be enough to map to a boolean.

In code this is:

boolean isNotEmpty = anyCollection.stream() // true if at least one element passes the filter
    .filter(p -> someFilter(p)) // Add my filter
    .map(p -> Boolean.TRUE) // For each element after filter, map to a TRUE
    .findAny() // Get any TRUE
    .orElse(Boolean.FALSE); // If there is no match return false
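
As juanmf's comment on an earlier answer points out, a filter followed by an existence check can likely be collapsed into a single `anyMatch` call. A short sketch, with `AnyMatchCheck` and `hasMatch` as made-up names for illustration:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.function.Predicate;

class AnyMatchCheck {
    // True if at least one element of the collection satisfies the predicate.
    // anyMatch short-circuits on the first match, like filter + findAny.
    static <T> boolean hasMatch(Collection<T> items, Predicate<? super T> filter) {
        return items.stream().anyMatch(filter);
    }

    public static void main(String[] args) {
        System.out.println(hasMatch(Arrays.asList(1, 2, 3), n -> n > 2)); // prints true
        System.out.println(hasMatch(Arrays.asList(1, 2, 3), n -> n > 5)); // prints false
    }
}
```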
Luis Roberto
5

Following Stuart's idea, this could be done with a Spliterator like this:

static <T> Stream<T> defaultIfEmpty(Stream<T> stream, Stream<T> defaultStream) {
    final Spliterator<T> spliterator = stream.spliterator();
    final AtomicReference<T> reference = new AtomicReference<>();
    if (spliterator.tryAdvance(reference::set)) {
        return Stream.concat(Stream.of(reference.get()), StreamSupport.stream(spliterator, stream.isParallel()));
    } else {
        return defaultStream;
    }
}

I think this works with parallel Streams, as the stream.spliterator() operation will terminate the stream and then rebuild it as required.

In my use case I needed a default Stream rather than a default value. That's quite easy to change if this is not what you need.
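
For the default-value case, the change might look like the sketch below. The class name `DefaultValueIfEmpty` and the `Supplier<T>` parameter are assumptions for illustration; the supplier is only invoked when the stream turns out to be empty. As with the version above, `tryAdvance` runs when the method is called, so the first element is pulled eagerly.

```java
import java.util.Spliterator;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class DefaultValueIfEmpty {
    static <T> Stream<T> defaultIfEmpty(Stream<T> stream, Supplier<T> defaultValue) {
        Spliterator<T> spliterator = stream.spliterator();
        AtomicReference<T> first = new AtomicReference<>();
        if (spliterator.tryAdvance(first::set)) {
            // Glue the element we peeked at back in front of the remaining ones.
            return Stream.concat(
                    Stream.of(first.get()),
                    StreamSupport.stream(spliterator, stream.isParallel()));
        }
        // Empty input: fall back to a single default element.
        return Stream.of(defaultValue.get());
    }

    public static void main(String[] args) {
        defaultIfEmpty(Stream.of(1, 2, 3), () -> 0).forEach(System.out::println);
        defaultIfEmpty(Stream.<Integer>empty(), () -> 0).forEach(System.out::println);
    }
}
```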

phoenix7360
  • I can't figure out whether this would significantly impact performance with parallel streams. Should probably test it if this is a requirement – phoenix7360 Jul 17 '17 at 10:11
  • Sorry didn't realise that @Holger also had a solution with `Spliterator` I wonder how the two compare. – phoenix7360 Jul 17 '17 at 10:14
0

I would simply use:

stream.count()>0
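
To get a feel for the cost of this, here is a rough sketch that counts how many source elements each approach visits on a sequential stream (the class `CountVersusFindAny` and its methods are hypothetical; the pass-through `filter` is an assumption added so that `count()` cannot rely on a known size and has to traverse):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class CountVersusFindAny {
    // How many elements are visited when checking with count() > 0.
    static int visitedByCount(List<Integer> source) {
        AtomicInteger visited = new AtomicInteger();
        boolean nonEmpty = source.stream()
                .peek(x -> visited.incrementAndGet())
                .filter(x -> true) // size is no longer known after a filter
                .count() > 0;
        return visited.get();
    }

    // How many elements are visited when checking with findAny().isPresent().
    static int visitedByFindAny(List<Integer> source) {
        AtomicInteger visited = new AtomicInteger();
        boolean nonEmpty = source.stream()
                .peek(x -> visited.incrementAndGet())
                .filter(x -> true)
                .findAny()
                .isPresent();
        return visited.get();
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println("count() visited " + visitedByCount(source));
        System.out.println("findAny() visited " + visitedByFindAny(source));
    }
}
```

With five elements, `count()` walks all of them while `findAny()` stops after the first.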
daniel sp
  • 2
    This only works, if you do not have to process the stream. When you want to process the elements, it does not work, because after count() the stream is consumed. count is a terminal operation. – Michael Jan 04 '21 at 16:00
  • 6
    This also unnecessarily consumes and counts the __whole__ stream, where a "lazy" (short-circuiting) `.findAny()` would only process a single item and finish. – Alex Shesterov Sep 09 '21 at 04:25
  • How is this better than `Optional.isPresent()`? This is much worse than the other solutions! – Diablo Jun 17 '22 at 07:36
0

The best simple solution I could find that does not convert to iterators is to process the elements inside forEach (which does consume the stream) and record whether anything was seen:

public void processFilteredThings() {
    AtomicBoolean found = new AtomicBoolean(false);
    getThings().stream()
        .filter(Thing::isFoo)
        .filter(Thing::isBar)
        .forEach(x -> {
             found.set(true);
             // do useful things
         });
    // forEach returns void, so the check has to happen after the stream is processed
    if (!found.get()) {
        throw new RuntimeException("No foo bar things available");
    }
}

Feel free to suggest improvements.

AmanicA