Stream doesn't have a last() method:
Stream<T> stream;
T last = stream.last(); // No such method
What's the most elegant and/or efficient way to get the last element (or null for an empty Stream)?
Do a reduction that simply returns the current value:
Stream<T> stream;
T last = stream.reduce((a, b) -> b).orElse(null);
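A quick usage sketch (the list and values here are purely illustrative):
List<String> values = Arrays.asList("a", "b", "c");
String last = values.stream().reduce((first, second) -> second).orElse(null); // "c"
String none = Stream.<String>empty().reduce((a, b) -> b).orElse(null);        // null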
This heavily depends on the nature of the Stream. Keep in mind that “simple” doesn’t necessarily mean “efficient”. If you suspect the stream to be very large, to apply heavy operations to its elements, or to have a source that knows the size in advance, the following may be substantially more efficient than the simple solution:
static <T> T getLast(Stream<T> stream) {
    Spliterator<T> sp = stream.spliterator();
    if(sp.hasCharacteristics(Spliterator.SIZED|Spliterator.SUBSIZED)) {
        for(;;) {
            Spliterator<T> part = sp.trySplit();
            if(part == null) break;
            if(sp.getExactSizeIfKnown() == 0) {
                sp = part;
                break;
            }
        }
    }
    T value = null;
    for(Iterator<T> it = recursive(sp); it.hasNext(); )
        value = it.next();
    return value;
}

private static <T> Iterator<T> recursive(Spliterator<T> sp) {
    Spliterator<T> prev = sp.trySplit();
    if(prev == null) return Spliterators.iterator(sp);
    Iterator<T> it = recursive(sp);
    if(it != null && it.hasNext()) return it;
    return recursive(prev);
}
You may illustrate the difference with the following example:
String s = getLast(
    IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
);
System.out.println(s);
It will print:
potential heavy operation on 9999999
9999999
In other words, it did not perform the operation on the first 9999999 elements but only on the last one.
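For contrast, here is a sketch of what the simple reduction would do on the same pipeline (illustrative only): because the reduction has to consume every element, the mapping function runs for all ten million values.
// Sketch: the plain reduction forces the mapping function to run for every element,
// so "potential heavy operation on ..." would be printed ten million times.
String s = IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
    .reduce((a, b) -> b)
    .orElse(null);
System.out.println(s); // 9999999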
Guava has Streams.findLast:
Stream<T> stream;
T last = Streams.findLast(stream).orElse(null); // findLast returns an Optional<T>
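A small usage sketch, assuming Guava's com.google.common.collect.Streams is on the classpath:
import com.google.common.collect.Streams;

Optional<String> last = Streams.findLast(Stream.of("a", "b", "c")); // Optional[c]
Optional<String> none = Streams.findLast(Stream.<String>empty());   // Optional.empty
// Caveat: findLast throws a NullPointerException if the last element is null.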
This is just a refactoring of Holger's answer because the code, while fantastic, is a bit hard to read/understand, especially for people who were not C programmers before Java. Hopefully my refactored example class is a little easier to follow for those who are not familiar with spliterators, what they do, or how they work.
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Predicate;
import java.util.stream.LongStream;
import java.util.stream.Stream;

public class LastElementFinderExample {

    public static void main(String[] args) {
        String s = getLast(
            LongStream.range(0, 10_000_000_000L).mapToObj(i -> {
                System.out.println("potential heavy operation on " + i);
                return String.valueOf(i);
            }).parallel()
        );
        System.out.println(s);
    }

    public static <T> T getLast(Stream<T> stream) {
        Spliterator<T> sp = stream.spliterator();
        if (isSized(sp)) {
            sp = getLastSplit(sp);
        }
        return getIteratorLastValue(getLastIterator(sp));
    }

    private static boolean isSized(Spliterator<?> sp) {
        return sp.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }

    private static <T> Spliterator<T> getLastSplit(Spliterator<T> sp) {
        return splitUntil(sp, s -> s.getExactSizeIfKnown() == 0);
    }

    private static <T> Iterator<T> getLastIterator(Spliterator<T> sp) {
        return Spliterators.iterator(splitUntil(sp, null));
    }

    private static <T> T getIteratorLastValue(Iterator<T> it) {
        T result = null;
        while (it.hasNext()) {
            result = it.next();
        }
        return result;
    }

    private static <T> Spliterator<T> splitUntil(Spliterator<T> sp, Predicate<Spliterator<T>> condition) {
        Spliterator<T> result = sp;
        for (Spliterator<T> part = sp.trySplit(); part != null; part = result.trySplit()) {
            if (condition == null || condition.test(result)) {
                result = part;
            }
        }
        return result;
    }
}
Here is another solution (not that efficient):
List<String> list = Arrays.asList("abc","ab","cc");
long count = list.stream().count();
list.stream().skip(count-1).findFirst().ifPresent(System.out::println);
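Note that this traverses the source twice and breaks on an empty list, because skip(count - 1) becomes skip(-1); a guarded sketch:
List<String> list = Arrays.asList("abc", "ab", "cc");
long count = list.stream().count();
// guard against an empty source; skip(-1) would throw an IllegalArgumentException
String last = count == 0 ? null : list.stream().skip(count - 1).findFirst().orElse(null);
System.out.println(last); // cc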
Parallel unsized streams combined with skip() are tricky, and @Holger's implementation gives a wrong answer in that case. @Holger's implementation is also a bit slower because it uses iterators.
An optimisation of @Holger's answer:
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();
    Spliterator<? extends T> lastSpliterator = spliterator;

    // Note that this method does not work very well with
    // unsized parallel streams when used with skip methods;
    // in those cases it will return Optional.empty().

    // Find the last spliterator with an estimated size
    // (meaningful only on unsized parallel streams)
    if (spliterator.estimateSize() == Long.MAX_VALUE) {
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null; prev = spliterator.trySplit()) {
            lastSpliterator = prev;
        }
    }

    // Find the last spliterator on sized streams
    // (meaningful only on parallel streams; note that unsized was transformed into sized)
    for (Spliterator<? extends T> prev = lastSpliterator.trySplit(); prev != null; prev = lastSpliterator.trySplit()) {
        if (lastSpliterator.estimateSize() == 0) {
            lastSpliterator = prev;
            break;
        }
    }

    // Find the last element of the last spliterator
    // (parallel streams only perform the operation on one element)
    AtomicReference<T> last = new AtomicReference<>();
    lastSpliterator.forEachRemaining(last::set);
    return Optional.ofNullable(last.get());
}
Unit tests using JUnit 5:
@Test
@DisplayName("last sequential sized")
void last_sequential_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}

@Test
@DisplayName("last sequential unsized")
void last_sequential_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}

@Test
@DisplayName("last parallel sized")
void last_parallel_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}

@Test
@DisplayName("last parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}

@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    // Unfortunately, unsized parallel streams do not work very well with skip:
    //assertThat(Streams.last(stream)).hasValue(expected);
    //assertThat(count).hasValue(1);

    // @Holger's implementation gives a wrong answer here:
    //assertThat(Streams.getLast(stream)).hasValue(9_950_000L); //!!!
    //assertThat(count).hasValue(1);

    // This is not a great answer either, but at least it is not wrong:
    assertThat(Streams.last(stream)).isEmpty();
    assertThat(count).hasValue(0);
}
The only solution that supports both scenarios is to avoid detecting the last spliterator on unsized parallel streams. The consequence is that the solution performs the operations on all elements, but it always gives the right answer.
Note that a sequential stream performs the operations on all elements anyway.
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();

    // Find the last non-empty spliterator (meaningful only on sized parallel streams)
    if (spliterator.hasCharacteristics(Spliterator.SIZED|Spliterator.SUBSIZED)) {
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null; prev = spliterator.trySplit()) {
            if (spliterator.getExactSizeIfKnown() == 0) {
                spliterator = prev;
                break;
            }
        }
    }

    // Find the last element of the spliterator
    //AtomicReference<T> last = new AtomicReference<>();
    //spliterator.forEachRemaining(last::set);
    //return Optional.ofNullable(last.get());

    // A better alternative that supports native parallel streams
    return (Optional<T>) StreamSupport.stream(spliterator, stream.isParallel())
        .reduce((a, b) -> b);
}
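A plain usage sketch of this variant, assuming the method lives in a utility class named Streams, as in the tests below:
Optional<Long> last = Streams.last(LongStream.rangeClosed(1, 1_000_000).boxed().parallel());
System.out.println(last.orElse(null)); // 1000000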
With regard to the unit testing for that implementation, the first three tests are exactly the same (sequential & sized parallel). The tests for unsized parallel are here:
@Test
@DisplayName("last parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(10_000_000L);
}

@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}
We needed last of a Stream in production - I am still not sure we really did, but various members of my team said we did, for various "reasons". I ended up writing something like this:
private static class Holder<T> implements Consumer<T> {

    T t = null;
    // needed because null could be a valid element
    boolean set = false;

    @Override
    public void accept(T t) {
        this.t = t;
        set = true;
    }
}

/**
 * When a Stream is SUBSIZED, it means that all children (direct or not) are also SIZED and SUBSIZED,
 * meaning we "always" know their size, no matter how many splits happen after the initial one.
 * <p>
 * When a Stream is SIZED, it means that we know its current size, but nothing about its "children"
 * (a Set, for example).
 */
private static <T> Optional<Optional<T>> last(Stream<T> stream) {
    Spliterator<T> suffix = stream.spliterator();
    // nothing left to do here
    if (suffix.getExactSizeIfKnown() == 0) {
        return Optional.empty();
    }
    return Optional.of(Optional.ofNullable(compute(suffix, new Holder())));
}

private static <T> T compute(Spliterator<T> sp, Holder holder) {
    Spliterator<T> s;
    while (true) {
        Spliterator<T> prefix = sp.trySplit();
        // we can't split any further
        // BUT don't test prefix.getExactSizeIfKnown() == 0 here, because that
        // does not mean the suffix can't be split even further down
        if (prefix == null) {
            s = sp;
            break;
        }
        // if the prefix is known to have no elements, just drop it and continue with the suffix
        if (prefix.getExactSizeIfKnown() == 0) {
            continue;
        }
        // if the suffix has no elements, try to split the prefix further
        if (sp.getExactSizeIfKnown() == 0) {
            sp = prefix;
        }
        // after a split, a stream that is not SUBSIZED can give birth to a spliterator that is
        if (sp.hasCharacteristics(Spliterator.SUBSIZED)) {
            return compute(sp, holder);
        } else {
            // if we don't know the size of the suffix or prefix, just walk them individually,
            // starting from the suffix, and see if we find our "last" there
            T suffixResult = compute(sp, holder);
            if (!holder.set) {
                return compute(prefix, holder);
            }
            return suffixResult;
        }
    }
    s.forEachRemaining(holder::accept);
    // we control this, so Holder::t is always a T
    return (T) holder.t;
}
And some usages of it:
Stream<Integer> st = Stream.concat(Stream.of(1, 2), Stream.empty());
System.out.println(2 == last(st).get().get());
st = Stream.concat(Stream.empty(), Stream.of(1, 2));
System.out.println(2 == last(st).get().get());
st = Stream.concat(Stream.iterate(0, i -> i + 1), Stream.of(1, 2, 3));
System.out.println(3 == last(st).get().get());
st = Stream.concat(Stream.iterate(0, i -> i + 1).limit(0), Stream.iterate(5, i -> i + 1).limit(3));
System.out.println(7 == last(st).get().get());
st = Stream.concat(Stream.iterate(5, i -> i + 1).limit(3), Stream.iterate(0, i -> i + 1).limit(0));
System.out.println(7 == last(st).get().get());
String s = last(
IntStream.range(0, 10_000_000).mapToObj(i -> {
System.out.println("potential heavy operation on " + i);
return String.valueOf(i);
}).parallel()
).get().get();
System.out.println(s.equalsIgnoreCase("9999999"));
st = Stream.empty();
System.out.println(last(st).isEmpty());
st = Stream.of(1, 2, 3, 4, null);
System.out.println(last(st).get().isEmpty());
st = Stream.of((Integer) null);
System.out.println(last(st).isPresent());
IntStream is = IntStream.range(0, 4).filter(i -> i != 3);
System.out.println(last(is.boxed()));
First, the return type of Optional<Optional<T>> looks weird, I agree. If the outer Optional is empty, it means there are no elements in the Stream; if the inner Optional is empty, it means the last element was actually null, e.g. Stream.of(1, 2, 3, null) (unlike Guava's Streams::findLast, which throws an exception in such a case).
I admit I was mainly inspired by Holger's answer to a similar question of mine, and by Guava's Streams::findLast.