Stream doesn't have a last() method:
Stream<T> stream;
T last = stream.last(); // No such method
What's the most elegant and/or efficient way to get the last element (or null for an empty Stream)?
Do a reduction that simply returns the current value:
Stream<T> stream;
T last = stream.reduce((a, b) -> b).orElse(null);
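A quick usage sketch (the list and values here are purely illustrative):
List<String> values = Arrays.asList("a", "b", "c");
String last = values.stream().reduce((first, second) -> second).orElse(null); // "c"
String none = Stream.<String>empty().reduce((a, b) -> b).orElse(null);        // null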
This heavily depends on the nature of the Stream. Keep in mind that “simple” doesn’t necessarily mean “efficient”. If you suspect the stream to be very large, to apply heavy operations to its elements, or to have a source that knows the size in advance, the following may be substantially more efficient than the simple solution:
static <T> T getLast(Stream<T> stream) {
    Spliterator<T> sp = stream.spliterator();
    if(sp.hasCharacteristics(Spliterator.SIZED|Spliterator.SUBSIZED)) {
        for(;;) {
            Spliterator<T> part = sp.trySplit();
            if(part == null) break;
            if(sp.getExactSizeIfKnown() == 0) {
                sp = part;
                break;
            }
        }
    }
    T value = null;
    for(Iterator<T> it = recursive(sp); it.hasNext(); )
        value = it.next();
    return value;
}

private static <T> Iterator<T> recursive(Spliterator<T> sp) {
    Spliterator<T> prev = sp.trySplit();
    if(prev == null) return Spliterators.iterator(sp);
    Iterator<T> it = recursive(sp);
    if(it != null && it.hasNext()) return it;
    return recursive(prev);
}
You may illustrate the difference with the following example:
String s = getLast(
    IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
);
System.out.println(s);
It will print:
potential heavy operation on 9999999
9999999
In other words, it did not perform the operation on the first 9999999 elements but only on the last one.
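For contrast, here is a sketch of what the simple reduction would do on the same pipeline (illustrative only): because the reduction has to consume every element, the mapping function runs for all ten million values.
// Sketch: the plain reduction forces the mapping function to run for every element,
// so "potential heavy operation on ..." would be printed ten million times.
String s = IntStream.range(0, 10_000_000).mapToObj(i -> {
        System.out.println("potential heavy operation on " + i);
        return String.valueOf(i);
    }).parallel()
    .reduce((a, b) -> b)
    .orElse(null);
System.out.println(s); // 9999999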
Guava has Streams.findLast:
Stream<T> stream;
T last = Streams.findLast(stream).orElse(null); // findLast returns an Optional<T>
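A small usage sketch, assuming Guava's com.google.common.collect.Streams is on the classpath:
import com.google.common.collect.Streams;

Optional<String> last = Streams.findLast(Stream.of("a", "b", "c")); // Optional[c]
Optional<String> none = Streams.findLast(Stream.<String>empty());   // Optional.empty
// Caveat: findLast throws a NullPointerException if the last element is null.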
This is just a refactoring of Holger's answer because the code, while fantastic, is a bit hard to read/understand, especially for people who were not C programmers before Java. Hopefully my refactored example class is a little easier to follow for those who are not familiar with spliterators, what they do, or how they work.
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Predicate;
import java.util.stream.LongStream;
import java.util.stream.Stream;

public class LastElementFinderExample {

    public static void main(String[] args) {
        String s = getLast(
            LongStream.range(0, 10_000_000_000L).mapToObj(i -> {
                System.out.println("potential heavy operation on " + i);
                return String.valueOf(i);
            }).parallel()
        );
        System.out.println(s);
    }

    public static <T> T getLast(Stream<T> stream) {
        Spliterator<T> sp = stream.spliterator();
        if (isSized(sp)) {
            sp = getLastSplit(sp);
        }
        return getIteratorLastValue(getLastIterator(sp));
    }

    private static boolean isSized(Spliterator<?> sp) {
        return sp.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }

    private static <T> Spliterator<T> getLastSplit(Spliterator<T> sp) {
        return splitUntil(sp, s -> s.getExactSizeIfKnown() == 0);
    }

    private static <T> Iterator<T> getLastIterator(Spliterator<T> sp) {
        return Spliterators.iterator(splitUntil(sp, null));
    }

    private static <T> T getIteratorLastValue(Iterator<T> it) {
        T result = null;
        while (it.hasNext()) {
            result = it.next();
        }
        return result;
    }

    private static <T> Spliterator<T> splitUntil(Spliterator<T> sp, Predicate<Spliterator<T>> condition) {
        Spliterator<T> result = sp;
        for (Spliterator<T> part = sp.trySplit(); part != null; part = result.trySplit()) {
            if (condition == null || condition.test(result)) {
                result = part;
            }
        }
        return result;
    }
}
Here is another solution (not that efficient):
List<String> list = Arrays.asList("abc","ab","cc");
long count = list.stream().count();
list.stream().skip(count-1).findFirst().ifPresent(System.out::println);
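Note that this traverses the source twice and breaks on an empty list, because skip(count - 1) becomes skip(-1); a guarded sketch:
List<String> list = Arrays.asList("abc", "ab", "cc");
long count = list.stream().count();
// guard against an empty source; skip(-1) would throw an IllegalArgumentException
String last = count == 0 ? null : list.stream().skip(count - 1).findFirst().orElse(null);
System.out.println(last); // cc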
Parallel unsized streams combined with skip() are tricky, and @Holger's implementation gives a wrong answer in that case. @Holger's implementation is also a bit slower because it uses iterators.
An optimisation of @Holger's answer:
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();
    Spliterator<? extends T> lastSpliterator = spliterator;

    // Note that this method does not work very well with
    // unsized parallel streams when used with skip methods;
    // in those cases it will return Optional.empty().

    // Find the last spliterator with an estimated size
    // (meaningful only on unsized parallel streams)
    if (spliterator.estimateSize() == Long.MAX_VALUE) {
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null; prev = spliterator.trySplit()) {
            lastSpliterator = prev;
        }
    }

    // Find the last spliterator on sized streams
    // (meaningful only on parallel streams; note that unsized was transformed into sized)
    for (Spliterator<? extends T> prev = lastSpliterator.trySplit(); prev != null; prev = lastSpliterator.trySplit()) {
        if (lastSpliterator.estimateSize() == 0) {
            lastSpliterator = prev;
            break;
        }
    }

    // Find the last element of the last spliterator
    // (parallel streams only perform the operation on one element)
    AtomicReference<T> last = new AtomicReference<>();
    lastSpliterator.forEachRemaining(last::set);
    return Optional.ofNullable(last.get());
}
Unit tests using JUnit 5:
@Test
@DisplayName("last sequential sized")
void last_sequential_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}

@Test
@DisplayName("last sequential unsized")
void last_sequential_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}

@Test
@DisplayName("last parallel sized")
void last_parallel_sized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}

@Test
@DisplayName("last parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(1);
}

@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    // Unfortunately, unsized parallel streams do not work very well with skip:
    //assertThat(Streams.last(stream)).hasValue(expected);
    //assertThat(count).hasValue(1);

    // @Holger's implementation gives a wrong answer here:
    //assertThat(Streams.getLast(stream)).hasValue(9_950_000L); //!!!
    //assertThat(count).hasValue(1);

    // This is not a great answer either, but at least it is not wrong:
    assertThat(Streams.last(stream)).isEmpty();
    assertThat(count).hasValue(0);
}
The only solution that supports both scenarios is to avoid detecting the last spliterator on unsized parallel streams. The consequence is that the solution performs the operations on all elements, but it always gives the right answer.
Note that a sequential stream performs the operations on all elements anyway.
public static <T> Optional<T> last(Stream<? extends T> stream) {
    Objects.requireNonNull(stream, "stream");
    Spliterator<? extends T> spliterator = stream.spliterator();

    // Find the last non-empty spliterator (meaningful only on sized parallel streams)
    if (spliterator.hasCharacteristics(Spliterator.SIZED|Spliterator.SUBSIZED)) {
        for (Spliterator<? extends T> prev = spliterator.trySplit(); prev != null; prev = spliterator.trySplit()) {
            if (spliterator.getExactSizeIfKnown() == 0) {
                spliterator = prev;
                break;
            }
        }
    }

    // Find the last element of the spliterator
    //AtomicReference<T> last = new AtomicReference<>();
    //spliterator.forEachRemaining(last::set);
    //return Optional.ofNullable(last.get());

    // A better alternative that supports native parallel streams
    return (Optional<T>) StreamSupport.stream(spliterator, stream.isParallel())
        .reduce((a, b) -> b);
}
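A plain usage sketch of this variant, assuming the method lives in a utility class named Streams, as in the tests below:
Optional<Long> last = Streams.last(LongStream.rangeClosed(1, 1_000_000).boxed().parallel());
System.out.println(last.orElse(null)); // 1000000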
With regard to the unit testing for that implementation, the first three tests are exactly the same (sequential & sized parallel). The tests for unsized parallel are here:
@Test
@DisplayName("last parallel unsized")
void last_parallel_unsized() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(10_000_000L);
}

@Test
@DisplayName("last parallel unsized with skip")
void last_parallel_unsized_with_skip() throws Exception {
    long expected = 10_000_000L;
    AtomicLong count = new AtomicLong();
    Stream<Long> stream = LongStream.rangeClosed(1, expected).boxed().parallel();
    stream = StreamSupport.stream(((Iterable<Long>) stream::iterator).spliterator(), stream.isParallel());
    stream = stream.skip(50_000).peek(num -> count.getAndIncrement());

    assertThat(Streams.last(stream)).hasValue(expected);
    assertThat(count).hasValue(9_950_000L);
}
We needed last of a Stream in production - I am still not sure we really did, but various members of my team said we did, for various "reasons". I ended up writing something like this:
private static class Holder<T> implements Consumer<T> {

    T t = null;
    // needed because null could be a valid element
    boolean set = false;

    @Override
    public void accept(T t) {
        this.t = t;
        set = true;
    }
}

/**
 * When a Stream is SUBSIZED, it means that all children (direct or not) are also SIZED and SUBSIZED,
 * meaning we "always" know their size, no matter how many splits happen after the initial one.
 * <p>
 * When a Stream is SIZED, it means that we know its current size, but nothing about its "children"
 * (a Set, for example).
 */
private static <T> Optional<Optional<T>> last(Stream<T> stream) {
    Spliterator<T> suffix = stream.spliterator();
    // nothing left to do here
    if (suffix.getExactSizeIfKnown() == 0) {
        return Optional.empty();
    }
    return Optional.of(Optional.ofNullable(compute(suffix, new Holder())));
}

private static <T> T compute(Spliterator<T> sp, Holder holder) {
    Spliterator<T> s;
    while (true) {
        Spliterator<T> prefix = sp.trySplit();
        // we can't split any further
        // BUT don't test prefix.getExactSizeIfKnown() == 0 here, because that
        // does not mean the suffix can't be split even further down
        if (prefix == null) {
            s = sp;
            break;
        }
        // if the prefix is known to have no elements, just drop it and continue with the suffix
        if (prefix.getExactSizeIfKnown() == 0) {
            continue;
        }
        // if the suffix has no elements, try to split the prefix further
        if (sp.getExactSizeIfKnown() == 0) {
            sp = prefix;
        }
        // after a split, a stream that is not SUBSIZED can give birth to a spliterator that is
        if (sp.hasCharacteristics(Spliterator.SUBSIZED)) {
            return compute(sp, holder);
        } else {
            // if we don't know the size of the suffix or prefix, just walk them individually,
            // starting from the suffix, and see if we find our "last" there
            T suffixResult = compute(sp, holder);
            if (!holder.set) {
                return compute(prefix, holder);
            }
            return suffixResult;
        }
    }
    s.forEachRemaining(holder::accept);
    // we control this, so Holder::t is always a T
    return (T) holder.t;
}
And some usages of it:
Stream<Integer> st = Stream.concat(Stream.of(1, 2), Stream.empty());
System.out.println(2 == last(st).get().get());
st = Stream.concat(Stream.empty(), Stream.of(1, 2));
System.out.println(2 == last(st).get().get());
st = Stream.concat(Stream.iterate(0, i -> i + 1), Stream.of(1, 2, 3));
System.out.println(3 == last(st).get().get());
st = Stream.concat(Stream.iterate(0, i -> i + 1).limit(0), Stream.iterate(5, i -> i + 1).limit(3));
System.out.println(7 == last(st).get().get());
st = Stream.concat(Stream.iterate(5, i -> i + 1).limit(3), Stream.iterate(0, i -> i + 1).limit(0));
System.out.println(7 == last(st).get().get());
String s = last(
IntStream.range(0, 10_000_000).mapToObj(i -> {
System.out.println("potential heavy operation on " + i);
return String.valueOf(i);
}).parallel()
).get().get();
System.out.println(s.equalsIgnoreCase("9999999"));
st = Stream.empty();
System.out.println(last(st).isEmpty());
st = Stream.of(1, 2, 3, 4, null);
System.out.println(last(st).get().isEmpty());
st = Stream.of((Integer) null);
System.out.println(last(st).isPresent());
IntStream is = IntStream.range(0, 4).filter(i -> i != 3);
System.out.println(last(is.boxed()));
First, the return type of Optional<Optional<T>> looks weird, I agree. If the outer Optional is empty, it means there are no elements in the Stream; if the inner Optional is empty, it means the last element was actually null, e.g. Stream.of(1, 2, 3, null) (unlike Guava's Streams::findLast, which throws an exception in such a case).
I admit I was mainly inspired by Holger's answer to a similar question of mine, and by Guava's Streams::findLast.