50

I've tested a bit the max function on Java 8 lambdas and streams, and it seems that in case max is executed, even if more than one object compares to 0, it returns an arbitrary element within the tied candidates without further consideration.

Is there an evident trick or function for such a max expected behavior, so that all max values are returned? I don't see anything in the API but I am sure it must exist something better than comparing manually.

For instance:

// myComparator is an IntegerComparator
Stream.of(1, 3, 5, 3, 2, 3, 5)
    .max(myComparator)
    .forEach(System.out::println);
// Would print 5, 5 in any order.
Zabuzard
  • 25,064
  • 8
  • 58
  • 82
Whimusical
  • 6,401
  • 11
  • 62
  • 105
  • 2
    What exactly do you want this `max` method to return? A `HashSet`? – Paul Boddington Mar 29 '15 at 20:37
  • A hashset or a way to chain more operations as if it was a filter, with the tied maxes – Whimusical Mar 29 '15 at 21:30
  • 12
    Your statement is incorrect. If the stream is ordered (such as the streams you get from an array or List), it returns the *first* element that is maximal in the event of multiple maximal elements; only if the stream is unordered is it allowed to pick an arbitrary element. – Brian Goetz Mar 29 '15 at 22:51
  • thanks for the point, I didn't think of – Whimusical Mar 30 '15 at 19:11
  • @BrianGoetz Nothing in the [`max`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/stream/Stream.html#max(java.util.Comparator)) Javadoc indicates which element would be chosen in the case of a tie. Would I be correct in assuming that assuming the first element be the chosen one would not be something one should rely on, even if the current implementation works that way? – M. Justin Mar 31 '22 at 05:13
  • 1
    @M.Justin If the spec doesn't say it, it would be unwise to count on it, even if that's what one particular implementation does today. But more importantly, why are you distinguishing between two objects that the comparator deems equal? Maybe you are using the wrong comparator then? – Brian Goetz Mar 31 '22 at 13:09
  • @BrianGoetz Yep. I guess I'm mostly calling out that your statement "only if the stream is unordered is it allowed to pick an arbitrary element" doesn't appear to be accurate, or at least not behavior developers should ever rely on. – M. Justin Apr 01 '22 at 17:23

9 Answers9

55

I believe the OP is using a Comparator to partition the input into equivalence classes, and the desired result is a list of members of the equivalence class that is the maximum according to that Comparator.

Unfortunately, using int values as a sample problem is a terrible example. All equal int values are fungible, so there is no notion of preserving the ordering of equivalent values. Perhaps a better example is using string lengths, where the desired result is to return a list of strings from an input that all have the longest length within that input.

I don't know of any way to do this without storing at least partial results in a collection.

Given an input collection, say

List<String> list = ... ;

...it's simple enough to do this in two passes, the first to get the longest length, and the second to filter the strings that have that length:

int longest = list.stream()
                  .mapToInt(String::length)
                  .max()
                  .orElse(-1);

List<String> result = list.stream()
                          .filter(s -> s.length() == longest)
                          .collect(toList());

If the input is a stream, which cannot be traversed more than once, it is possible to compute the result in only a single pass using a collector. Writing such a collector isn't difficult, but it is a bit tedious as there are several cases to be handled. A helper function that generates such a collector, given a Comparator, is as follows:

static <T> Collector<T,?,List<T>> maxList(Comparator<? super T> comp) {
    return Collector.of(
        ArrayList::new,
        (list, t) -> {
            int c;
            if (list.isEmpty() || (c = comp.compare(t, list.get(0))) == 0) {
                list.add(t);
            } else if (c > 0) {
                list.clear();
                list.add(t);
            }
        },
        (list1, list2) -> {
            if (list1.isEmpty()) {
                return list2;
            } 
            if (list2.isEmpty()) {
                return list1;
            }
            int r = comp.compare(list1.get(0), list2.get(0));
            if (r < 0) {
                return list2;
            } else if (r > 0) {
                return list1;
            } else {
                list1.addAll(list2);
                return list1;
            }
        });
}

This stores intermediate results in an ArrayList. The invariant is that all elements within any such list are equivalent in terms of the Comparator. When adding an element, if it's less than the elements in the list, it's ignored; if it's equal, it's added; and if it's greater, the list is emptied and the new element is added. Merging isn't too difficult either: the list with the greater elements is returned, but if their elements are equal the lists are appended.

Given an input stream, this is pretty easy to use:

Stream<String> input = ... ;

List<String> result = input.collect(maxList(comparing(String::length)));
Naman
  • 27,789
  • 26
  • 218
  • 353
Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
  • 6
    Great answer. Writing your own collectors is really powerful and actually quite straightforward, once you get your head around the supplier/accumulator/combiner/finisher nomenclature! – batwad Feb 19 '18 at 11:14
28

I would group by value and store the values into a TreeMap in order to have my values sorted, then I would get the max value by getting the last entry as next:

Stream.of(1, 3, 5, 3, 2, 3, 5)
    .collect(groupingBy(Function.identity(), TreeMap::new, toList()))
    .lastEntry()
    .getValue()
    .forEach(System.out::println);

Output:

5
5
Nicolas Filotto
  • 43,537
  • 11
  • 94
  • 122
11

I implemented more generic collector solution with custom downstream collector. Probably some readers might find it useful:

public static <T, A, D> Collector<T, ?, D> maxAll(Comparator<? super T> comparator, 
                                                  Collector<? super T, A, D> downstream) {
    Supplier<A> downstreamSupplier = downstream.supplier();
    BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
    BinaryOperator<A> downstreamCombiner = downstream.combiner();
    class Container {
        A acc;
        T obj;
        boolean hasAny;
        
        Container(A acc) {
            this.acc = acc;
        }
    }
    Supplier<Container> supplier = () -> new Container(downstreamSupplier.get());
    BiConsumer<Container, T> accumulator = (acc, t) -> {
        if(!acc.hasAny) {
            downstreamAccumulator.accept(acc.acc, t);
            acc.obj = t;
            acc.hasAny = true;
        } else {
            int cmp = comparator.compare(t, acc.obj);
            if (cmp > 0) {
                acc.acc = downstreamSupplier.get();
                acc.obj = t;
            }
            if (cmp >= 0)
                downstreamAccumulator.accept(acc.acc, t);
        }
    };
    BinaryOperator<Container> combiner = (acc1, acc2) -> {
        if (!acc2.hasAny) {
            return acc1;
        }
        if (!acc1.hasAny) {
            return acc2;
        }
        int cmp = comparator.compare(acc1.obj, acc2.obj);
        if (cmp > 0) {
            return acc1;
        }
        if (cmp < 0) {
            return acc2;
        }
        acc1.acc = downstreamCombiner.apply(acc1.acc, acc2.acc);
        return acc1;
    };
    Function<Container, D> finisher = acc -> downstream.finisher().apply(acc.acc);
    return Collector.of(supplier, accumulator, combiner, finisher);
}

So by default it can be collected to a list using:

public static <T> Collector<T, ?, List<T>> maxAll(Comparator<? super T> comparator) {
    return maxAll(comparator, Collectors.toList());
}

But you can use other downstream collectors as well:

public static String joinLongestStrings(Collection<String> input) {
    return input.stream().collect(
            maxAll(Comparator.comparingInt(String::length), Collectors.joining(","))));
}
Naman
  • 27,789
  • 26
  • 218
  • 353
Tagir Valeev
  • 97,161
  • 19
  • 222
  • 334
4

If I understood well, you want the frequency of the max value in the Stream.

One way to achieve that would be to store the results in a TreeMap<Integer, List<Integer> when you collect elements from the Stream. Then you grab the last key (or first depending on the comparator you give) to get the value which will contains the list of max values.

List<Integer> maxValues = st.collect(toMap(i -> i,
                     Arrays::asList,
                     (l1, l2) -> Stream.concat(l1.stream(), l2.stream()).collect(toList()),
                     TreeMap::new))
             .lastEntry()
             .getValue();

Collecting it from the Stream(4, 5, -2, 5, 5) will give you a List [5, 5, 5].

Another approach in the same spirit would be to use a group by operation combined with the counting() collector:

Entry<Integer, Long> maxValues = st.collect(groupingBy(i -> i,
                TreeMap::new,
                counting())).lastEntry(); //5=3 -> 5 appears 3 times

Basically you firstly get a Map<Integer, List<Integer>>. Then the downstream counting() collector will return the number of elements in each list mapped by its key resulting in a Map. From there you grab the max entry.

The first approaches require to store all the elements from the stream. The second one is better (see Holger's comment) as the intermediate List is not built. In both approached, the result is computed in a single pass.

If you get the source from a collection, you may want to use Collections.max one time to find the maximum value followed by Collections.frequency to find how many times this value appears.

It requires two passes but uses less memory as you don't have to build the data-structure.

The stream equivalent would be coll.stream().max(...).get(...) followed by coll.stream().filter(...).count().

Alexis C.
  • 91,686
  • 21
  • 171
  • 177
  • 1
    “Both approaches require to store all the elements from the stream” that’s a misleading statement considering that the second approach stores only *counts* rather than lists of elements. That’s the whole point of providing another `Collector` rather than creating a `Map<…, List…>` first. It has to *process* every item but will *not store* the items. – Holger Mar 31 '15 at 09:27
  • 1
    @Holger Thanks, I don't know why I assumed that. – Alexis C. Mar 31 '15 at 09:47
2

I'm not really sure whether you are trying to

  • (a) find the number of occurrences of the maximum item, or
  • (b) Find all the maximum values in the case of a Comparator that is not consistent with equals.

An example of (a) would be [1, 5, 4, 5, 1, 1] -> [5, 5].

An example of (b) would be:

Stream.of("Bar", "FOO", "foo", "BAR", "Foo")
      .max((s, t) -> s.toLowerCase().compareTo(t.toLowerCase()));

which you want to give [Foo, foo, Foo], rather than just FOO or Optional[FOO].

In both cases, there are clever ways to do it in just one pass. But these approaches are of dubious value because you would need to keep track of unnecessary information along the way. For example, if you start with [2, 0, 2, 2, 1, 6, 2], it would only be when you reach 6 that you would realise it was not necessary to track all the 2s.

I think the best approach is the obvious one; use max, and then iterate the items again putting all the ties into a collection of your choice. This will work for both (a) and (b).

Paul Boddington
  • 37,127
  • 10
  • 65
  • 116
  • yes, it is getting the max occurences (not index) given a comparator (which does not have any tie breaking system). Preferrably in a stream fashion (i mean the output is a stream of those max elements, like if it was a filter, not a collector or a ending operation) – Whimusical Mar 29 '15 at 21:36
  • 3
    @user1352530 I think you're asking the wrong question. People used to ask "how can I do X?". Now they ask "how can I do X using streams?". It's the wrong question. The objective is to do X, not to use streams. – Paul Boddington Mar 29 '15 at 21:49
  • I am just learning the approaches, not a real need. I could workaround it, but I'm sure there must be a efficient or shortcut for this. Also, take a look at the post update – Whimusical Mar 29 '15 at 22:08
  • 2
    @user1352530: you can’t single-pass stream like a `filter` operation as you have to process every element including the last one before you definitely know what the max element is. Just consider the possibility that the very last element is the sole maximum value. Even if you find a way to hide this fact in something that looks like a single stream operation (like with `Stream.sorted`) it will imply processing/collecting all element before a downstream operation can take place. – Holger Mar 31 '15 at 09:19
2

If you'd rather rely on a library than the other answers here, StreamEx has a collector to do this.

Stream.of(1, 3, 5, 3, 2, 3, 5)
    .collect(MoreCollectors.maxAll())
    .forEach(System.out::println);

There's a version which takes a Comparator too for streams of items which don't have a natural ordering (i.e. don't implement Comparable).

Michael
  • 41,989
  • 11
  • 82
  • 128
1
System.out.println(
  Stream.of(1, 3, 5, 3, 2, 3, 5)
    .map(a->new Integer[]{a})
    .reduce((a,b)-> 
        a[0]==b[0]?
            Stream.concat(Stream.of(a),Stream.of(b)).toArray() :
            a[0]>b[0]? a:b
    ).get()
)
Chayne P. S.
  • 1,558
  • 12
  • 17
1

I was searching for a good answer on this question, but a tad more complex and couldn't find anything until I figured it out myself, which is why I'm posting if this helps anybody.

I have a list of Kittens. Kitten is an object which has a name, age and gender. I had to return a list of all the youngest kittens.

For example: So kitten list would contain kitten objects (k1, k2, k3, k4) and their ages would be (1, 2, 3, 1) accordingly. We want to return [k1, k4], because they are both the youngest. If only one youngest exists, the function should return [k1(youngest)].

  1. Find the min value of the list (if it exists):

     Optional<Kitten> minKitten = kittens.stream().min(Comparator.comparingInt(Kitten::getAge));
    
  2. filter the list by the min value

     return minKitten.map(value -> kittens.stream().filter(kitten -> kitten.getAge() == value.getAge())
           .collect(Collectors.toList())).orElse(Collections.emptyList());
    
Suraj Rao
  • 29,388
  • 11
  • 94
  • 103
Johanna
  • 11
  • 1
  • 1
    It’s surprising that you didn’t find a good answer, as [this answer](https://stackoverflow.com/a/29339106/2711488) is precisely providing what you wanted; you only needed to adapt it to your case, kitten with minimum age instead of string with maximum length, i.e. `int youngest = kittens.stream() .mapToInt(Kitten::getAge) .min() .orElse(-1); List result = kittens.stream() .filter(kitten -> kitten.getAge() == youngest) .collect(toList());` You’re now doing basically the same, just with an `Optional`. – Holger Jun 08 '21 at 16:51
1

The following two lines will do it without implementing a separate comparator:

  List<Integer> list = List.of(1, 3, 5, 3, 2, 3, 5);
  list.stream().filter(i -> i == (list.stream().max(Comparator.comparingInt(i2 -> i2))).get()).forEach(System.out::println);
Andy
  • 488
  • 4
  • 13
  • 2
    This leads to a horrible performance for larger lists, as you are repeating the search for the max element for every element. Also known as quadratic time complexity. – Holger Jun 08 '21 at 16:42
  • This answer is really . Performance aside, as that's overrated for many situations unless truly working on an enterprise scale. If it's a concern then other answers below show 2-step method for the same. – CodeFinity Feb 02 '22 at 00:26