4

I have a list of tuples, and I want to find the tuples with the largest x value. In the case where there are multiple largest x values, I want to choose one at random. I can't figure out how to implement this random selection functionality. Below is the code I have so far:

public void testSelectRandomFromLargestVals() {
    List<Tuple<Integer, String>> list = new ArrayList<>();
    list.add(new Tuple<>(5, "five-1"));
    list.add(new Tuple<>(2, "two"));
    list.add(new Tuple<>(3, "three"));
    list.add(new Tuple<>(5, "five-2"));
    list.add(new Tuple<>(5, "five-3"));

    Optional<Tuple<Integer, String>> largestTuple = list.stream().max((t1, t2) -> Integer.compare(t1.x, t2.x));
    System.out.println("Largest tuple is: " + largestTuple.get().x + " value is: " + largestTuple.get().y);
}

public class Tuple<X, Y> {
    public final X x;
    public final Y y;
    public Tuple(X x, Y y) {
        this.x = x;
        this.y = y;
    }
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Tuple<?, ?> tuple = (Tuple<?, ?>) o;
        if (!x.equals(tuple.x)) return false;
        return y.equals(tuple.y);
    }
    @Override
    public int hashCode() {
        int result = x.hashCode();
        result = 31 * result + y.hashCode();
        return result;
    }
}
Alexis C.
  • 91,686
  • 21
  • 171
  • 177
jcm
  • 5,499
  • 11
  • 49
  • 78
  • 2
    http://stackoverflow.com/questions/29334404/how-to-force-max-to-return-all-maximum-values-in-a-java-stream/29339106 – Alexis C. Mar 14 '16 at 20:16
  • 2
    Just to be clear, you do really mean "at random", as opposed to "arbitrarily"? – Andy Turner Mar 14 '16 at 20:23
  • 1
    Your equals method would be more safely implements using `Objects.equal(x, tuple.x)` instead of `x.equals(tuple.x)` (similarly `Objects.hash(x, y)` in `hashCode`). – Andy Turner Mar 14 '16 at 20:25
  • @AndyTurner yes, random. The equals and hashcode methods were auto-generated by IntelliJ – jcm Mar 14 '16 at 20:26
  • 2
    Then IntelliJ should add null checks in the constructor :) Notwithstanding that, `return x.equals(tuple.x) && y.equals(tuple.y)` is a much nicer way to express the `equals` condition. – Andy Turner Mar 14 '16 at 20:26
  • 2
    @AndyTurner Regarding IntelliJ, you can tell it to add null checks on equals and hashCode or not – fps Mar 14 '16 at 20:29

6 Answers6

6

It turns out that Misha's one-pass random selector (nice work, +1) can be combined with my one-pass max-value collector from this other answer into a single collector. This allows a random element from the set of maximal ones to be selected in a single pass.

Here's the merged collector:

static <T> Collector<T, ?, Optional<T>> rndMax(Comparator<? super T> cmp) {
    class RndMax {
        T val;
        int cnt;

        void add(T t) {
            int c;
            if (cnt == 0 || (c = cmp.compare(t, val)) > 0) {
                cnt = 1;
                val = t;
            } else if (c == 0) {
                cnt++;
                if (ThreadLocalRandom.current().nextInt(cnt) == 0) {
                    val = t;
                }
            }
        }

        RndMax merge(RndMax other) {
            if (cnt == 0) {
                return other;
            }

            if (other.cnt == 0) {
                return this;
            }

            int c = cmp.compare(val, other.val);
            if (c < 0) {
                return other;
            } else if (c > 0) {
                return this;
            } else {
                cnt += other.cnt;
                if (ThreadLocalRandom.current().nextInt(cnt) < other.cnt) {
                    val = other.val;
                }
                return this;
            }
        }

        Optional<T> finish() {
            return cnt == 0 ? Optional.empty() : Optional.of(val);
        }
    }

    return Collector.of(RndMax::new, RndMax::add, RndMax::merge, RndMax::finish);
}

You'd invoke it like this:

List<Tuple<Integer,String>> list = ... ;
Optional<Tuple<Integer,String>> max =
    list.stream().collect(rndMax(Comparator.comparingInt(t -> t.x)));
Community
  • 1
  • 1
Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
  • 2
    Another approach would be to make a `groupMax` collector that accepts a downstream collector. Then it can be composed with `random()` to solve this problem or `toList()` to solve the problem in the other question you linked. – Misha Mar 15 '16 at 04:55
  • 2
    Yeah I was thinking of that but this one seemed complicated enough to start with. :-) – Stuart Marks Mar 15 '16 at 05:31
5

Simple answer is to shuffle first:

List<Tuple<Integer, String>> copy = new ArrayList<>(list);
Collections.shuffle(copy);
Tuple<Integer, String> largestTuple = Collections.max(copy, Comparator.comparingInt(t -> t.x)); // credit @Holger

The element returned by max() is influenced by encounter order, so shuffling effectively makes it random.

If the list is not too big (not thousands of elements), the shuffle will be pretty fast.

I made a copy of the list so as to not affect the order of the original list. If that isn't important, just shuffle as use the original list.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • Hmm ok, what if instead of a List> I have a Set> ? – jcm Mar 14 '16 at 20:37
  • 2
    @jcm you can only shuffle a `List`, but you can pass any Collection type to the constructor of `ArrayList` to make a copy, so this code will work as-is with a `Set` too. – Bohemian Mar 14 '16 at 20:40
  • the Set can be quite large so I was hoping to not have to make a copy of it. My ideal solution would be to implement the random selection as part of the stream processing code. – jcm Mar 14 '16 at 20:42
  • 1
    @jcm how large exactly? – Bohemian Mar 14 '16 at 20:43
  • Potentially thousands of elements – jcm Mar 14 '16 at 20:48
  • 1
    @jcm I just did a quick benchmark on my average laptop: To copy and shuffle a random collection of integers of size 10K takes about 0.5ms, size 1K about 0.05ms and size 100K about 6ms. Unless you're doing this continuously, I doubt you'll feel it. A user definitely won't notice it. – Bohemian Mar 14 '16 at 20:52
  • 3
    If you’re operating on a shuffled list, there is no need for using the stream API at all. Just use [`Collections.max(copy, comparingInt(t->t.x))`](https://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#max-java.util.Collection-java.util.Comparator-) – Holger Mar 15 '16 at 07:45
  • @jcm I wrote a simple test harness, allowed warm up, used random data, ran several times. – Bohemian Mar 18 '16 at 01:46
5

For most practical purposes, @Bohemian's shuffle-based solution is the best, but since you expressed an interest in an approach that's constant in memory, you can do it with a custom Collector that chooses a random element:

public static <T> Collector<T, ?, Optional<T>> random() {
    class Rnd {

        T val;
        int cnt;

        void add(T t) {
            cnt++;
            if (ThreadLocalRandom.current().nextInt(cnt) == 0) {
                val = t;
            }
        }

        Rnd merge(Rnd other) {
            cnt += other.cnt;
            if (ThreadLocalRandom.current().nextInt(cnt) < other.cnt) {
                val = other.val;
            }
            return this;
        }

        Optional<T> finish() {
            return cnt == 0 ? Optional.empty() : Optional.of(val);
        }

    }

    return Collector.of(Rnd::new, Rnd::add, Rnd::merge, Rnd::finish);
}

With that, you can make one pass to find the largest x and another to pick a random matching tuple:

int largestX = list.stream().mapToInt(t -> t.x).max()
        .getAsInt();  // or throw if list is empty

Tuple<Integer, String> randomLargestTuple = list.stream()
        .filter(t -> largestX == t.x)
        .collect(random())
        .get();
Misha
  • 27,433
  • 6
  • 62
  • 78
2

Might not be an optimal solution

    final Optional<Tuple<Integer, String>> largestTuple = list.stream().max((t1, t2) -> Integer.compare(t1.x, t2.x));
    final List<Tuple<Integer, String>> allMaxElements = list.stream().filter(z -> z.x == largestTuple.get().x).collect(Collectors.toList());
    final Tuple<Integer, String> randomMaxTuple = allMaxElements.get(new SecureRandom().nextInt(allMaxElements.size()));
    System.out.println("Largest tuple is: " + randomMaxTuple.x + " value is: " + randomMaxTuple.y);

On the equals and hashCode methods you can utilize guava

@Override
public int hashCode()
{
    return com.google.common.base.Objects.hashCode(x, y);
}

@Override
public boolean equals(final Object object)
{
    if (object instanceof Tuple) {
        final Tuple<?, ?> that = (Tuple<?, ?>) object;
        return com.google.common.base.Objects.equal(this.x, that.x) && com.google.common.base.Objects.equal(this.y, that.y);
    }
    return false;
}   
SDS
  • 828
  • 3
  • 17
  • 35
  • 3
    fyi java 8 has made those guava methods obsolete: `java.util.Objects.hash()` and `java.util.Objects.equals()` do exactly the same thing. – Bohemian Mar 14 '16 at 20:57
  • good to know, I tried googling and couldn't find a link that would describe this in detail, can you share any? – SDS Mar 14 '16 at 21:08
  • 1
    Always try [the official documentation](https://docs.oracle.com/javase/8/docs/api/) first. [`Objects.equals`](https://docs.oracle.com/javase/8/docs/api/java/util/Objects.html#equals-java.lang.Object-java.lang.Object-) • [`Objects.hash`](https://docs.oracle.com/javase/8/docs/api/java/util/Objects.html#hash-java.lang.Object...-) – Holger Mar 15 '16 at 08:57
1

You could collect your tuples to a TreeMap, with the key being each possible x value from the tuples and the values being a list of all tuples that have that x value:

TreeMap<Integer, List<Tuple<Integer, String>>> map = list.stream()
    .collect(Collectors.groupingBy(
        t -> t.x, 
        TreeMap::new, 
        Collectors.toList()));

Then, you could grab the entry with the maximum key by using the TreeMap.lastEntry() method:

List<Tuple<Integer, String>> largestTuples = map.lastEntry().getValue();

Then, simply pick one element randomly from the largestTuples list.

fps
  • 33,623
  • 8
  • 55
  • 110
0

@Misha's custom Collector can also be implemented with an anonymous type, instead of a local class:

public static <T> Collector<T, ?, Optional<T>> random() {
  return Collector.of(
    () -> new Object() {
      T val;
      int cnt;
    },
    (this_, t) -> {
      this_.cnt++;
      if (ThreadLocalRandom.current().nextInt(this_.cnt) == 0) {
        this_.val = t;
      }
    },
    (this_, other) -> {
      this_.cnt += other.cnt;
      if (ThreadLocalRandom.current().nextInt(this_.cnt) < other.cnt) {
        this_.val = other.val;
      }
      return this_;
    },
    this_ -> this_.cnt == 0
      ? Optional.empty()
      : Optional.of(this_.val)
  );
}
srborlongan
  • 4,460
  • 4
  • 26
  • 33