24

Does Java or Guava have something that will return most common element in a list?

List<BigDecimal> listOfNumbers=  new ArrayList<BigDecimal>(); 

[1,3,4,3,4,3,2,3,3,3,3,3]

return 3

Doc Holiday
  • 9,928
  • 32
  • 98
  • 151

10 Answers10

51

In statistics, this is called the "mode". A vanilla Java 8 solution looks like this:

Stream.of(1, 3, 4, 3, 4, 3, 2, 3, 3, 3, 3, 3)
      .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
      .entrySet()
      .stream()
      .max(Map.Entry.comparingByValue())
      .ifPresent(System.out::println);

Which yields:

3=8

jOOλ is a library that supports mode() on streams. The following program:

System.out.println(
    Seq.of(1, 3, 4, 3, 4, 3, 2, 3, 3, 3, 3, 3)
       .mode()
);

Yields:

Optional[3]

For simplicity's sake, I omitted using BigDecimal. The solution would be the same, though.

(disclaimer: I work for the company behind jOOλ)

Community
  • 1
  • 1
Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
29

This is fairly easy to implement yourself:

public static <T> T mostCommon(List<T> list) {
    Map<T, Integer> map = new HashMap<>();

    for (T t : list) {
        Integer val = map.get(t);
        map.put(t, val == null ? 1 : val + 1);
    }

    Entry<T, Integer> max = null;

    for (Entry<T, Integer> e : map.entrySet()) {
        if (max == null || e.getValue() > max.getValue())
            max = e;
    }

    return max.getKey();
}

List<Integer> list = Arrays.asList(1,3,4,3,4,3,2,3,3,3,3,3);
System.out.println(mostCommon(list));
3

If you want to handle cases where there's more then one most frequent element, you can scan the list once to determine how many times the most frequent element(s) occur, and then scan the list again, put those elements in a set and return that.

arshajii
  • 127,459
  • 24
  • 238
  • 287
  • If the input list is empty the the return statement will cause a NullPointerException. Even if you don't expect to ever be empty, something like this would be safer: `return max == null ? null : max.getKey();` – k2col Sep 30 '15 at 23:49
  • This ignores the cases where a list might contain n different elements. In such case there is no most common element. if(map.size() == list.size()){return null;} – gidim Feb 17 '16 at 01:30
15

Probably the simplest solution with Guava looks like

Multiset<BigDecimal> multiset = HashMultiset.create(listOfNumbers);
BigDecimal maxElement = null;
int maxCount = 0;
for (Multiset.Entry<BigDecimal> entry : multiset.entrySet()) {
  if (entry.getCount() > maxCount) {
    maxElement = entry.getElement();
    maxCount = entry.getCount();
  }
}

That's a complete solution, and shorter than the other alternatives I see discussed.

Louis Wasserman
  • 191,574
  • 25
  • 345
  • 413
13

Here is a pure Java 8 solution (note: do not use this one, see below):

List<Integer> theList = Arrays.asList(1, 3, 4, 3, 4, 3, 2, 3, 3, 3, 3, 3);
Integer maxOccurredElement = theList.stream()
        .reduce(BinaryOperator.maxBy((o1, o2) -> Collections.frequency(theList, o1) -
                        Collections.frequency(theList, o2))).orElse(null);
System.out.println(maxOccurredElement);

Another solution, by collecting the elements to a map by their frequency, then finding the entry with max value and returning its key (basically the same solution on arshajii's answer, written using Java 8):

Integer maxVal = theList.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream().max((o1, o2) -> o1.getValue().compareTo(o2.getValue()))
                .map(Map.Entry::getKey).orElse(null);

Update: If the most frequent elements are more than one, and you want to get all of them in a collection, I propose two methods:

Method A: After collecting the original collection to a map with keys as elements and values as their number of occurrences, getting the entry with the maximum value and filtering the map entries with value equal to this max value (if) we found. Something like this:

Map<Integer, Long> elementCountMap = theList.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
List<Integer> result = elementCountMap.values().stream()
        .max(Long::compareTo).map(maxValue -> elementCountMap.entrySet().stream()
            .filter(entry -> maxValue.equals(entry.getValue())).map(Map.Entry::getKey).collect(Collectors.toList()))
        .orElse(Collections.emptyList());

Method B: After collecting the original collection to a map with keys as elements and values as their number of occurrences, transforming this map into a new map with keys as number of occurences, values as a list of elements with this number of occurences. And then finding the max element of this map with a custom comparator which compares the keys, and getting the value of this entry. Like this:

List<Integer> result = theList.stream().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
    .entrySet().stream()
    .collect(Collectors.groupingBy(Map.Entry::getValue, Collectors.mapping(Map.Entry::getKey, Collectors.toList())))
    .entrySet().stream().max((o1, o2) -> o1.getKey().compareTo(o2.getKey())).map(Map.Entry::getValue)
    .orElse(Collections.emptyList());
Community
  • 1
  • 1
Utku Özdemir
  • 7,390
  • 2
  • 52
  • 49
  • 2
    This algorithm has `O(N^2)` complexity for something that's feasible in `O(N)`... – Lukas Eder Mar 19 '16 at 18:41
  • 1
    @LukasEder true, calling `Collections.frequency()` over and over is not feasible, I did not consider complexity while writing the answer. Edited and added another solution though, which has `O(N)` complexity. – Utku Özdemir Mar 19 '16 at 19:02
  • Very nice alternative solution! – Lukas Eder Mar 19 '16 at 20:06
  • 2
    I like the second solution.. but does it handle cases when more than one element has the same frequency? I can't tell as I'm new to Java. if not, can it be modified to just return one of several element that have the same number of occurances? – Kai Mar 28 '16 at 14:22
  • @Kai Unfortunately not, it cannot handle the case with more than one elements with same frequency. See here: https://stackoverflow.com/questions/29334404/how-to-force-max-to-return-all-maximum-values-in-a-java-stream But please see my update on the answer, I added solutions to this case as well. – Utku Özdemir Mar 28 '16 at 15:09
  • @UtkuÖzdemir: very nice. Thank you! one question: Method B can return elements that have the same frequency. Why do you still need the last part => orElse(empty list)? is it to guard against cases were the original list is empty or contains one element only? – Kai Mar 28 '16 at 17:41
  • @Kai method `max(comparator)` of a stream (or an optional) returns a `java.util.Optional`, because if no elements pass from the stream, a "maximum element" simply does not exist. So it's only for the case when the input collection is empty. And I prefer to fall to an empty list when there's no max element since it's expected from the operation to return a list, and returning an empty collection is always a bad idea, according to the best practices. – Utku Özdemir Mar 28 '16 at 17:48
9

Guava provides a method that will help, though it's less efficient than Louis's solution.

BigDecimal mostCommon = 
    Multisets.copyHighestCountFirst(ImmutableMultiset.copyOf(listOfNumbers))
        .iterator().next();
Jared Levy
  • 1,986
  • 13
  • 12
3

The classic way to do this is to sort the list and then work through them one by one:

public static BigInteger findMostCommon(List<BigInteger> list) {
    Collections.sort(list);
    BigInteger mostCommon = null;
    BigInteger last = null;
    int mostCount = 0;
    int lastCount = 0;
    for (BigInteger x : list) {
        if (x.equals(last)) {
            lastCount++;
        } else if (lastCount > mostCount) {
            mostCount = lastCount;
            mostCommon = last;
        }
        last = x;
    }
    return mostCommon;
}

This is a bit more space efficient than using a hash to tally counts since it sorts the array in place. You could toss this into a generics class and replace BigInteger with T, or just use Object in place of BigInteger.

Alcanzar
  • 16,985
  • 6
  • 42
  • 59
1

Here is an extension of Louis' answer that support the case where there is multiple elements with same max occurrence count:

private <T> List<T> getMostFrequentElements(List<T> list) {
    Multiset<T> multiset = HashMultiset.create(list);

    List<T> mostFrequents = new ArrayList<>();
    int maxCount = 0;

    for (Multiset.Entry<T> entry : multiset.entrySet()) {
        if (entry.getCount() > maxCount) {
            maxCount = entry.getCount();
            mostFrequents.clear();
            mostFrequents.add(entry.getElement());
        } else if (entry.getCount() == maxCount) {
            mostFrequents.add(entry.getElement());
        }
    }

    return mostFrequents;
}
a.b.d
  • 2,190
  • 3
  • 26
  • 26
1

We can do in only one iteration with ease:

public static Integer mostFrequent(List<Integer> list) {

    if (list == null || list.isEmpty())
        return null;

    Map<Integer, Integer> counterMap = new HashMap<Integer, Integer>();
    Integer maxValue = 0;
    Integer mostFrequentValue = null;

    for(Integer valueAsKey : list) {
        Integer counter = counterMap.get(valueAsKey);
        counterMap.put(valueAsKey, counter == null ? 1 : counter + 1);
        counter = counterMap.get(valueAsKey);
        if (counter > maxValue) {
            maxValue = counter;
            mostFrequentValue = valueAsKey;
        }
    }
    return mostFrequentValue;
}
1

Find most frequent item in collection:

private <V> V findMostFrequentItem(final Collection<V> items)
{
  return items.stream()
      .filter(Objects::nonNull)
      .collect(Collectors.groupingBy(Functions.identity(), Collectors.counting())).entrySet().stream()
      .max(Comparator.comparing(Entry::getValue))
      .map(Entry::getKey)
      .orElse(null);
}
nejckorasa
  • 589
  • 3
  • 8
  • 17
0

If you are willing to use Google Guava, you can use its MultiSet classes:

MultiSet<BigNumber> numbers = HashMultiSet.create();
numberSet.addAll(list);
Set<MultiSet.Entry<BigNumber>> pairs = numbers.emtrySet();
Set<MultiSet.Entry<BigNumber>> copies = new HashSet<MultiSet.Entry<BigNumber>>(pairs);

Now, sort copies by its values descending.

Eric Jablow
  • 7,874
  • 2
  • 22
  • 29