Does Java or Guava have something that will return most common element in a list?
List<BigDecimal> listOfNumbers= new ArrayList<BigDecimal>();
[1,3,4,3,4,3,2,3,3,3,3,3]
return 3
Does Java or Guava have something that will return most common element in a list?
List<BigDecimal> listOfNumbers= new ArrayList<BigDecimal>();
[1,3,4,3,4,3,2,3,3,3,3,3]
return 3
In statistics, this is called the "mode". A vanilla Java 8 solution looks like this:
Stream.of(1, 3, 4, 3, 4, 3, 2, 3, 3, 3, 3, 3)
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet()
.stream()
.max(Map.Entry.comparingByValue())
.ifPresent(System.out::println);
Which yields:
3=8
jOOλ is a library that supports mode()
on streams. The following program:
System.out.println(
Seq.of(1, 3, 4, 3, 4, 3, 2, 3, 3, 3, 3, 3)
.mode()
);
Yields:
Optional[3]
For simplicity's sake, I omitted using BigDecimal
. The solution would be the same, though.
(disclaimer: I work for the company behind jOOλ)
This is fairly easy to implement yourself:
public static <T> T mostCommon(List<T> list) {
Map<T, Integer> map = new HashMap<>();
for (T t : list) {
Integer val = map.get(t);
map.put(t, val == null ? 1 : val + 1);
}
Entry<T, Integer> max = null;
for (Entry<T, Integer> e : map.entrySet()) {
if (max == null || e.getValue() > max.getValue())
max = e;
}
return max.getKey();
}
List<Integer> list = Arrays.asList(1,3,4,3,4,3,2,3,3,3,3,3);
System.out.println(mostCommon(list));
3
If you want to handle cases where there's more then one most frequent element, you can scan the list once to determine how many times the most frequent element(s) occur, and then scan the list again, put those elements in a set and return that.
Probably the simplest solution with Guava looks like
Multiset<BigDecimal> multiset = HashMultiset.create(listOfNumbers);
BigDecimal maxElement = null;
int maxCount = 0;
for (Multiset.Entry<BigDecimal> entry : multiset.entrySet()) {
if (entry.getCount() > maxCount) {
maxElement = entry.getElement();
maxCount = entry.getCount();
}
}
That's a complete solution, and shorter than the other alternatives I see discussed.
Here is a pure Java 8 solution (note: do not use this one, see below):
List<Integer> theList = Arrays.asList(1, 3, 4, 3, 4, 3, 2, 3, 3, 3, 3, 3);
Integer maxOccurredElement = theList.stream()
.reduce(BinaryOperator.maxBy((o1, o2) -> Collections.frequency(theList, o1) -
Collections.frequency(theList, o2))).orElse(null);
System.out.println(maxOccurredElement);
Another solution, by collecting the elements to a map by their frequency, then finding the entry with max value and returning its key (basically the same solution on arshajii's answer, written using Java 8):
Integer maxVal = theList.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream().max((o1, o2) -> o1.getValue().compareTo(o2.getValue()))
.map(Map.Entry::getKey).orElse(null);
Update: If the most frequent elements are more than one, and you want to get all of them in a collection, I propose two methods:
Method A: After collecting the original collection to a map with keys as elements and values as their number of occurrences, getting the entry with the maximum value and filtering the map entries with value equal to this max value (if) we found. Something like this:
Map<Integer, Long> elementCountMap = theList.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
List<Integer> result = elementCountMap.values().stream()
.max(Long::compareTo).map(maxValue -> elementCountMap.entrySet().stream()
.filter(entry -> maxValue.equals(entry.getValue())).map(Map.Entry::getKey).collect(Collectors.toList()))
.orElse(Collections.emptyList());
Method B: After collecting the original collection to a map with keys as elements and values as their number of occurrences, transforming this map into a new map with keys as number of occurences, values as a list of elements with this number of occurences. And then finding the max element of this map with a custom comparator which compares the keys, and getting the value of this entry. Like this:
List<Integer> result = theList.stream().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.collect(Collectors.groupingBy(Map.Entry::getValue, Collectors.mapping(Map.Entry::getKey, Collectors.toList())))
.entrySet().stream().max((o1, o2) -> o1.getKey().compareTo(o2.getKey())).map(Map.Entry::getValue)
.orElse(Collections.emptyList());
Guava provides a method that will help, though it's less efficient than Louis's solution.
BigDecimal mostCommon =
Multisets.copyHighestCountFirst(ImmutableMultiset.copyOf(listOfNumbers))
.iterator().next();
The classic way to do this is to sort the list and then work through them one by one:
public static BigInteger findMostCommon(List<BigInteger> list) {
Collections.sort(list);
BigInteger mostCommon = null;
BigInteger last = null;
int mostCount = 0;
int lastCount = 0;
for (BigInteger x : list) {
if (x.equals(last)) {
lastCount++;
} else if (lastCount > mostCount) {
mostCount = lastCount;
mostCommon = last;
}
last = x;
}
return mostCommon;
}
This is a bit more space efficient than using a hash to tally counts since it sorts the array in place. You could toss this into a generics class and replace BigInteger with T, or just use Object in place of BigInteger.
Here is an extension of Louis' answer that support the case where there is multiple elements with same max occurrence count:
private <T> List<T> getMostFrequentElements(List<T> list) {
Multiset<T> multiset = HashMultiset.create(list);
List<T> mostFrequents = new ArrayList<>();
int maxCount = 0;
for (Multiset.Entry<T> entry : multiset.entrySet()) {
if (entry.getCount() > maxCount) {
maxCount = entry.getCount();
mostFrequents.clear();
mostFrequents.add(entry.getElement());
} else if (entry.getCount() == maxCount) {
mostFrequents.add(entry.getElement());
}
}
return mostFrequents;
}
We can do in only one iteration with ease:
public static Integer mostFrequent(List<Integer> list) {
if (list == null || list.isEmpty())
return null;
Map<Integer, Integer> counterMap = new HashMap<Integer, Integer>();
Integer maxValue = 0;
Integer mostFrequentValue = null;
for(Integer valueAsKey : list) {
Integer counter = counterMap.get(valueAsKey);
counterMap.put(valueAsKey, counter == null ? 1 : counter + 1);
counter = counterMap.get(valueAsKey);
if (counter > maxValue) {
maxValue = counter;
mostFrequentValue = valueAsKey;
}
}
return mostFrequentValue;
}
Find most frequent item in collection:
private <V> V findMostFrequentItem(final Collection<V> items)
{
return items.stream()
.filter(Objects::nonNull)
.collect(Collectors.groupingBy(Functions.identity(), Collectors.counting())).entrySet().stream()
.max(Comparator.comparing(Entry::getValue))
.map(Entry::getKey)
.orElse(null);
}
If you are willing to use Google Guava, you can use its MultiSet
classes:
MultiSet<BigNumber> numbers = HashMultiSet.create();
numberSet.addAll(list);
Set<MultiSet.Entry<BigNumber>> pairs = numbers.emtrySet();
Set<MultiSet.Entry<BigNumber>> copies = new HashSet<MultiSet.Entry<BigNumber>>(pairs);
Now, sort copies
by its values descending.