13

Is there a way to find the most common String in an ArrayList?

ArrayList<String> list = new ArrayList<>();
list.add("test");
list.add("test");
list.add("hello");
list.add("test");

Should find the word "test" from this list ["test","test","hello","test"]

Artifex404
  • 106
  • 7
EspenG
  • 537
  • 3
  • 10
  • 20

12 Answers12

16

Don't reinvent the wheel and use the frequency method of the Collections class:

public static int frequency(Collection<?> c, Object o)

Returns the number of elements in the specified collection equal to the specified object. More formally, returns the number of elements e in the collection such that (o == null ? e == null : o.equals(e)).

If you need to count the occurrences for all elements, use a Map and loop cleverly :) Or put your list in a Set and loop on each element of the set with the frequency method above. HTH

EDIT / Java 8: If you fancy a more functional, Java 8 one-liner solution with lambdas, try:

Map<String, Long> occurrences = 
  list.stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()));
VH-NZZ
  • 5,248
  • 4
  • 31
  • 47
  • 1
    Instead of `w->w` use `Function.identity()` – Cemo Jul 27 '16 at 07:39
  • this doesn't return the frequency of each word within each String. It returns the most reoccurring String.. How can i adapt this to return the most frequent words that appear in an ArrayList of Strings – Metin Dagcilar May 04 '17 at 12:02
  • @Cemo I personally find `w -> w` less verbose in a language where more terseness wouldn't hurt much. Any strong argument in favor of `Function.identity()`? – VH-NZZ Oct 09 '17 at 08:21
11

In statistics, this is called the "mode". A vanilla Java 8 solution looks like this:

Stream.of("test","test","hello","test")
      .collect(Collectors.groupingBy(s -> s, Collectors.counting()))
      .entrySet()
      .stream()
      .max(Comparator.comparing(Entry::getValue))
      .ifPresent(System.out::println);

Which yields:

test=3

jOOλ is a library that supports mode() on streams. The following program:

System.out.println(
    Seq.of("test","test","hello","test")
       .mode()
);

Yields:

Optional[test]

(disclaimer: I work for the company behind jOOλ)

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
9

As per question, Specifically just to get word, not the number of times (i.e. value of key).

String mostRepeatedWord 
    = list.stream()
          .collect(Collectors.groupingBy(w -> w, Collectors.counting()))
          .entrySet()
          .stream()
          .max(Comparator.comparing(Entry::getValue))
          .get()
          .getKey();
ChandraBhan Singh
  • 2,841
  • 1
  • 22
  • 27
6

You can make a HashMap<String,Integer>. If the String already appears in the map, increment its key by one, otherwise, add it to the map.

For example:

put("someValue", 1);

Then, assume it's "someValue" again, you can do:

put("someValue", get("someValue") + 1);

Since the key of "someValue" is 1, now when you put it, the key will be 2.

After that you can easily go through the map and extract the key that has the highest value.

I didn't write a full solution, try to construct one, if you have problems post it in another question. Best practice is to learn by yourself.

Maroun
  • 94,125
  • 30
  • 188
  • 241
2

I think the best way to do it is using maps containing counts.

Map<String, Integer> stringsCount = new HashMap<>();

And iterate over your array filling this map:

for(String s: list)
{
  Integer c = stringsCount.get(s);
  if(c == null) c = new Integer(0);
  c++;
  stringsCount.put(s,c);
}

Finally, you can get the most repeated element iterating over the map:

Map.Entry<String,Integer> mostRepeated = null;
for(Map.Entry<String, Integer> e: stringsCount.entrySet())
{
    if(mostRepeated == null || mostRepeated.getValue()<e.getValue())
        mostRepeated = e;
}

And show the most common string:

if(mostRepeated != null)
        System.out.println("Most common string: " + mostRepeated.getKey());
0

You could use a HashMap<String,Integer>. Looping through the array, you can check for each String if it is not already a Key of your HashMap, add it and set the value to 1, if it is, increase its value by 1.

Then you have a HashMap with all unique Strings and an associated number stating their amount in the array.

LionC
  • 3,106
  • 1
  • 22
  • 31
0

If somebody need to find most popular from usual String[] array (using Lists):

public String findPopular (String[] array) {
    List<String> list = Arrays.asList(array);
    Map<String, Integer> stringsCount = new HashMap<String, Integer>();
    for(String string: list)
    {
        if (string.length() > 0) {
            string = string.toLowerCase();
            Integer count = stringsCount.get(string);
            if(count == null) count = new Integer(0);
            count++;
            stringsCount.put(string,count);
        }
    }
    Map.Entry<String,Integer> mostRepeated = null;
    for(Map.Entry<String, Integer> e: stringsCount.entrySet())
    {
        if(mostRepeated == null || mostRepeated.getValue()<e.getValue())
            mostRepeated = e;
    }
    try {
        return mostRepeated.getKey();
    } catch (NullPointerException e) {
        System.out.println("Cannot find most popular value at the List. Maybe all strings are empty");
        return "";
    }

}
  • case non-sensitive
0

i know this takes more time to implement but you can use heap data structure by storing in the nodes the count and the string information

fatih tekin
  • 959
  • 10
  • 21
0

You can use Guava's Multiset:

ArrayList<String> names = ...

// count names 
HashMultiset<String> namesCounts = HashMultiset.create(names);
Set<Multiset.Entry<String>> namesAndCounts = namesCounts.entrySet();

// find one most common
Multiset.Entry<String> maxNameByCount = Collections.max(namesAndCounts, Comparator.comparing(Multiset.Entry::getCount));

// pick all with the same number of occurrences
List<String> mostCommonNames = new ArrayList<>();
for (Multiset.Entry<String> nameAndCount : namesAndCounts) {
    if (nameAndCount.getCount() == maxNameByCount.getCount()) {
        mostCommonNames.add(nameAndCount.getElement());
    }
}
manicka
  • 204
  • 2
  • 8
0
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class StringChecker {

public static void main(String[] args) {
ArrayList<String> string;
string = new ArrayList<>(Arrays.asList("Mah", "Bob", "mah", "bat", "MAh", "BOb"));
Map<String, Integer> wordMap = new HashMap<String, Integer>();

for (String st : string) {
    String input = st.toUpperCase();
    if (wordMap.get(input) != null) {
        Integer count = wordMap.get(input) + 1;
        wordMap.put(input, count);
    } else {
        wordMap.put(input, 1);
    }
}
System.out.println(wordMap);
Object maxEntry = Collections.max(wordMap.entrySet(), Map.Entry.comparingByValue()).getKey();
System.out.println("maxEntry = " + maxEntry);

}

Mahesh Kshirsagar
  • 390
  • 1
  • 3
  • 12
0

With this method, if there is more than one most common elements in your ArrayList, you get back all of them by adding them to a new ArrayList.

public static void main(String[] args) {

 List <String> words = new ArrayList<>() ; 

words.add("cat") ; 
words.add("dog") ; 
words.add("egg") ; 
words.add("chair") ; 
words.add("chair") ; 
words.add("chair") ; 
words.add("dog") ; 
words.add("dog") ;  

Map<String,Integer> count = new HashMap<>() ; 

    for (String word : words) {  /* Counts the quantity of each 
                                      element */
        if (! count.containsKey(word)) {             
            count.put(word, 1 ) ; 
        }

        else {
            int value = count.get(word) ; 
            value++ ; 

            count.put(word, value) ;
        }       
    }

    List <String> mostCommons = new ArrayList<>() ; /* Max elements  */

    for ( Map.Entry<String,Integer> e : count.entrySet() ) {

        if (e.getValue() == Collections.max(count.values() )){
                            /* The max value of count  */

            mostCommons.add(e.getKey()) ;
        }   
    }

    System.out.println(mostCommons);

 }

}
jaraipali
  • 41
  • 2
  • 8
-1

There are a lot of answers suggesting HashMaps. I really don't like them, because you have to iterate through them once again anyway. Rather, I would sort the List

Collections.sort(list);

and then loop through it. Something similar to

String prev = null, mostCommon=null;
int num = 0, max = 0;
for (String str:list) {
  if (str.equals(prev)) {
    num++;
  } else {
    if (num>max) {
      max = num;
      mostCommon = str;
    }
    num = 1;
    prev = str;
  }
}

should do it.

Bex
  • 2,905
  • 2
  • 33
  • 36
  • *"I really don't like them"* ... Iterating through a set of values twice is still of `O(N)` complexity. Your algorithm uses sorting, which is `O(N log N)` complexity, which is certainly worse. – Lukas Eder Mar 20 '16 at 10:25
  • I agree with Bex. Sort is actually not a bad idea in my opinion. It's easy to implement and has a O(nlogn) time complexity and O(1) space. While use hashmap and priorityqueue has O(nlogn) time complexity and O(n) space. Since insert into a pq is O(logn), iterate the entries of Hashmap and insert them into pq is O(nlogn) in worst cases. Correct me if I'm wrong. http://docs.oracle.com/javase/7/docs/api/java/util/PriorityQueue.html – Poul Jan 28 '17 at 16:41