32

Given a collection of objects with possible duplicates, I'd like end up with a count of occurrences per object. I do it by initializing an empty Map, then iterating through the Collection and mapping the object to its count (incrementing the count each time the map already contains the object).

public Map<Object, Integer> countOccurrences(Collection<Object> list) {
    Map<Object, Integer> occurrenceMap = new HashMap<Object, Integer>();
    for (Object obj : list) {
        Integer numOccurrence = occurrenceMap.get(obj);
        if (numOccurrence == null) {
            //first count
            occurrenceMap.put(obj, 1);
        } else {
            occurrenceMap.put(obj, numOccurrence++);
        }
    }
    return occurrenceMap;
}

This looks too verbose for a simple logic of counting occurrences. Is there a more elegant/shorter way of doing this? I'm open to a completely different algorithm or a java language specific feature that allows for a shorter code.

Community
  • 1
  • 1
fo_x86
  • 2,583
  • 1
  • 30
  • 41
  • 6
    Counting occurrences is not that simple after all, your code seems to be the best you can do. – Henry Jan 10 '13 at 14:34
  • 1
    In order to have a complete list of occurrences of all elements, you will have to traverse the whole collection and I think your implementation is a decent one. – hovanessyan Jan 10 '13 at 14:35
  • 1
    What makes you think this is verbose? Looks pretty clear to me. It's Just what Java looks like. – Joe Jan 10 '13 at 14:37
  • If the objects you are using implement Comparable interface, you can Collections.Sort the collection and make the count in a single iteration. Using map to store the data is ok IMO. – Dariusz Jan 10 '13 at 14:38
  • 2
    @DariuszWawer it is one iteration anyway, sorting won;t make a difference here. – NimChimpsky Jan 10 '13 at 14:41
  • @NimChimpsky my bad; I meant less overhead while searching and putting the object to the list every time. If you sort the collection you only require one put per one distinct object. – Dariusz Jan 10 '13 at 14:47
  • @Joe No good reason really. A programmer's hunch? Seemed to me like a simple piece of logic, but I couldn't figure out a better way to implement it other than the one above. – fo_x86 Jan 10 '13 at 14:47
  • @DariuszWawer that makes even less sense. If you have an answer that is better with a sort post it (I can't see how you could). – NimChimpsky Jan 10 '13 at 14:48
  • @NimChimpsky posted it. Will you be so nice to admit that I am not wrong and what I wrote makes (at least some) sense? I feel that what you wrote is a bit too strong and even mildly insulting. – Dariusz Jan 10 '13 at 15:28
  • 1
    There is mistake in this code. It should be `++numOccurrence` in `else` statement otherwise we are overwriting the number of occurrences with 1. – Jernej Jerin Jan 17 '14 at 17:27

12 Answers12

21

Now let's try some Java 8 code:

static public Map<String, Integer> toMap(List<String> lst) {
    return lst.stream()
            .collect(HashMap<String, Integer>::new,
                    (map, str) -> {
                        if (!map.containsKey(str)) {
                            map.put(str, 1);
                        } else {
                            map.put(str, map.get(str) + 1);
                        }
                    },
                    HashMap<String, Integer>::putAll);
}
static public Map<String, Integer> toMap(List<String> lst) {
    return lst.stream().collect(Collectors.groupingBy(s -> s,
                                  Collectors.counting()));
}

I think this code is more elegant.

Neuron
  • 5,141
  • 5
  • 38
  • 59
xingfe123
  • 211
  • 2
  • 3
  • Maybe I was searching for the wrong terms, but this Java 8 example was REALLY hard to find elsewhere, thanks for posting! – Jason Sep 11 '14 at 06:34
  • Just curious as to why you provide the verbose method and non-verbose? Note also that static imports of `Collectors.*` also makes this less verbose and in many situations nulls the need for the static method as it could often be tailed off an existing stream. – Brett Ryan Jan 20 '15 at 04:29
  • 3
    Even more elegant with the accumulator like this: `(map, str) -> map.merge(str, 1, (old, one) -> old+one)` – Jan Martiška May 12 '15 at 12:21
  • 1
    Collectors.counting produces a Long, so the return type should be Map – Witbrock Oct 01 '15 at 22:17
  • 1
    Adding to @Witbrock's comment: If you want a ``Map``, you could use ``Collectors.reducing(0, e -> 1, Integer::sum)`` instead of ``Collectors.counting()``. – Yurim Dec 01 '16 at 16:50
20

Check out Guava's Multiset. Pretty much exactly what you're looking for.

Unfortunately it doesn't have an addAll(Iterable iterable) function, but a simple loop over your collection calling add(E e) is easy enough.

EDIT

My mistake, it does indeed have an addAll method - as it must, since it implements Collection.

Deekshith
  • 1,544
  • 2
  • 14
  • 15
Tom McIntyre
  • 3,620
  • 2
  • 23
  • 32
14

I know this is an old question, but I've found a more elegant way in Java 8 for counting these votes, hope you like it.

Map<String, Long> map = a.getSomeStringList()
            .stream()
            .collect(Collectors.groupingBy(
                    Function.identity(),
                    Collectors.counting())
            );

Any error, just comment.

Mateus Luiz
  • 756
  • 7
  • 20
7

Check this article How to count the number of occurrences of an element in a List. For counting occurences you can use int occurrences = Collections.frequency(list, obj);.

Community
  • 1
  • 1
Vic
  • 1,778
  • 3
  • 19
  • 37
  • 5
    `Collections.frequency` is good to count some specific objects. But what OP needs is to count all objects' frequency, which makes this method very inefficient. – Hui Zheng Jan 10 '13 at 14:42
  • Ah, I wasn't aware of this method. In my case however, I'd like to count occurrences of all unique objects in the collection in a single pass. – fo_x86 Jan 10 '13 at 14:42
  • I agree, it's inefficient. It was just a suggestion of a method from the native Java library for this task. I don't know any others, that solve exactly this problem, use external ones then. – Vic Jan 10 '13 at 14:50
3

There is a nice article on counters in java here: http://www.programcreek.com/2013/10/efficient-counter-in-java/ it focuses more on efficiency than elegance though.

The winner was this:

HashMap<String, int[]> intCounter = new HashMap<String, int[]>();
for (int i = 0; i < NUM_ITERATIONS; i++) {
    for (String a : sArr) {
        int[] valueWrapper = intCounter.get(a);

        if (valueWrapper == null) {
            intCounter.put(a, new int[] { 1 });
        } else {
            valueWrapper[0]++;
        }
    }
}
Thomas Ahle
  • 30,774
  • 21
  • 92
  • 114
2

It's not that verbose for Java. You can use TObjectIntHashMap:

public <T> TObjectIntHashMap<T> countOccurrences(Iterable<T> list) {
    TObjectIntHashMap<T> counts = new TObjectIntHashMap<T>();
    for (T obj : list) counts.adjustOrPut(obj, 1, 1);
    return counts;
}
Community
  • 1
  • 1
Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
1

Please refer the below solution to count every element in collections.

For Integer Value:

List<Integer> list = new ArrayList<Integer>();
list.add(3);
list.add(2);
list.add(5);
list.add(1);
list.add(8);
list.add(0);
list.add(2);
list.add(32);
list.add(72);
list.add(0);
list.add(13);
list.add(32);
list.add(73);
list.add(22);
list.add(73);
list.add(73);
list.add(21);
list.add(73);

HashSet<Integer> set = new HashSet<>();

for (int j = 0; j < list.size(); j++) {
    set.add(list.get(j));
}

Iterator<Integer> itr = set.iterator();
while (itr.hasNext()) {
    int a = itr.next();
    System.out.println(a + " : " + Collections.frequency(list, a));
}

Output:

0 : 2
32 : 2
1 : 1
2 : 2
3 : 1
5 : 1
21 : 1
22 : 1
8 : 1
72 : 1
73 : 4
13 : 1

For String Value:

List<String> stringList = new ArrayList<>();
stringList.add("ABC");
stringList.add("GHI");
stringList.add("ABC");
stringList.add("DEF");
stringList.add("ABC");
stringList.add("GHI");

HashSet<String> setString = new HashSet<>();

for (int j = 0; j < stringList.size(); j++) {
    setString.add(stringList.get(j));
}

Iterator<String> itrString = setString.iterator();
while (itrString.hasNext()) {
    String a = itrString.next();
    System.out.println(a + " :::  " + Collections.frequency(stringList, a));
}

Output:

ABC :::  3
DEF :::  1
GHI :::  2
Community
  • 1
  • 1
Vinod
  • 81
  • 3
1

I'm surprised no one offered this simple and readable solution. You could just use Map#getOrDefault().

 public Map<Object, Integer> countOccurrences(Collection<Object> list){
      Map<Object, Integer> occurrenceMap = new HashMap<Object, Integer>();
      for(Object obj: list){
          occurrenceMap.put(obj, occurrenceMap.getOrDefault(obj, 0) + 1);
      }
      return occurrenceMap;
 }

It solves exactly the problem you have and removes the clunky if..else.

Marko
  • 733
  • 8
  • 21
0

As a response to discussion with @NimChimpsky here is an alternative and faster - which I am attempting to prove - method of counting which uses a sorted collection. Depending on the number of elements and the "sortFactor" (see code) the speed difference varies, but for large amounts of objects in Run environment (not debug) my method has a 20-30% speed increase in relation to default method. Here is a simple test class for both methods.

public class EltCountTest {

    final static int N_ELTS = 10000;

    static final class SampleCountedObject implements Comparable<SampleCountedObject>
    {
        int value = 0;

        public SampleCountedObject(int value) {
            super();
            this.value = value;
        }

        @Override
        public int compareTo(SampleCountedObject o) {
            return (value == o.value)? 0:(value > o.value)?1:-1; // just *a* sort
        }

        @Override
        public int hashCode() {
            return value;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof SampleCountedObject) {
                return value == ((SampleCountedObject)obj).value;
            }
            return false;
        }

        @Override
        public String toString() {
            return "SampleCountedObject("+value+")";
        }
    }

    /**
     * * @param args
     */
    public static void main(String[] args) {
        int tries = 10000;
        int sortFactor = 10;
        Map<SampleCountedObject, Integer> map1 = null;
        Map<SampleCountedObject, Integer> map2 = null;

        ArrayList<SampleCountedObject> objList = new ArrayList<EltCountTest.SampleCountedObject>(N_ELTS);

        for (int i =0, max=N_ELTS/sortFactor; i<max; i++){
            for (int j = 0; j<sortFactor; j++) {
                objList.add(new SampleCountedObject(i));
            }
        }

        long timestart = System.nanoTime();
        for (int a=0; a< tries; a++) {
            map1 = method1(objList);
        }
        System.out.println();
        long timeend1 = System.nanoTime();
        System.out.println();

        for (int a=0; a< tries; a++) {
            map2 = metod2(objList);
        }
        long timeend2 = System.nanoTime();
        System.out.println();


        long t1 = timeend1-timestart;
        long t2 = timeend2-timeend1;
        System.out.println("\n        org count method=["+t1+"]\nsorted collection method=["+t2+"]"+
                 "\ndiff=["+Math.abs(t1-t2)+"] percent=["+(100d*t2/t1)+"]");

        for (SampleCountedObject obj: objList) {
            int val1 = map1.get(obj);
            int val2 = map2.get(obj);
            if (val1 != val2) {
                throw new RuntimeException("val1 != val2 for obj "+obj);
            }
        }
        System.out.println("veryfy OK");

    }

    private static Map<SampleCountedObject, Integer> method1(ArrayList<SampleCountedObject> objList) {
        Map<SampleCountedObject, Integer> occurenceMap = new HashMap<SampleCountedObject, Integer>();

        for(SampleCountedObject obj: objList){
             Integer numOccurrence = occurenceMap.get(obj);
             if(numOccurrence == null){
                 occurenceMap.put(obj, 1);
             } else {
                 occurenceMap.put(obj, ++numOccurrence);
             }
        }
        return occurenceMap;
    }

    private static Map<SampleCountedObject, Integer> metod2(ArrayList<SampleCountedObject> objList) {
        Map<SampleCountedObject, Integer> occurenceMap = new HashMap<SampleCountedObject, Integer>();
        int count = 0;
        Collections.sort(objList);
        SampleCountedObject prevObj = objList.get(0);

        for(SampleCountedObject obj: objList){
            if (!obj.equals(prevObj)) {
                occurenceMap.put(prevObj, count);
                count = 1;
            } else {
                count ++;
            }
            prevObj = obj;
        }
        occurenceMap.put(prevObj, count);
        return occurenceMap;
    }
}

Note that I also verify that the results are the same and I do it after printing the test results.

What I found interesting is that in Debug run my method is fairly slower than the original (10-20%, again - depending on the number of elements in collection).

Dariusz
  • 21,561
  • 9
  • 74
  • 114
  • You are generating an already-sorted list, so you aren't even getting to see the real perfomance hit of doing a sort. For the first method, time complexity is O(n). Yours is O(n) PLUS the sorting method, which will probably be O(n log n), or as bad as O(n^2). Basically all you've done (besides the sort) is replace the hashmap get() with equals(). I really don't think that gives much savings (they are both O(1) ) unless you have data specifically optimized for your method (so probably never in the real world) – tobii Oct 26 '15 at 16:26
0

There is a method in commons-collections: CollectionUtils.getCardinalityMap which is doing exactly this.

panurg
  • 309
  • 4
  • 11
0

You can use a Bag from Eclipse Collections.

Iterable<Object> iterable = Arrays.asList("1", "2", "2", "3", "3", "3");
MutableBag<Object> counts = Bags.mutable.withAll(iterable);

Assertions.assertEquals(1, counts.occurrencesOf("1"));
Assertions.assertEquals(2, counts.occurrencesOf("2"));
Assertions.assertEquals(3, counts.occurrencesOf("3"));
Assertions.assertEquals(0, counts.occurrencesOf("4"));

The withAll method on the MutableBagFactory interface takes an Iterable as a parameter, and returns a MutableBag. The method occurrencesOf on MutableBag returns an int, which is the number of occurrences of an element. Unlike a Map, a Bag will not return null if it does not contain an element. Instead, the occurrencesOf method will return 0.

MutableBag is a Collection, so it has an addAll method that takes a Collection as a parameter.

counts.addAll(Arrays.asList("4", "4", "4", "4"));
Assertions.assertEquals(4, counts.occurrencesOf("4"));

MutableBag also has an addAllIterable method which takes Iterable.

Stream<Object> stream = Stream.of("1", "2", "3", "4");
counts.addAllIterable(stream::iterator);

Assertions.assertEquals(2, counts.occurrencesOf("1"));
Assertions.assertEquals(3, counts.occurrencesOf("2"));
Assertions.assertEquals(4, counts.occurrencesOf("3"));
Assertions.assertEquals(5, counts.occurrencesOf("4"));

The HashBag implementation in Eclipse Collections is backed by an ObjectIntHashMap, so the int counts are not boxed. There is a blog with more information about the Bag type in Eclipse Collections located here.

Note: I am a committer for Eclipse Collections

Donald Raab
  • 6,458
  • 2
  • 36
  • 44
-1

Java is a verbose language, I don't think there is a simpler way to achieve that, unless using 3rd-party library or waiting for Java 8's Lambda Expression.

Hui Zheng
  • 10,084
  • 2
  • 35
  • 40