0

I have a hashmap storing the number of occurrences of character in a text. I am trying to print out the top 3 occurrences, but it is printing incorrectly.

int max = 1000000000;
for (int i = 1; i <= 3; i++) {
    for (Character key : list.keySet()) {
        if (list.get(key) < max) {
            max = list.get(key);
            System.out.println(i + ": " + key + " " + list.get(key));
            break;
        }
    }
}
arshajii
  • 127,459
  • 24
  • 238
  • 287
f3d0r
  • 541
  • 3
  • 12
  • 27
  • OK, I changed it to a LinkedHashMap, but that still doesn't fix the problem. – f3d0r Jan 27 '15 at 15:13
  • Are you trying to look at all the values, find the top three and print out their values? – Ascalonian Jan 27 '15 at 15:13
  • can you give an example of how it's printing currently? – Llanilek Jan 27 '15 at 15:13
  • Yes, that is what I am trying to do. – f3d0r Jan 27 '15 at 15:13
  • All the values of the Hashmap printed out: t 12098 h 8692 e 17900 p 2130 r 7976 o 11459 j 45 c 2748 g 2485 u 4960 n 8377 b 1891 s 8299 x 247 f 2779 a 9863 k 1285 i 8816 l 5908 d 5014 m 4259 y 3245 w 3049 v 623 z 56 q 207 – f3d0r Jan 27 '15 at 15:14
  • Use a `TreeMap` to sort the `HashMap` and just get the top three results. Check out this post for that: http://stackoverflow.com/questions/109383/how-to-sort-a-mapkey-value-on-the-values-in-java and/or this http://stackoverflow.com/questions/8119366/sorting-hashmap-by-values – Ascalonian Jan 27 '15 at 15:14
  • Top 3 printing out currently (incorrect): 1: t 12098 2: h 8692 3: p 2130 – f3d0r Jan 27 '15 at 15:14

4 Answers4

8

With Java 8 you can use the code below(*):

List<Entry<Character, Integer>> top3 = map.entrySet().stream()
                                    .sorted(comparing(Entry::getValue, reverseOrder()))
                                    .limit(3)
                                    .collect(toList());

(*) with the following imports:
import static java.util.Comparator.comparing;
import static java.util.Comparator.reverseOrder;
import static java.util.stream.Collectors.toList;

assylias
  • 321,522
  • 82
  • 660
  • 783
2

You could modify your program to this form:

for (int i = 1; i <= 3; i++) {
    int max = -1;
    Character maxKey = 'a';
    for (Character key : list.keySet()) {
        if (list.get(key) > max) {
            max = list.get(key);
            maxKey = key;
        }
    }
    System.out.println(i + ": " + maxKey + " " + max );
    list.remove(maxKey);
}
baju
  • 511
  • 5
  • 19
  • @JeanLogeart For that matter, your answer doesn't exatly shine either. A really good solution would be based on the `PriorityQueue` class, which is a natural fit for this problem (it uses the heap data structure). – Marko Topolnik Jan 27 '15 at 15:30
  • Sorting based on comparing keys in O(n log n) - this solution is O(n). I know that there is only 24 keys(number of characters) so sorting and taking top 3 will be faster in comparison to iteration 3 times. On the other hand you could eliminate 3 iteration to the single but it will be much more complicated. I do not claim that it is ideal but regarding intuition: I think it is simple. – baju Jan 27 '15 at 15:35
0

You need to sort your entries by number of occurences and get the top 3:

List<Entry<Character, Integer>> entryList = new ArrayList<>(list.entrySet());
Collections.sort(entryList, new Comparator<Entry<Character, Integer>>(){
    @Override
    public int compare(Entry<Character, Integer> e1, Entry<Character, Integer> e2) {
        return e2.getValue() - e1.getValue(); // descending order
    }
});

// now let's get the top 3
List<Character> top3 = new ArrayList<>(3);
for(Entry<Character, Integer> e : entryList) {
    top3.add(e.getValue());
    if(top3.size() == 3) {
        break;
    }
}
Jean Logeart
  • 52,687
  • 11
  • 83
  • 118
0

Here's a solution using Java 8 streams, based on the one provided by @assylias. It performs the complete task of collecting character counts from a String into a Map and selecting the top 3 entries.

import java.util.ArrayList;
import static java.util.Comparator.*;
import java.util.List;
import java.util.Map.Entry;
import static java.util.stream.Collectors.*;

public class Stream {

    public static void main(final String[] args) {
        final String text = "hello stackoverflow, let's count these character occurrences!";
        final char[] charArray = text.toCharArray();
        final List<Character> characters = new ArrayList<Character>(text.length());
        for (final char c : charArray) {
            characters.add(c);
        }

        final List<Entry<Character, Long>> top3 = characters.stream()
                .collect(groupingBy(Character::charValue, counting()))
                .entrySet().stream()
                .sorted(comparing(Entry::getValue, reverseOrder())).limit(3).collect(toList());

        System.out.println(top3);
    }
}

Output:

[e=8, c=7,  =6]
Adriaan Koster
  • 15,870
  • 5
  • 45
  • 60
  • It is detail but: If you know number of elements that you want to put to the `ArrayList` why you do not initialize `ArrayList` with proper size and get rid of resizing? – baju Jan 27 '15 at 21:46
  • I don't know the size beforehand. Strings can contain any of the >1 million UTF-8 characters but typically will contain much less. – Adriaan Koster Jan 28 '15 at 09:24
  • Hm, but in your example you are iterating over array of characters, right? So you know the size of array. As a result you could initialize `characters ` list with proper size. Am I wrong? – baju Jan 28 '15 at 09:30
  • Oh sorry, you meant the first ArrayList. Yes, that one can be sized to the length of the String but I would consider that a premature optimization which clutters the code. Probably a matter of opinion, I do recall that Sonar is indicating this as an issue lately. – Adriaan Koster Jan 28 '15 at 09:38
  • So I added the initialCapacity to please @baju and Sonar – Adriaan Koster Jan 28 '15 at 09:40