0

Im using the following code to create a hashmap and then sort the values in the hashmap by using a treemap and a comparator. However, the output is rather unexpected. So any thoughts as to what Im doing wrong would be helpful

Code

public static void main(String[] args) {
    System.out.println("Most freq"+mostFreq(" i me hello hello hello me"));
}


public static String[] mostFreq(String str){

    if ((str==null)||( str.trim().equalsIgnoreCase("")))
        return null;

    String[] arr = new String[10];

    String[] words= str.split(" ");

    Map <String,Integer> map = new HashMap<String,Integer>();

    for (String word :words)
    { 
        int count =0;
        if (map.containsKey(word))
        {     
            count= map.get(word);
            map.put(word, count+1);
        }             
        else
            map.put(word, 1);
    }

    MyComparator comp= new MyComparator(map);
    Map<String,Integer> newMap= new TreeMap(comp);
    newMap.putAll(map);
    Iterator it= newMap.entrySet().iterator();
    while (it.hasNext())
    {
        Map.Entry pairs = (Map.Entry) it.next();
        System.out.println("Key  "+pairs.getKey()+"-- value"+pairs.getValue());
    }

    return arr;
}

Here's the comparator

package samplecodes;

import java.util.Comparator;
import java.util.Map;

public class MyComparator implements Comparator {

    Map map;

    public MyComparator(Map map){
        this.map=map;
    }

    @Override
    public int compare(Object o1, Object o2) {
        return ((Integer)map.get(o1) >(Integer)map.get(o2)? (Integer)map.get(o1):(Integer)map.get(o2));
    }

}

And the output is of the form

me-2
hello-3
i-3
Adrian Shum
  • 38,812
  • 10
  • 83
  • 131
seeker
  • 6,841
  • 24
  • 64
  • 100
  • Your code doesn't produce this output. Are you sure it is the one you are using? – Pshemo Oct 25 '13 at 02:10
  • 1
    Also maybe take a look at [how-to-sort-a-mapkey-value-on-the-values-in-java](http://stackoverflow.com/questions/109383/how-to-sort-a-mapkey-value-on-the-values-in-java) – Pshemo Oct 25 '13 at 02:12
  • 1
    lots of bad smell in your code: please proper add generics type param for your Map, Iterator etc. In `mostFreq()` you are returning `arr` which is simply an empty String array that has never been touched in the method. I am also replying in an answer about your logic problem – Adrian Shum Oct 25 '13 at 03:12
  • @AdrianShum: That was a method under construction . I was testing out some of the data structures before I finished the coding the problem and thats when I posted the question. Good catch though! – seeker Oct 25 '13 at 03:50
  • As I said in my other comment, you are using the tools /data structure in wrong way. The way you use TreeMap is simply not how it suppose to use. – Adrian Shum Oct 25 '13 at 06:06

3 Answers3

3

Please check the JavaDoc of compare: You do not return the bigger value, but -1 for o1 < o2, 0 for o1 = o2 and 1 for o1 > o2. So you could write:

@Override
public int compare(Object o1, Object o2) {
    return ((Integer) map.get(o1)).compareTo((Integer) map.get(o2);
}
Robin Krahl
  • 5,268
  • 19
  • 32
  • Ah. My bad. But I think doing it manually ie . `if(o1==o2) return 0; return (Integer)map.get(o1) >(Integer)map.get(o2)? 1:-1;` is a better option when you need frequency of all words. using `compareTo` seems to supress words of the same frequency. – seeker Oct 25 '13 at 02:21
  • You are right; I did not think of that. But in other cases, it is nice to delegate the comparison to other classes. ;) – Robin Krahl Oct 25 '13 at 02:23
  • @KodeSeeker I think your logic in comment is wrong. IIRC `compare()` should be "symmetric" http://docs.oracle.com/javase/6/docs/api/java/util/Comparator.html#compare%28T,%20T%29 : sgn(compare(a,b)) should return -sgn(compare(b,a)). However your logic doesn't: if value in map for key `o1` and `o2` is the same, both compare(a,b) and compare(b,a) is returning 1, which is violating the contract. – Adrian Shum Oct 25 '13 at 03:05
1

The Java Doc of TreeMap clearly states that:

A Red-Black tree based NavigableMap implementation. The map is sorted according to the natural ordering of its keys

we should not violate this rule by using TreeMap to sort by values.

However to sort by values, we can do the following:

  1. Create a LinkedList of entries of the map
  2. using Collection.sort to sort the entries
  3. Inserting the sorted entries to a LinkedHashMap: keeps the keys in the order they are inserted, which is currently sorted on natural ordering.
  4. Return the LinkedHashMap as the sorted map.

     public static <K extends Comparable,V extends Comparable> Map<K,V> sortByValues(Map<K,V> map){
        List<Map.Entry<K,V>> entries = new LinkedList<Map.Entry<K,V>>(map.entrySet());
    
        Collections.sort(entries, new Comparator<Map.Entry<K,V>>() {
    
            @Override
            public int compare(Entry<K, V> o1, Entry<K, V> o2) {
                return o1.getValue().compareTo(o2.getValue());
            }
        });
    
    
        Map<K,V> sortedMap = new LinkedHashMap<K,V>();
    
        for(Map.Entry<K,V> entry: entries){
            sortedMap.put(entry.getKey(), entry.getValue());
        }
    
        return sortedMap;
    }
    
    }
    

Reference: Sorting Map by value

Sage
  • 15,290
  • 3
  • 33
  • 38
0

What you are doing is really a misuse of tools.

I believe what you need to do is:

  1. Have a list/array of input words (still fine that you get it by splitting the input string)
  2. Create a Map to store the word as key, and frequency as value
  3. Have a collection of unique words, then sort the collection base on the the frequency
  4. When you are doing the output, traverse the sorted unique word list, for each element, get the frequency from the frequencyMap, and output the word + frequency.

Of course you can still make use of something like a TreeSet and use frequency as key, but you should have list of words as the value of this map (aka Multi-Map), instead of writing a problematic comparator which do not follow the contract of Comparator: http://docs.oracle.com/javase/6/docs/api/java/util/Comparator.html#compare%28T,%20T%29 Both your original implementation and the one in comment of one of the answers does not comply with the rule of sgn(compare(x, y)) == -sgn(compare(y, x)) for all x and y (The original one is even worse).

some code snippet just for giving you hints:

List<String> words = ....;
Map<String, Integer> wordFrequencyMap = new HashMap<String, Integer>();
// iterate words and update wordFrequencyMap accordingly
List<String> uniqueWords = new ArrayList<String>(new HashSet<String>(words));
Collections.sort(uniqueWords, new WordFrequencyComparator<String>(wordFrequencyMap));
for (String w : uniqueWords) {
  System.out.println("word : " + w + "  frequency : " + wordFrequencyMap.get(w));
}

The missing part shouldn't be anything difficult.

Adrian Shum
  • 38,812
  • 10
  • 83
  • 131