3

What would be the fastest way to get the common values from all the sets within an hash map?

I have a

Map<String, Set<String>>

I check for the key and get all the sets that has the given key. But instead of getting all the sets from the hashmap, is there any better way to get the common elements (value) from all the sets?

For example, the hashmap contains,

abc:[ax1,au2,au3]
def:[ax1,aj5]
ijk:[ax1,au2]

I want to extract the ax1 and au2 alone, as they are the most common values from the set.

Paul Vargas
  • 41,222
  • 15
  • 102
  • 148
Balaram26
  • 1,349
  • 3
  • 15
  • 32
  • 4
    Use a `Map` where you will store the `String`s on each `Set` as key and use the value as counter, then traverse all the elements in this new `Map` to get the key with max counter value. – Luiggi Mendoza Jul 11 '13 at 02:23
  • 1
    By "common" do you mean "used more than once"? If not, the 4 most common values in your example are all of them. – Bohemian Jul 11 '13 at 02:40
  • Thanks for the reply. I used the map to store the key and loop through each set and get the occurrence. – Balaram26 Jul 11 '13 at 03:06
  • By "Common" i mean the values that are repeated more than certain number of times. – Balaram26 Jul 11 '13 at 03:26
  • There's no library that does it. You seem to be asking us to write your program for you. Try something, if you get stuck, post what you've tried and where you're having a problem – Bohemian Jul 11 '13 at 06:13

1 Answers1

3

note: not sure if this is the fastest, but this is one way to do this.

First, write a simple method to extract the frequencies for the Strings occurring across all value sets in the map. Here is a simple implementation:

Map<String, Integer> getFrequencies(Map<String, Set<String>> map) {
    Map<String, Integer> frequencies = new HashMap<String, Integer>();
    for(String key : map.keySet()) {
        for(String element : map.get(key)) {
            int count;
            if(frequencies.containsKey(element)) {
                count = frequencies.get(element);
            } else {
                count = 1;
            }
            frequencies.put(element, count + 1);
        }
    }
    return new frequencies;
}

You can simply call this method like this: Map<String, Integer> frequencies = getFrequencies(map)

Second, in order to get the most "common" elements in the frequencies map, you simply sort the entries in the map by using the Comparator interface. It so happens that SO has an excellent community wiki that discusses just that: Sort a Map<Key, Value> by values (Java). The wiki contains multiple interesting solutions to the problem. It might help to go over them.

You can simply implement a class, call it FrequencyMap, as shown below.

Have the class implement the Comparator<String> interface and thus the int compare(String a, String b) method to have the elements of the map sorted in the increasing order of the value Integers.

Third, implement another method, call it getCommon(int threshold) and pass it a threshold value. Any entry in the map that has a frequency value greater than threshold, can be considered "common", and will be returned as a simple List.

class FrequencyMap implements Comparator<String> {

    Map<String, Integer> map;
    public FrequencyMap(Map<String, Integer> map) {
        this.map = map;
    }

    public int compare(String a, String b) {
        if (map.get(a) >= map.get(b)) {
            return -1;
        } else {
            return 1;
        } // returning 0 would merge keys
    }

    public ArrayList<String> getCommon(int threshold) {
        ArrayList<String> common = new ArrayList<String>();
        for(String key : this.map.keySet()) {
            if(this.map.get(key) >= threshold) {
                common.add(key);
            }
        }
        return common;
    }

    @Override public String toString() {
        return this.map.toString();
    }
}

So using FrequencyMap class and the getCommon method, it boils down to these few lines of code:

    FrequencyMap frequencyMap = new FrequencyMap(frequencies);
    System.out.println(frequencyMap.getCommon(2));
    System.out.println(frequencyMap.getCommon(3));
    System.out.println(frequencyMap.getCommon(4));

For the sample input in your question this is the o/p that you get:

// common values
[ax1, au6, au3, au2]
[ax1, au2]
[ax1]

Also, here is a gist containing the code i whipped up for this question: https://gist.github.com/VijayKrishna/5973268

Community
  • 1
  • 1
vijay
  • 2,646
  • 2
  • 23
  • 37
  • Thanks for the reply.But i have done a similar approach. What i did was, Have a Map qureyMap to store the value and its frequencies. Get the set that contain same key from the hashmap. Loop through each set,if the vale is new,then add it as new to queryMap,if its aldready present,then increase the counter. This method works fine.but is it best way /fastest way to do?? because i have 100000 maps and i perform this operation on each map once they are generated. – Balaram26 Jul 11 '13 at 08:15
  • honestly do not know a different way this. i believe that hashmaps are pretty efficient. You say that you have to run this for 100000 maps, but how many entries do those maps contain on an average and how many values are there in each entry? if you can give me rough numbers, i can maybe run a few performance numbers for you. Also, do update your question based on these requirements. It will help you get better answers. And it would be nice to get an upvote for the effort :) – vijay Jul 11 '13 at 08:22