6

I have a Hashmap that links a zipcodes stored as keys and population stored as values in a hashmap.

The hashmap contains around 33k entries.

I'm trying to get the 5 highest population values from 5 zip codes and print out the 5 zip codes ASSOCIATED with the 5 highest population, but I'm having trouble understanding the algorithm of how to do it.

If it was just one, its easy but the 5 restriction is giving me some trouble.

I know to store the 5 values in an int array and I have a counter to determine when 5 of them are stored, but thats it.

Thanks

    int populatedCounter = 0;

    int[] populatedZip = new int[5];

    it = zipCodePop.entrySet().iterator();
    while (it.hasNext())
    {
        Map.Entry pairs = (Map.Entry)it.next();

        for (int i = 0; i < populatedZip.length; i++)
        {

        }
    }

}
Pangu
  • 3,721
  • 11
  • 53
  • 120
  • 3
    Are third-party libraries like Guava fair game? This could just be the one line `Ordering.natural().greatestOf(map.values(), 5)`. – Louis Wasserman Jan 30 '14 at 19:34
  • How about creating a max_heap of values; delete the max and then resize; get the next max and so on. – Prince Jan 30 '14 at 19:43
  • What are ZIP and population (what are the Map's parameters)? `Map`? – Radiodef Jan 30 '14 at 19:51
  • Is this homework? BTW see also [Finding the second highest number in array](http://stackoverflow.com/a/2615761/2891664) which can be generalized to find k highest numbers. – Radiodef Jan 30 '14 at 20:06
  • algorithm: Add first 5 values to an array, sort the array, iterate through the map until you find a value higher than the first value in the array (the lowest), replace the lowest, resort the array (in case the new value is higher than the others), continue. It'll be a little more effort to preserve the key/value pairs but it should be simple enough. – turbo Jan 30 '14 at 20:09
  • Do not sort collection initially:). Do: Find max value, then remove it from collection. Repeat action until you have desired number of found values (5 in your case). Profit. – Developer Marius Žilėnas Jan 30 '14 at 20:29

7 Answers7

9

Putting the entries of such a set into a list and sorting it is one option. But 33k elements is a number where the O(n*log(n)) complexity of sorting might already have a noticable performance impact.

One apporach would be to employ the PriorityQueue that nr4bt already mentioned (I wrote this snippet while he answered). It basically inserts all elements into a PriorityQueue that is sorted according to the values of the map entries.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.PriorityQueue;

public class GreatestOfMap
{
    public static void main(String[] args)
    {
        Map<String, Integer> map = new HashMap<String, Integer>();

        map.put("zip000", 1234);
        map.put("zip001", 2345);
        map.put("zip002", 3456);
        map.put("zip003", 4567);
        map.put("zip004", 5678);
        map.put("zip005", 6789);
        map.put("zip006", 123);
        map.put("zip007", 234);
        map.put("zip008", 456);
        map.put("zip009", 567);
        map.put("zip010", 7890);
        map.put("zip011", 678);
        map.put("zip012", 789);
        map.put("zip013", 890);

        int n = 5;
        List<Entry<String, Integer>> greatest = findGreatest(map, 5);
        System.out.println("Top "+n+" entries:");
        for (Entry<String, Integer> entry : greatest)
        {
            System.out.println(entry);
        }
    }

    private static <K, V extends Comparable<? super V>> List<Entry<K, V>> 
        findGreatest(Map<K, V> map, int n)
    {
        Comparator<? super Entry<K, V>> comparator = 
            new Comparator<Entry<K, V>>()
        {
            @Override
            public int compare(Entry<K, V> e0, Entry<K, V> e1)
            {
                V v0 = e0.getValue();
                V v1 = e1.getValue();
                return v0.compareTo(v1);
            }
        };
        PriorityQueue<Entry<K, V>> highest = 
            new PriorityQueue<Entry<K,V>>(n, comparator);
        for (Entry<K, V> entry : map.entrySet())
        {
            highest.offer(entry);
            while (highest.size() > n)
            {
                highest.poll();
            }
        }

        List<Entry<K, V>> result = new ArrayList<Map.Entry<K,V>>();
        while (highest.size() > 0)
        {
            result.add(highest.poll());
        }
        return result;
    }
}
Marco13
  • 53,703
  • 9
  • 80
  • 159
  • @TokugawaIeysu This solution does _not_ preform better than adding-then-sorting all the values. Take a look at `PriorityQueue`'s [documentation](http://docs.oracle.com/javase/7/docs/api/java/util/PriorityQueue.html), `offer()` and `add()` are `O(log(n))`, and here you're calling them `n` times ... resulting in exactly the same `O(n log(n))` complexity of my solution, but much harder to understand. – Óscar López Jan 30 '14 at 21:25
  • 1
    For the PriorityQueue, the `n` to which the documentation refers is never greater than the (constant) maximum number of elments that it contains (that is, 5 in this case). So the overall running time is O(n*log(5)) = O(n). In contrast to that, when *sorting* `n` elements, the running time is O(n*log(n)) for a possibly large `n` (33k, in this case). However, I did not make a benchmark to see for which `n` there really is a noticable difference, it's just the asymptotic complexity that is lower. – Marco13 Jan 30 '14 at 21:32
  • thanks Marco, I was able to get the population!...I think?...there's too many entries to confirm if it was correct, but I believe it is, so thanks again! – Pangu Jan 30 '14 at 21:38
5

Try this, using standard methods and assuming that the population count is stored as Integers in the HashMap:

List<Integer> list = new ArrayList<Integer>(zipCodePop.values());
Collections.sort(list, Collections.reverseOrder());
List<Integer> top5 = list.subList(0, 5);
Óscar López
  • 232,561
  • 37
  • 312
  • 386
  • 3
    @KevinWorkman Just curious what makes you think this is homework? Just a guess? – Radiodef Jan 30 '14 at 19:48
  • @Radiodef The "find X max in this data structure" is a pretty standard homework assignment, and that coupled with the fact that the OP doesn't seem to know how to even start approaching the problem makes me highly suspect this is homework. I could be wrong, but either way, I would argue that the answer to "I don't know how to do this" is to show the person how to work through it themselves, instead of giving them 20 different possible solutions they can copy-paste without understanding. – Kevin Workman Jan 30 '14 at 19:53
  • I understand how to get the "max" populated zip, but that just gets updated to get the max value every time – Pangu Jan 30 '14 at 19:59
2

public class CheckHighiestValue { public static void main(String... s) {

    HashMap<String, Integer> map = new HashMap<String, Integer>();

    map.put("first", 10000);
    map.put("second", 20000);
    map.put("third", 300);
    map.put("fourth", 800012);
    map.put("fifth", 5000);
    map.put("sixth", 30012);
    map.put("seventh", 1234);
    map.put("eighth", 45321);
    map.put("nineth", 5678);

    Set<Entry<String, Integer>> set = map.entrySet();

    List<Entry<String, Integer>> list = new ArrayList<Entry<String, Integer>>(
            set);

    Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {

        @Override
        public int compare(Entry<String, Integer> o1,
                Entry<String, Integer> o2) {

            return o2.getValue().compareTo(o1.getValue());
        }

    });
    System.out.println(list.subList(0, 5));
}

}

Dharma
  • 31
  • 2
  • We can get the last 5 five digit values in the list using comparator with list.subList(index,index); – Dharma Sep 13 '15 at 17:36
1

PriorityQueue would help too, and also a nice topic about how to get top k from a list, you can check this link

PriorityQueue<Integer> p = new PriorityQueue<Integer>(5);

int[] a = new int[]{3,5,10,1,23,42,66,1333,545,110};

for (int i : a){
    p.add(i);
    if (p.size() > 5){
        p.poll();
    }
}

//output will be highest 5, [42, 66, 110, 1333, 545]

You can have O(n log(k)) time complexity // k is your top value count.

Orhan Obut
  • 8,756
  • 5
  • 32
  • 42
  • thanks for the information, but I had to change my original description. I actually need to find the 5 highest values (population), but I need to print out the zip codes associated with those 5 highest values, which is why I stored them in a hashmap – Pangu Jan 30 '14 at 20:09
  • just get the top 5 by comparing with the population and get the zip code by using your hashmap afterwards – Orhan Obut Jan 30 '14 at 20:12
1

This is something i made and hopefully provides you something that you want to use.

public class TopsCollection { 

private static Map<String, Integer> collectors = new HashMap<>();

public TopsCollection() {
}

public void add(String playerName, int score) {
    collectors.put(playerName, score);
}

public void clearCollectors() {
    synchronized (collectors) {
        collectors.clear();
    }
}

public List<Map.Entry<String, Integer>> getTops() {
    return collectors.entrySet().stream().sorted(comparing(Map.Entry::getValue, reverseOrder())).limit(5).collect(toList());
}

public int getTopByName(String name) {
    for (int i = 0; i < getTops().size(); i++) {
        if (getTops().get(i).getKey().contains(name)) {
            return i;
        }
    }
    return 0;
}

getTopByName allows you to get the top place of the specified name.

Manolis
  • 11
  • 1
0

How would you do this without a computer, with just a piece of paper and a pencil? Pretend you had a stack of index cards that had numbers on them, and it was your job to find the 5 highest numbers. How would you do that? Write down steps that somebody else could follow to achieve the goal, and when you have those steps written out, you'll have an algorithm that you can start thinking about implementing with code.

You say that a single maximum is easy, so do it exactly like you would with a single maximum, but keep track of the five maximums instead. An array of maximums might be helpful here.

Kevin Workman
  • 41,537
  • 9
  • 68
  • 107
  • Emulating paper and pencil is rarely a good idea when implementing software. This is a trivial sort problem and following your approach, Tokugawa would have implemented a sorting algorithm. Why should he do that, when there are so many ways using the standard API to sort data structures? – jarnbjo Jan 30 '14 at 19:54
  • Because I would bet this is homework, designed specifically so the OP would understand working through an algorithm based on previous work (finding a single max). And what makes you think that "my approach" encourages a sorting algorithm? You *could* use a sorting algorithm for maximum efficiency, but the basic brute-force approach is probably what the instructor is looking for. The other copy-paste solutions rob the OP of the learning experience of working through a problem, which is the whole point of assignments like this. – Kevin Workman Jan 30 '14 at 19:56
  • And for more information on answering homework questions on StackOverflow, please see: http://meta.stackexchange.com/questions/10811/how-do-i-ask-and-answer-homework-questions/10812#10812 – Kevin Workman Jan 30 '14 at 20:08
0

Using Streams

int[] populatedZip = map.entrySet().parallelStream()
            .sorted(Map.Entry.<String, Integer>comparingByValue())
            .limit(5)
            .mapToInt(entry -> entry.getValue())
            .toArray();
Venkatesh
  • 577
  • 1
  • 7
  • 16