1

I am working on an interview question where I need to return the k most frequent elements given a non-empty array of integers..

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Here is my code:

  public static void main(String[] args) {
    System.out.println(topKFrequent(new int[] {1, 1, 1, 2, 2, 3, 3}, 1));
  }

  private static List<Integer> topKFrequent(int[] nums, int k) {
    // freq map
    Map<Integer, Integer> freq = new HashMap<Integer, Integer>();
    for (int n : nums) {
      freq.put(n, freq.getOrDefault(n, 0) + 1);
    }
    // bucket sort on freq
    List<Integer>[] bucket = new LinkedList[nums.length + 1];
    for (int i = 0; i < bucket.length; i++)
      bucket[i] = new LinkedList<>();
    for (int key : freq.keySet()) {
      bucket[freq.get(key)].add(key);
    }
    // gather result
    List<Integer> res = new ArrayList<>();
    for (int i = bucket.length - 1; i >= 0; i--) {
      res.addAll(bucket[i]);
      if (res.size() >= k)
        break;
    }
    return res;
  }

I wanted to check whether there is any improvement/optimization can be made in this code? I am confuse in the case when more than one numbers with the same frequency exists. What should we do in that case?

flash
  • 1,455
  • 11
  • 61
  • 132
  • I'm not sure if by optimisation you refer to performance or to the amount of code written, but if the later is the case the sort could be done on the map without the need of the linked list. Here is an example [Sort a Map by values](https://stackoverflow.com/questions/109383/sort-a-mapkey-value-by-values) –  Nov 03 '18 at 07:32
  • basically I am looking for both... Sorting will take 0(n*log n) but this code takes o(n) – flash Nov 03 '18 at 07:35
  • `case when more than one numbers with the same frequency`- In this situation, there could be more than 1 solution. Check to see if the interview question has mentioned anything for tie-breaker. – nice_dev Nov 03 '18 at 07:46
  • I also feel you could use a priority queue to do this. – nice_dev Nov 03 '18 at 07:47
  • See https://www.quora.com/What-is-the-best-algorithm-to-find-the-maximum-repeating-element-in-an-array – Saeed Hassanvand Nov 03 '18 at 07:59
  • _I am confuse in the case when more than one numbers with the same frequency exists._ <-- That is a question for the person who set the interview question. We cannot answer that. – Joe C Nov 03 '18 at 08:10

1 Answers1

0

This part

List<Integer>[] bucket = new LinkedList[nums.length + 1];
for (int i = 0; i < bucket.length; i++)
  bucket[i] = new LinkedList<>();
for (int key : freq.keySet()) {
  bucket[freq.get(key)].add(key);
}

I would write it with one loop

List<Integer>[] bucket = new LinkedList[nums.length + 1];
for (int key : freq.keySet()) {
  List<Integer> currentBucket = bucket[freq.get(key)];
  if (currentBucket == null) {
      currentBucket = new LinkedList<Integer>();
      bucket[freq.get(key)] = currentBucket;
  }
  currentBucket.add(key);
}

In this case the last part

List<Integer> res = new ArrayList<>();
for (int i = bucket.length - 1; i >= 0; i--) {
  res.addAll(bucket[i]);
  if (res.size() >= k)
    break;
}

needs to skip null values

List<Integer> res = new ArrayList<>();
for (int i = bucket.length - 1; i >= 0; i--) {
  if (bucket[i] == null) {
      continue;
  }
  res.addAll(bucket[i]);
  if (res.size() >= k)
    break;
}
  • I see.. Also do I need `LinkedList` here or I can use `ArrayList`? – flash Nov 03 '18 at 08:05
  • when the size of the list is not specified at the initialisation time on average the `LinkedList` has better performance when adding elements to it [When to use LinkedList over ArrayList in Java?](https://stackoverflow.com/questions/322715/when-to-use-linkedlist-over-arraylist-in-java). I would use `LinkedList` for this use case –  Nov 03 '18 at 08:15