-3

Given a file (which can be considered as a String with comma delimiter for the complexity of the question) of usernames and a value k, find top k usernames (with number of logins) who logged into the system the most.

For example - Input:

  • User (String) = user1, user4, user2, user1, user3, user1, user2, user3

  • k (int) = 2

Output:

  • user1 (3)
  • user2 (2)
  • user3 (2)

Both user2 and user3 should be included since both has same number of logins

Write a java method to find the output with best time and space complexity. You can assume that the content of the file can be stored in a String, no need to use Hadoop or big data methods.

This was asked in one of my phone interview, I gave a few answers using Linear search and they weren't impressed by it, they questioned my Space complexity.

My Solution

I lost the code after the interview, but I did something like this,

  • Split the String into a String array of usernames
  • Create a TreeMap and iterate through the arrays, if the key is existing, then get the value and increment one, if not create a new key with value 1. (The interviewer was not a big fan of this approach, can we do it better in linear approach?)
  • Use a comparator to sort the list by values i.e, largest value on top of the list (He's not convinced by that approach either, he want to combine this time with the previous step into something better)
  • Iterate through the map to get the top k elements in a List (Map.Entry) (He said define No for that because it will affect the space complexity)

Not sure how I can improve on my answer. Any suggestions? Please advise.

Sam
  • 375
  • 1
  • 2
  • 13

2 Answers2

1

Create a map for counting the users, after that you need to sort the map based on the values. See this answer on sorting maps by values. Sort a Map<Key, Value> by values (Java)

After that you can take out the top k entries. You need to take care the duplicate values.

Integer prevVal = null;
int i = 0;
for(Entry<String,Integer> e : sortedMap.entrySet()){
  Integer value = e.getValue();
  if( ! value.equals(prevVal)){
    i++;
  }
  prevVal =  value;
  if(i > k) break;
  list.add(e);
 }
Community
  • 1
  • 1
Satyendra Kumar
  • 357
  • 2
  • 13
  • It's not always top 3 isn't it? That's the challenging part. Here the k value is 2, but we are actually taking 3 outputs. In some cases, k value might be 3, and the actual output might be 20, 50 or even 100. That's little tricky. – Sam Jun 05 '16 at 16:43
  • You need to iterate the sorted map and get top k entries. – Satyendra Kumar Jun 05 '16 at 16:55
  • 1
    @Sam So you take out first `k` entries, plus any extra entries that has same value as last entry. Not that tricky. *(SatyendraKumar, you should change last paragraph to say something like that)* – Andreas Jun 05 '16 at 17:28
  • @Andreas - Yes that code makes a lot of sense. Yes I proposed something similar to that, but he questioned my time and space complexity, and asked if I can do any better. See my solution in the description. Is there a better solution for that question? – Sam Jun 05 '16 at 18:44
0

Comments to your solution:

Split the String into a String array of usernames

This requires loading the entire file into memory. You should use the values as you read them. Space complexity

Create a TreeMap and iterate through the arrays, if the key is existing, then get the value and increment one, if not create a new key with value 1. (The interviewer was not a big fan of this approach, can we do it better in linear approach?)

Use a HashMap, not a TreeMap. You don't care about the order of usernames.

Don't use Map<String, Integer> since that requires unboxing and reboxing when incrementing the counter. Use Map<String, int[]> or Map<String, Counter> where int[] has size 1 and Counter is a simple class with an int field.

Use a comparator to sort the list by values i.e, largest value on top of the list (He's not convinced by that approach either, he want to combine this time with the previous step into something better)

You should sort map entries, not values, otherwise you lose the usernames.

You could try doing it live while building the Map, but I'd think that would be less efficient.

Iterate through the map to get the top k elements in a List (Map.Entry) (He said define No for that because it will affect the space complexity)

You iterate the sorted array of map entries, stopping after k+x where x is the number of extra entries with the same value as array[k-1].

Since iterating doesn't consume more space, he's confused.

Andreas
  • 154,647
  • 11
  • 152
  • 247
  • Thank you so much, that looks more clear now. However, I have only question. I am not sure how you are asking me to use Map or Map, Can you please elaborate a little more on how I can apply that to my scenario? Thanks. – Sam Jun 05 '16 at 19:38
  • See [Most efficient way to increment a Map value in Java](http://stackoverflow.com/q/81346/5221149). The `MutableInt` is what I meant by `Counter`. – Andreas Jun 05 '16 at 22:17