I was asked a question in which we're looking for duplicate social security numbers, as in ssn that are in both array A and also array B. My immediate thought was to iterate over each array storing the count of each ssn in a HashMap.Then of course, all keys which have count of more than 1 are duplicates. My understanding was that this would be an O(n + m) operation, where m and n are the number of elements in each array. I was told that this is "very inefficient" in regards to time-complexity and that the time complexity was actually O(n * m). How is this correct?
Asked
Active
Viewed 61 times
0
-
2Yeah, that's definitely O(n+m). – Joe C Feb 05 '17 at 22:14
-
1"I was told..." Told by whom? Why does it matter? It sounds pretty strange. It's linear on average. It's quadratic in the worst case. – kraskevich Feb 05 '17 at 22:15
-
@JoeC that's what I was thinking. How do you correct an interviewer without sounding like a complete tool? – Elroy Jetson Feb 05 '17 at 22:15
-
You are right, it takes `O(n + m)` time. And I believe that is the most efficient it can get. – halileohalilei Feb 05 '17 at 22:15
-
@kraskevich can you please explain the quadratic worst case – Nic Feb 05 '17 at 22:17
-
@kraskevich by the person conducting the interview – Elroy Jetson Feb 05 '17 at 22:17
-
5The right tact to take is to ask the interviewer why he believes that he is right. Either he will realize his mistake and correct himself, or you will have an opportunity to clarify your answer. – Joe C Feb 05 '17 at 22:17
-
1The quadratic worst case would be if the hash code were absolutely useless in that it put all of the strings in the same hash bucket. – Joe C Feb 05 '17 at 22:19
-
Thanks, I didn't conisder [that](http://stackoverflow.com/questions/10102337/can-javas-hashcode-produce-same-value-for-different-strings) – Nic Feb 05 '17 at 22:27
-
So is O(n + m) faster than O(nlogm) ? He wanted me to sort and then use binary search on one of the arrays...? I'm a little confused now. Again, my understanding is that O(n + m) is faster. I'm doubting that I understand anything at this point. – Elroy Jetson Feb 05 '17 at 22:29
-
@ElroyJetson, imo at this point you should stop applying for the position, unless a miraculous explanation is provided by the interviewer. – user3707125 Feb 05 '17 at 22:34
-
@ElroyJetson Sort of array A with `n` values is *O(n* log *n)*, then binary search of all `m` values from array B would be *O(m* log *n)*, which means a total of *O((n+m)* log *n)*, and hence worse than your suggestion, which is *O(n+m)*. – Andreas Feb 05 '17 at 22:48
-
Hashmap operations are not free. You have to take into consideration the time complexity of hashmap also where you said you will store. Otherwise you will have to do sort and search – prajeesh kumar Feb 06 '17 at 04:56
1 Answers
1
First of all, O(n+m)
is faster than O(n*m)
. Time complexity of O(n*m)
is going to be the case when you have two loops like this :
for x in [1..n]
for y in [1..m]
So, the term "very inefficient" doesn't apply at all, unless there is a crunch of extra space that O(n+m)
incurs. In that case you can clarify what is the most important factor time complexity or space complexity.

Saurav Sahu
- 13,038
- 6
- 64
- 79