2

From my understanding, a simple for loop will have a complexity of O(n).

foreach(var record in records) {
   // ...
}

If I introduce a Hash lookup within the foreach, will that keep the complexity to O(n)?

var ids = new HashSet<int>();

foreach(var record in records) {
   bool isRecordInIdSet = ids.Contains(record.id);
}

Likewise, if the HashSet was instead a list, would that make the complexity to O(n^2)?

var ids = new List<int>();

foreach(var record in records) {
   bool isRecordInIdSet = ids.Contains(record.id);
}

I was gleaning from this question What is the lookup time complexity of HashSet<T>(IEqualityComparer<T>)? , and would love to make sure my understanding is correct. Please add any additional info that you think would be useful to understanding Big-O terminology.

Community
  • 1
  • 1
contactmatt
  • 18,116
  • 40
  • 128
  • 186

2 Answers2

3

Your interpretation is correct, hash look-up is O(1), so repeating it n times give you an O(n) complexity.

Hash lookup is constant time (i.e. O(1)) for objects with properly implemented hash code. Since you are using an int, for which the hash code is the number itself, you get a very good implementation, ensuring constant-time access.

if the HashSet was instead a list, would that make the complexity to O(n2)?

Technically, it would be O(n*m), where m represents the number of items in the id list.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • thank you for your response. So when talking about big-O, does the variable n correspond to the number of times that the algorithm will run? – contactmatt Apr 27 '17 at 18:25
  • 1
    @contactmatt Not exactly - `n` is the number that grows linearly with the number of times a step in the algorithm is going to run. For example, `for (int i = 0 ; i < N ; i += 10)` is O(N), even though it runs 1/10-th of N times. – Sergey Kalinichenko Apr 27 '17 at 18:28
0

your analysis of list vs hashmap is correct.

var ids = new HashSet<int>();

foreach(var record in records) {
   bool isRecordInIdSet = ids.Contains(record.id); // this is O(1)
}

hashtables has a constant lookup time ( O(1) ). So that for loop has time complexity O(n), where n is the number of records

var ids = new List<int>();
foreach(var record in records) {
   bool isRecordInIdSet = ids.Contains(record.id); // this is O(m), where m is the size of list ids
}

when using a list, your .Contains method has O(n) time because in the worst case the item you are looking for is at the very end of the list, so you must traverse the whole list and make n-1 comparisons. In practice, you will not have to do this everytime, but remember that Big-O notation is concerned with the worst case scenario. If the size of ids list and records list were dependent on the same variable, then this for loop has O(n^2). Otherwise, it would be O(n * m) where n is the size of records and m is the size of ids.

victor
  • 802
  • 7
  • 12