#include <string>
#include <unordered_map>
using namespace std;

int lengthOfLongestSubstring(string s) {
    // Maps each character of the current window to its index in s.
    unordered_map<char, int> ht;
    int max = 0;

    // Sliding window s[i..j-1]: j grows the window, i shrinks it past duplicates.
    int i = 0;
    int j = 0;
    while (i < (int)s.size() && j < (int)s.size()) {
        if (ht.find(s.at(j)) == ht.end()) {
            // s[j] is not in the window: extend it and record the best length.
            ht.insert( {s.at(j), j} );
            max = (j - i + 1 > max) ? j - i + 1 : max;
            j++;
        }
        else {
            // s[j] is a duplicate: drop characters up to and including its
            // previous occurrence, then retry s[j].
            int len = ht.at(s.at(j)) + 1;
            while (i < len) {
                ht.erase(s.at(i));
                i++;
            }
        }
    }

    return max;
}

Is the time complexity O(n) or O(n^2)? Why?

My intuition is that it's O(n), since i and j each traverse the string at most once.

csguy

1 Answer


It's:

  • Strictly speaking: O(2n), as in the worst case you iterate twice over the size of the input;
  • Speaking asymptotically: O(n), because constant factors are disregarded.
Giorgi Tsiklauri
  • Not entirely true; it only holds **on average**, since the hash map operations are `O(size)` in the worst case, and I am not confident that the map is bounded in size by a constant. – Quimby Jul 27 '20 at 21:15
  • Not really true. HashMap operations are Θ(1), true; however, O(n) in the `HashMap` operations is a so-called *exceptionally rare* event: it occurs when either *rehashing* is happening, or when *Separate Chaining* takes place for collision resolution. So the client of this code can easily consider this as O(n). – Giorgi Tsiklauri Jul 27 '20 at 21:33
  • "Strictly speaking: `O(2n)`, as in the worst case you iterate twice over the size of the input" - Saying that `O(2n)` is more precise than `O(n)` makes no sense, because you haven't specified any units. You loop over the input two times, but if you were counting basic operations or processor instructions it would be more than `2n`, and if you were measuring seconds it would be less than `2n`. Keeping constant factors only makes sense when you specify a unit, and in that case you shouldn't use big O notation. – eesiraed Jul 27 '20 at 22:59
  • @BessieTheCow I'm sorry, but I'm not sure you understand what Asymptotic Analysis is. – Giorgi Tsiklauri Jul 28 '20 at 01:59
  • @GiorgiTsiklauri My point is that `O(2n)` and `O(n)` are *exactly* equivalent. You said "Strictly speaking: `O(2n)`", which I interpret as meaning that `O(2n)` is somehow more precise than `O(n)`, and that's not true. Also, big theta notation does not mean average complexity. It just means that it's bounded both above and below. – eesiraed Jul 28 '20 at 03:30
  • "however, O(n) in the `HashMap` operations is a so-called *exceptionally rare* event" - It doesn't really matter here, since the keys are chars, but it's only rare if you assume that the input is random. If an adversary is feeding your program input designed to cause hash collisions, it becomes not so rare. There have been real vulnerabilities because people assumed hash tables are always O(1). – eesiraed Jul 28 '20 at 03:37
  • @GiorgiTsiklauri How do you know they are rare without knowing the input or the implementation? No, they are not Θ(1); they are Θ(1) on average, and that is a big difference. The complexity of `unordered_map` should at least be mentioned in the answer, because hiding the complexity in such structures and then counting the loops is misleading, especially to beginners. – Quimby Jul 28 '20 at 11:12
  • @Quimby "No, they are not Θ(1); they are Θ(1) on average, and that is a big difference." - whaaat? Please Google what Θ means in asymptotic analysis. Also, the core implementations of the Dictionary ADT (HashMap) are pretty much very similar to each other; moreover, if a HashMap uses rehashing or probing that results in frequent collisions, that implementation is horrible and is very unlikely to see the light of day in any API. I'm avoiding long discussions here; I've written more than enough. Use Google. Good luck. – Giorgi Tsiklauri Jul 28 '20 at 11:46
  • @GiorgiTsiklauri Yes, I know very well what it means, do you? I am saying that they are not constant in the worst case as the answer might imply, but only on average. Do you understand that? I do not care about the implementation, the standard explicitly allows worst-case O(n), hence my first comment. Your answer is simply incomplete. – Quimby Jul 28 '20 at 11:54