-3

I know that the time complexity of adding a single element in a set is O[logn] . So , if I insert N elements one by one in a set that was initially empty , what should be the time complexity of this operation ? . Please explain in detail .

PS : I think the time complexity will be O[NlogN]. Am I correct ?

Given below is the code snippet :

vector<int> v(n);
for(int i = 0 ; i < n ; i++)
{
   v[i] = i;
}
set<int> s;
for(int i = 0 ; i < n ; i++)
{
   s.insert(v[i]);
}
  • 2
    What do you *think* the answer is, and why? – Scott Hunter Jun 02 '23 at 18:15
  • 3
    are you literally asking what is N times Log(N) ? – Cubbi Jun 02 '23 at 18:15
  • 1
    @Cubbi: No, they are not. N is not a constant in this. – Eric Postpischil Jun 02 '23 at 18:16
  • @Cubbi But 'N' is not constant . – Divyanshu Dwivedi Jun 02 '23 at 18:18
  • 1
    @ChrisB: The question is what more like what is sum(O(log i) for i = 1 to n), although that is an abuse of O notation. For some functions f, sum(O(f(i)) will be O(n f(i)). For others it will not. So asking what it is for log is a reasonable question. – Eric Postpischil Jun 02 '23 at 18:18
  • Voting to reopen. Whether this is a well-posed and well-researched question or not there is no need for more detail. – user4581301 Jun 02 '23 at 18:21
  • 3
    Here is a proof that `log(1)+log(2)+log(3) + ... = O(nlogn)` [here](https://stackoverflow.com/a/2095472/3357352) – OrenIshShalom Jun 02 '23 at 18:22
  • For the code provided, in this particular case, it's `O(n)`. – Eljay Jun 02 '23 at 18:28
  • 1
    If you want to know about the complexity of insert look at the documentation for [std::set::insert](https://en.cppreference.com/w/cpp/container/set/insert) the information is there. – Pepijn Kramer Jun 02 '23 at 18:29
  • @Eljay How ? It should be NlogN . – Divyanshu Dwivedi Jun 02 '23 at 18:32
  • 1
    Because `v[i]` is always `0`, which eliminates the binary tree traversal cost because the tree only has the one node in it, and `insert` will always return `false` for .second (no insertion happened) after the first insertion. – Eljay Jun 02 '23 at 18:36
  • @Eljay For the updated code it will be NlogN . Right ? – Divyanshu Dwivedi Jun 02 '23 at 18:39
  • 1
    Yes. With the assumption that the `set` doesn't have pathological behavior from a naïve implementation — such as generating into (effectively) a singly-linked list (a very lopsided tree) if the data happens to be pre-sorted (which in this case it is). The `std::set` ought to be very smart, but if the `set` was a naïve implementation, it could have `O(²)` behavior. – Eljay Jun 02 '23 at 18:44
  • Totally unrelated: Regarding `for(int i = 0 ; i < n ; i++) { v[i] = i; }` take a look at [`std::iota`](https://en.cppreference.com/w/cpp/algorithm/iota) – user4581301 Jun 02 '23 at 18:47
  • You can speculate about this or you can just run it for varying values of N where N is a power of ten. 10, 100, 1000, etc. and surely by 10e9 you'll know what kind of curve you have if you collect data carefully. – tadman Jun 02 '23 at 19:10

1 Answers1

0

Time Complexity

Edit: Assuming "ideal" conditions, no UB, no pre-sorted elements,...

Yes, you are partially correct. Adding a single element into an std::set is indeed `O(log(N)).

Why?

std::set is implemented as a balanced binary search tree (BBST), therefore if the element is smaller than the value you would go into the left subtree and vice-versa, hence inserting an element takes O(log(N)) because you are "splitting" the tree in half every iteration.

Let's break it down:

First Block:

vector<int> v(n);
for(int i = 0 ; i < n ; i++)
{
   v[i] = i;
}

Here you initialize an std::vector of size n and use a for-loop to initialize the elements. We know that your for-loop runs n-times and using the [] operator does not cost you anything (O(1)), hence the time complexity of the first block is O(n).

Side Note: use .at() instead of the [] operator since it gives you "auto" bounds checking.

Second Block:

set<int> s;
for(int i = 0 ; i < n ; i++)
{
   s.insert(v[i]);
}

Here you initialize an std::set of integers and you use a for-loop, which takes O(n) to insert the elements that are in the std::vector into your std::set. Using .insert() takes O(log(N)) time, since in order to find the appropriate spot to insert the element it performs Binary Search under the hood. Hence your second block takes O(Nlog(N)) time.

Now if you add that together (proof) eg:

T(n) = C*f(N) 
     =< N + Nlog(N) (for all n =< 1)
     =< Nlog(N) + Nlog(N) //raise all to the highest power
     =< 2Nlog(N)

since T(n) = C*f(n)
C = 2
f(n) = Nlog(N) //--> your big O

The overall time complexity is O(Nlog(N)). I hope this is enough detail for you.

  • 1
    The `s.insert(v[i])` does not take O(log n) time. It takes O(log i) time, because the size of the set at that time is i elements (with appropriate treatment for the initial case). The question is not what is the total time for n operations that each take O(log n) time. The question is what is the total time for an operation that takes log 1 time plus an operation that takes log 2 time plus an operation that takes log 3 time, and so on, up to a final operation that takes log n time (with appropriate treatment for O notation)… – Eric Postpischil Jun 02 '23 at 23:53
  • … A proper answer requires a proof that the sum of log i for 1 ≤ i ≤ n is O(n log n). And note this is not true for all functions. For example, the sum of 2^i for 1 ≤ i ≤ n is not O(n•2^n). So you cannot just claim that one would multiply n by log n. A proof is needed. – Eric Postpischil Jun 02 '23 at 23:55