What you're seeing here is the difference between two different ways of quantifying how much work an algorithm does. The first approach is to count up exactly how many times something happens, yielding a precise expression for the answer. For example, in this case you're noting that the number of rounds of binary search is ⌈lg (n+1)⌉. This precisely counts the number of rounds done in binary search in the worst case, and if you need to know that exact figure, this is a good piece of information to have.
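If you want to sanity-check that formula, here's a small sketch (Python; `binary_search_rounds` is a hypothetical helper that simulates the worst case by always continuing into the right half, which is never smaller than the left):

```python
import math

def binary_search_rounds(n):
    """Simulate the worst case of binary search on an array of size n:
    every probe misses and we always continue into the right half."""
    lo, hi = 0, n - 1
    rounds = 0
    while lo <= hi:
        rounds += 1
        mid = (lo + hi) // 2
        lo = mid + 1  # the right half is never smaller than the left half
    return rounds

# Matches ceil(lg(n+1)) for every size we try:
for n in [1, 2, 7, 1000, 10**6]:
    assert binary_search_rounds(n) == math.ceil(math.log2(n + 1))
```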
However, in many cases we aren't interested in the exact amount of work that an algorithm does and are instead more interested in how that algorithm is going to scale as the inputs get larger. For example, let's suppose that we run binary search on an array of size 10³ and find that it takes 10μs to run. How long should we expect it to take to run on an input of size 10⁶? We can work out the number of rounds that binary search will perform here in the worst case by plugging n = 10³ into our formula (10 rounds), then work out how long we think each of those rounds takes (10μs / 10 rounds = 1μs / round), then work out how many rounds we'll have with n = 10⁶ (20) and multiply the number of rounds by the cost per round to get an estimate of about 20μs.
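That extrapolation can be mechanized; here's a minimal sketch (Python, with the 10μs figure treated as a made-up measurement):

```python
import math

def worst_case_rounds(n):
    """ceil(lg(n+1)): the worst-case round count from the formula above."""
    return math.ceil(math.log2(n + 1))

measured_n, measured_us = 10**3, 10.0        # hypothetical measurement
per_round_us = measured_us / worst_case_rounds(measured_n)
estimate_us = per_round_us * worst_case_rounds(10**6)
print(f"{estimate_us:.1f}us")                # → 20.0us
```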
That's kind of a lot of work. Is there an easier way to do this?
That's where big-O notation comes in. When we say that the cost of binary search in the worst case is O(log n), we are not saying "the exact amount of work that binary search does is given by the expression log n." Instead, what we're saying is "the runtime of binary search, in the worst case, scales the same way that the function log n scales." Through properties of logarithms, we can see that log n² = 2 log n, so if you square the size of the input to a function that grows logarithmically, it's reasonable to guess that the output of that function will roughly double.
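To see that property concretely, here's a tiny sketch (Python, using the worst-case round count ⌈lg (n+1)⌉ from above):

```python
import math

def rounds(n):
    """Worst-case binary search rounds: ceil(lg(n+1))."""
    return math.ceil(math.log2(n + 1))

# Since lg(n**2) = 2*lg(n), squaring the input size should roughly
# double the round count:
print(rounds(10**3), rounds(10**6))  # → 10 20
```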
In our case, if we know that the runtime of binary search on an input of size 10³ is 10μs and we're curious to learn what the runtime will be on an input of size 10⁶ = (10³)², we can make a pretty reasonable guess that it'll be about 20μs because we've squared the input. And if you look at the math I did above, that's really close to the figure we got by doing a more detailed analysis.
So in that sense, saying "the runtime of binary search is O(log n)" is a nice shorthand for saying "whatever the exact formula for the runtime of the function is, it'll scale logarithmically with the size of the input." That allows us to extrapolate from our existing data points without having to work out the leading coefficients or anything like that.
This is especially useful when you factor in that even though binary search does exactly ⌈lg (n+1)⌉ rounds, that expression doesn't capture the full cost of performing the binary search. Each of those rounds might take a slightly different amount of time depending on how the comparison goes, there's probably some setup work that contributes an additive constant, and the cost of each round in terms of real compute time can't be determined purely from the code itself. That's why we often use big-O notation when quantifying how fast algorithms run - it lets us capture what often ends up being the most salient detail (how things will scale) without having to pin down all the mathematical particulars.