
I have been studying Big O notation (as a beginner), and most of it looks like an alien language to me. I understand the basics of logarithms, e.g. log of 16 in base 2 is the power to which 2 must be raised to equal 16. But for binary search, O(log N) makes no sense to me: what exactly is the value of log N, and what is the base here? I have searched the internet, but everyone explains this mathematically, which I can't follow; I am not good with math. Can someone explain this to me in basic English, not alien language like "exponential"? I know how binary search works.

Second question: I don't even know what the symbol f = Ω(g) means. Can someone explain to me in plain English what is required here? I don't want the answer, just what this means. Question: In each of the following situations, indicate whether f = O(g), or f = Ω(g), or both (in which case f = Θ(g)).

    f(n)                  g(n)

(a) n − 100               n − 200

(b) 100n + log n          n + (log n)^2

(c) log 2n                log 3n

Mansur Ahamed
  • " Now for Binary search big O(logN) making no sense to me , what is the value of LogN exacly what is the base here?" : the base is irrelevant in Big O notation. There are loads of good answers on SO,..... – Mitch Wheat Sep 19 '15 at 04:09
  • In Big O notation, I think of the complexity O(log N) being a problem proportional in complexity to the number of digits in that problem. So a problem on the scale of 9999 would be twice as hard as a problem on the scale of 99. If you have to guess a number from 0 to 99 without hints, finding the number is O(N). Doing the same with numbers 0 to 9999 will obviously be 100 times harder. Now if you had "higher/lower" hints, this would make the problem O(log N). Finding a number in that scenario from 0 to 9999 is only twice as hard as finding the number from 0 to 99 – WDS Sep 19 '15 at 04:55
  • The reason bases are ignored for logs is that they're all constant multiples of each other (and constant multiples are ignored in big-O notation). Look up the "change of base" formula for logs. In general, `log_a(x) == log_b(x) / log_b(a) == log_10(x) / log_10(a) == ln(x) / ln(a)`. – Blckknght Sep 19 '15 at 05:00
  • 1
    "I am not good with math." Then *get* good with math. If you want to understand computer programming, you need a good grasp of basic mathematics. High school algebra and trigonometry at least. – Jim Mischel Sep 19 '15 at 12:02
  • http://cs.stackexchange.com is your place to ask this. Your question is not directly about programming. – peterh Sep 20 '15 at 02:22

3 Answers


Update: I just realized that I studied algorithms from MIT's videos. Here is the link to the first of those videos; keep going to the next lectures as far as you want.


Clearly, log(n) has no definite value without fixing what n is and which base the log uses. log(n) is mentioned so often to convey the rate of growth of a particular algorithm or piece of code, i.e. to help people see things in perspective. To build your perspective, see the comparison below:

1 < log n < n < n log n < n^2 < 2^n < n! < n^n

The line above says that beyond some value of n on the number line, the rates of growth of these functions are ordered as written. That way, decision makers can decide which approach to take in solving their problem (and students can pass their Algorithm Design and Analysis exam).
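If it helps, here is a minimal Python sketch (an illustration of mine, with base-2 logs chosen arbitrarily since the base doesn't matter) that tabulates a few of these functions so you can watch the right-hand columns explode:

```python
import math

# Tabulate part of the growth chain for a few n; columns are ordered as in
# the chain above, and the right-hand columns grow far faster than the left.
for n in (2, 4, 8, 16, 32):
    print(f"n={n:<3} log n={math.log2(n):<6.1f} n log n={n * math.log2(n):<8.1f} "
          f"n^2={n**2:<6} 2^n={2**n:<12} n!={math.factorial(n)}")
```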

Coming to your question: when books say "binary search's run time is log(n)", they essentially mean that if you have n elements, binary search's running time is proportional to log(n), and if you have 17n elements, you can expect the answer in a time proportional to log(17n). In this case the base of the log function is 2, because at every step binary search has at most 2 paths to pick from.

Since a log in one base can be converted to a log in any other base by multiplying by a constant, stating the base is irrelevant: in Big O notation, constants are ignored.
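To see that proportionality first hand, here is a small Python sketch of binary search with a comparison counter (a toy illustration, not anyone's library code); note how doubling n adds only about one step:

```python
import math

def binary_search(items, target):
    """Classic binary search over a sorted list, counting comparisons."""
    lo, hi, steps = 0, len(items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid, steps
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps

# The worst-case step count tracks log2(n): doubling n adds roughly one step.
for n in (16, 1024, 1_000_000):
    _, steps = binary_search(list(range(n)), -1)  # -1 is never found: worst case
    print(n, steps, round(math.log2(n), 1))
```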


Coming to the answer to your second question, images will explain it the best.

Big O is only about the upper bound on a function. In the image below, f(n) = O(g(n)). In other words, there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k.

  • The importance of k is that the bound only needs to hold from k onward, no matter how large n grows. If we cannot fix such a k, we cannot say the growth rate always stays below the function mentioned in O(...).
  • The importance of c is that only the function inside O(...) really matters; constant factors do not.

[figure: f(n) staying below c·g(n) for all n ≥ k]
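As a concrete, made-up instance of the definition: take f(n) = 3n + 10 and g(n) = n. Choosing c = 4 and k = 10 works, since 3n + 10 ≤ 4n exactly when n ≥ 10, and a quick loop can spot-check it:

```python
def holds(f, g, c, k, upto=10_000):
    """Spot-check the Big O definition: 0 <= f(n) <= c*g(n) for all n >= k."""
    return all(0 <= f(n) <= c * g(n) for n in range(k, upto))

f = lambda n: 3 * n + 10   # made-up example function
g = lambda n: n
print(holds(f, g, c=4, k=10))   # True: f(n) = O(g(n)) with these constants
```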


Omega is simply the inversion of Big O: if f(n) = O(g(n)), then g(n) = Ω(f(n)). In other words, Ω(...) is about your function staying above the function mentioned in Ω(...), with its own constants 'k' and 'c'.

The pictorial visualization is

[figure: f(n) staying above c·g(n) for all n ≥ k]


Finally, Big Theta is about finding a mathematical function that grows at the same rate as your given function. But how do you show that the rates match? By using two constants.

Since g(n) grows like your given function, you should be able to find two constants 'c1' and 'c2' such that c1 * g(n) stays below your function f(n) and c2 * g(n) stays above it, from some point onward.

The idea behind Big Theta is to provide a function with the same rate of growth. Note that there may be no single constant 'c' that makes c * g(n) overlap f(n) exactly; nobody is concerned with that. The only concern is to sandwich f(n) between c1 * g(n) and c2 * g(n), so that we can confidently say we have found the rate of growth of f(n).

[figure: f(n) sandwiched between c1·g(n) and c2·g(n) for n ≥ k]
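For a made-up example of such a sandwich: f(n) = 2n^2 + 3n and g(n) = n^2 work with c1 = 2, c2 = 3, and k = 3, since 2n^2 ≤ 2n^2 + 3n ≤ 3n^2 once n ≥ 3. A quick spot-check:

```python
def sandwiched(f, g, c1, c2, k, upto=10_000):
    """Spot-check Big Theta: c1*g(n) <= f(n) <= c2*g(n) for all n >= k."""
    return all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(k, upto))

f = lambda n: 2 * n**2 + 3 * n   # made-up example function
g = lambda n: n**2
print(sandwiched(f, g, c1=2, c2=3, k=3))   # True: f(n) = Θ(n^2)
```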


How do you apply these ideas to your question?

Let's take each of them one by one. You can use an online tool to plot these functions (or the small sketch after this list) and see first hand how they behave as you move along the number line.

  • f(n) = n - 100 and g(n) = n - 200

Here, the rate of growth can be found by differentiating both functions with respect to n: d(f(n))/dn = d(g(n))/dn = 1. Therefore, even though the running times of f(n) and g(n) may differ, their rates of growth are the same. Can you pick 'c1' and 'c2' such that c1 * g(n) < f(n) < c2 * g(n)?

  • f(n) = 100n + log(n) and g(n) = n + (log n)^2

Differentiate and tell whether you can relate the functions as Big O, Big Omega, or Big Theta.

  • f(n) = log (2n) and g(n) = log (3n)

Same as above.
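If plotting is inconvenient, here is a rough Python sketch of the same idea: print f(n)/g(n) for growing n and watch where the ratio heads. (This only builds intuition; the formal claims still need the constants c and k.)

```python
import math

pairs = {
    "(a)": (lambda n: n - 100,               lambda n: n - 200),
    "(b)": (lambda n: 100 * n + math.log(n), lambda n: n + math.log(n) ** 2),
    "(c)": (lambda n: math.log(2 * n),       lambda n: math.log(3 * n)),
}

# If f(n)/g(n) settles toward a positive constant, the pair is Θ; if it blows
# up, f dominates (Ω only); if it shrinks toward 0, g dominates (O only).
for name, (f, g) in pairs.items():
    ratios = [f(n) / g(n) for n in (10**3, 10**6, 10**9)]
    print(name, [round(r, 3) for r in ratios])
```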

(The images are taken from different pages on this website: http://xlinux.nist.gov/dads/HTML/)


My experience: try comparing the growth rates of many different functions. Eventually you will get the hang of all of them, and it will become very intuitive. Given concentrated effort for a week or two, this concept cannot remain esoteric for anyone.

displayName

First of all, let's go through the notation. I'm assuming from the question that O(f) is an upper bound, Ω(f) is a lower bound, and Θ(f) is both.

For O(log(N)), the base generally isn't given because the shape of log(N) is the same regardless of the base. E.g.,

[figure: graph of log(x); source: rapidtables.com]

So if you've worked through the binary search algorithm (I suggest you do this if you haven't), you should find that the worst case (the upper bound) is log_2(N). So given N terms, it will take on the order of log_2(N) comparisons in the worst case to find a term.
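Since every comparison in binary search halves the remaining range, the worst case is just "how many times can you halve N before one element is left". A tiny sketch (my illustration) makes the log_2(N) connection visible:

```python
import math

def halvings(n):
    """Count how many times n can be halved before reaching one element."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

for n in (8, 1024, 10**6):
    print(n, halvings(n), round(math.log2(n), 2))   # step count ~ log_2(N)
```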

For your second question,

You are simply comparing the growth rates of f and g.

f = O(g)

is when g is an (asymptotic) upper bound on f, i.e., f grows no faster than g. Alternately,

f = Ω(g)

is when g is an (asymptotic) lower bound on f, i.e., f grows at least as fast as g. Lastly,

f = Θ(g)

is when g is both an upper and a lower bound on f, i.e., the growth rates are the same up to constant factors.

You need to compare the two functions for each question and determine which grows faster. As Mitch mentioned, you can check here, where this question has already been answered.

Edit: accidentally linked e^x instead of log(x)

Dan
  • Thanks for your answer. That's exactly what I mentioned in the question: I don't understand phrases like "the general form of log(N) is known regardless of the base". I can only think of a log as having a base. Also, what is log_2(N)? Is 2 the base here, or is it a multiplication of 2 and N? – Mansur Ahamed Sep 19 '15 at 04:26
  • Sorry, bad habit from school. log_2(n) in this case means log base 2 (since this is binary search). I've only included it to be absolutely clear. Typically in any sort of algorithm run-time discussion, the bases are dropped completely as they are irrelevant. You'll never need to actually compute log_2(N), for example; you just need to know that O(log(N)) is faster than O(N) but slower than O(N log(N)). – Dan Sep 19 '15 at 04:30

The reason the base of the log is never specified is because it is actually completely irrelevant. You can convince yourself of this in three steps:

First, recall that log_2(x) = log_10(x)/log_10(2). But also recall that log_10(2) is a constant, which we'll call k2, so really, log_2(x) * k2 = log_10(x)

Second, recall that this is not unique to logs of base 2. The constants of conversion vary, but all the log functions are related to each other through multiplicative constants.

(You can prove this to yourself if you understand the mathematics behind log functions, or you can just work it up very quickly on a spreadsheet-- have a column of log_2(x) and a column of log_3(x) and divide them.)
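The same experiment in a few lines of Python, if a spreadsheet isn't handy; the ratio comes out as the same constant (ln 3 / ln 2 ≈ 1.585) for every x:

```python
import math

# The ratio log_2(x) / log_3(x) is the same constant for every x,
# which is exactly why Big O can ignore the base.
for x in (10, 100, 10_000, 10**8):
    print(x, math.log2(x) / math.log(x, 3))
```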

Finally, remember that in Big Oh notation, constants basically drop out as being irrelevant. Trying to draw a distinction between O(log_2(N)) and O(log_3(N)) is like trying to draw a distinction between O(N) and O(2N). It is a distinction that does not matter because log_2 and log_3 are related through a constant.

Honestly, the base of the log does not matter.

Novak
  • Thanks. How is log_2(x) = log_10(x)/log_10(2)? Can you explain further, please? – Mansur Ahamed Sep 19 '15 at 06:23
  • @MansurAhamed I think that was a mistake. Novak probably meant to say log_2(x) = log_10(x) * log_10(2). But the logic he is trying to illustrate is sound. – WDS Sep 19 '15 at 07:14
  • There is no mistake. http://mathforum.org/library/drmath/view/70648.html – Novak Sep 19 '15 at 15:32
  • Haha! Sorry, there was indeed no mistake until I made it. I'll leave the comment for continuity, but I confess I was incorrect. – WDS Sep 20 '15 at 04:37