4

Still getting a grip on logarithms being the opposite of exponentials. (Would it also be correct to describe them as the inversion of exponentials?)

There are lots of great SO entries already on Big-O notation, including O(log n) and QuickSort's O(n log n) specifically. Found some useful graphs as well.

In looking at Divide and Conquer algorithms, I'm coming across n log n, which I think is n multiplied by the value of log n. I often try concrete examples like 100 log 100, to help visualize what's going on in an abstract equation.

Just read that log n assumes base 10. Does n log n translate into:

"the number n multiplied by the amount 10 needs to be raised to the power of in order to equal the number n"?

So 100 log 100 equals 200 because 10 needs to be raised to the power of two to equal 100?

Does the base change as an algorithm iterates through a set? Does the base even matter if we're talking in abstractions anyway?

Armen Michaeli
MikeiLL
  • possible duplicate of [Big O notation Log Base 2 or Log Base 10](http://stackoverflow.com/questions/20709267/big-o-notation-log-base-2-or-log-base-10) – harold Aug 29 '14 at 16:43
  • really? so for example in Python I would look up logarithm in the PyDocs? Ah I'm already seeing that there's a `math.log10` in python where `math.log10(100)` returns the expected `2`. But `math.log(100)` returns `4.605170185988092`. – MikeiLL Aug 29 '14 at 16:46
  • @harold - and that's already a possible duplicate. maybe I should delete this question. and I think it does answer my question - (although I'm still confused). : ) – MikeiLL Aug 29 '14 at 16:47

5 Answers

8

Yes, the base does change depending on the way it iterates, but it doesn't matter. As you might remember, changing the base of logarithms means multiplying them by a constant. Since you mentioned that you have read about Big-O notation, you probably already know that constants do not make a difference (O(n) is the same as O(2n) or O(1000n)).
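
A quick sketch of that in Python (using the standard math module; the specific values of n are arbitrary):

import math

# switching between log base 2 and log base 10 only multiplies by a constant
for n in [8, 100, 1_000_000]:
    print(math.log2(n) / math.log10(n))  # ~3.3219 every time, i.e. 1 / log10(2)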

EDIT: to clarify something you said - "I'm coming across n log n, which I think is n multiplied by the value of log n". Yes, you are right. And if you want to know why it involves log n, think of what algorithms like divide and conquer do - they split the input (into two halves or four quarters or ten tenths, depending on the algorithm) during each iteration. The question is "How many times can that input be split up before the algorithm ends?" So you look at the input and ask how many times you can divide it by 2, or by 4, or by 10, before the operation becomes meaningless (unless the purpose of the algorithm is to divide 0 as many times as possible). Now you can give yourself concrete examples, starting with easy stuff like "How many times can 8 be divided by 2?" or "How many times can 1000 be divided by 10?"
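
Counting those splits in a quick Python sketch (count_halvings is just a made-up name for illustration):

def count_halvings(n):
    """Count how many times n can be halved before it reaches 1."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps

print(count_halvings(8))     # 3, i.e. log2(8)
print(count_halvings(1000))  # 9, roughly log2(1000) ≈ 9.97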

RockOnRockOut
  • I remember that it basically comes down to the degree at which the load rises in relation to the amount of data being handled. "changing the base of logarithms means multiplying them by a constant" isn't exactly in my tool belt yet, but I kind of get it. – MikeiLL Aug 29 '14 at 16:58
  • [This](http://www.purplemath.com/modules/logrules5.htm) explains it the best. As you can see in the formula for base change, log base d of x equals log base b of x multiplied by a constant, in this case 1/(log base b of d) – RockOnRockOut Aug 29 '14 at 17:03
8

You don't need to worry about the base - if you're dealing with algorithmic complexity, it doesn't matter which base you're in, because the difference is just a constant factor.

Fundamentally, you just need to know that log n means that as n increases exponentially, the running time (or space used) increases linearly. For example, if n=10 takes 1 minute, then n=100 would take 2 minutes, and n=1000 would take 3 minutes - roughly. (It's usually in terms of upper bounds, with smaller factors ignored... but that's the general gist of it.)

n log n is just that log n multiplied by n - so the time or space taken increases "a bit faster than linearly", basically.
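
A rough illustration of that pattern in Python (the minute figures above are hypothetical; this just shows log n growing by a constant step each time n grows tenfold):

import math

for n in [10, 100, 1000, 10000]:
    print(n, math.log10(n))  # 1.0, 2.0, 3.0, 4.0 (up to float rounding) - linear growth in log n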

Jon Skeet
5

The base does not matter at all. In fact people tend to drop parts of the expression, e.g. if one has O(n^4 + 2*n), this is often reduced to O(n^4). Only the most relevant power needs to be considered when comparing algorithms.
For the case of comparing two closely related algorithms, say O(n^4 + 2*n) against O(n^4 + 3*n), one needs to include the linear dependency in order to preserve the relevant information.

Consider a divide and conquer approach based on bisection: your base is 2, so you may talk about ld(n). On the other hand you use the O-notation to compare different algorithms by means of the same base. That said, the difference between ld, ln, and log10 is just a constant factor.

fuesika
4

Logarithms and Exponents are inverse operations.

if
x^n = y

then
Logx(y) = n

For example,

10^3 = 1000
Log10 (1000) = 3
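
The same pair of facts, checked in Python just for concreteness:

import math

print(10 ** 3)           # 1000
print(math.log10(1000))  # 3.0 (up to float rounding) - the inverse of the line above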

Divide and conquer algorithms work by dividing the problem into parts that are then solved as independent problems. There can also be a combination step that combines the parts. Most divide and conquer algorithms are base 2, which means they cut the problem in half each time. For example, Binary Search works like searching a phone book for a name. You flip to the middle and ask: is the name I'm looking for in the first half or the last half (before or after what you flipped to)? Then repeat. Every time you do this you divide the problem's size by 2. Therefore, it's base 2, not base 10.
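
A minimal binary-search sketch in Python (assuming a sorted list of names standing in for the phone book):

def binary_search(names, target):
    """Return the index of target in the sorted list names, or -1 if absent."""
    lo, hi = 0, len(names) - 1
    while lo <= hi:
        mid = (lo + hi) // 2          # flip to the middle of the remaining pages
        if names[mid] == target:
            return mid
        elif names[mid] < target:
            lo = mid + 1              # keep only the second half
        else:
            hi = mid - 1              # keep only the first half
    return -1

names = ["Adams", "Baker", "Jones", "Smith", "Young"]
print(binary_search(names, "Jones"))  # 2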

Order notation is primarily only concerned with the "order" of the runtime because that is what is most important when trying to determine if a problem will be tractable (solvable in a reasonable amount of time).

Examples of different orders would be:

O(1)
O(n)
O(n * log n)
O(n^2 * log n)
O(n^3.5)
O(n!)

etc.

The O here stands for "big O notation", which basically provides an upper bound on the growth rate of the function. Because we only care about the growth of the function for large inputs, we typically ignore lower order terms. For example,

n^3 + 2 n^2 + 100 n 

would be

O(n^3) 

because n^3 is the largest-order term and will dominate the growth function for large values of n.
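
A quick numerical check in Python of how the lower-order terms fade as n grows:

for n in [10, 100, 1000, 10000]:
    full = n**3 + 2 * n**2 + 100 * n
    print(full / n**3)  # ≈ 2.2, 1.03, 1.002, 1.0002 - tends toward 1, so n^3 dominates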

When you see O(n * log n), people are just abbreviating... if you understand the algorithm, it is typically log base 2 because most algorithms cut the problem in half. However, it could be log base 3 if, for example, the algorithm cuts the problem into thirds.

Note:

In either case, if you were to graph the growth function, it would appear as a logarithmic curve. And, of course, log3 n is smaller than log2 n by a constant factor, even though in big-O terms they fall into the same class.

The reason you do not see O(log10 n) or O(log3 n) etc. is that it just isn't that common for an algorithm to work better this way. In our phone book example you could split the pages into 3 separate thirds and compare between 1-2 and 2-3. But then you have made 2 comparisons and only know which 1/3 the name is in, whereas 2 halving comparisons would tell you which 1/4 it is in - so splitting in half is more efficient per comparison.
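
A back-of-the-envelope comparison in Python (assuming, for the sake of the sketch, 1 comparison per halving step and 2 per three-way split):

import math

n = 1_000_000  # entries in the phone book

binary_comparisons = math.log2(n)         # 1 comparison per halving step
ternary_comparisons = 2 * math.log(n, 3)  # 2 comparisons per three-way step

print(round(binary_comparisons))   # ~20
print(round(ternary_comparisons))  # ~25, so splitting in half wins per comparison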

Alan
1

In most of the programming languages I know, the function log() is base e = 2.718281...

In mathematical books sometimes it means base "ten" and sometimes base "e".

As other answers pointed out, for big-O notation the base does not matter, because, for any base x, the complexity O(log_x(n)) is the same as O(ln(n)) (here log_x means "logarithm in base x" and ln() means "logarithm in base e").

Finally, it's common that, in the analysis of several algorithms, it's more convenient to consider that log() is, indeed, "logarithm in base 2" (I've seen some texts taking this approach). This is obviously related to the binary representation of numbers in computers.
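
In Python, for example (matching the math.log behaviour mentioned in the comments above):

import math

print(math.log(100))      # 4.605... - natural log, base e
print(math.log10(100))    # 2.0      - base 10
print(math.log2(100))     # 6.643... - base 2
print(math.log(100, 10))  # 2.0 - explicit base argument (subject to float rounding)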

pablo1977