5

Today, I came across a rather strange problem. I needed to calculate the string length of a number, so I came up with this solution:

// say the number is 1000
(int)(log(1000)/log(10)) + 1

This is based on the mathematical formula

log_10(x) = log_n(x) / log_n(10) (explained here)

But I found out that in C,

(int)(log(1000)/log(10)) + 1

is NOT equal to

(int) log10(1000) + 1

but it should be.

I even tried the same thing in Java with this code

(int) (Math.log(1000) / Math.log(10)) + 1
(int) Math.log10(1000) + 1

but it behaves the same wrong way.

The story continues. After executing this code

for (int i = 10; i < 10000000; i *= 10) {
   System.out.println(((int) (Math.log10(i)) + 1) + 
                " " + ((int) (Math.log(i) / Math.log(10)) + 1));
}

I get

2 2
3 3
4 3  // here second method produces wrong result for 1000
5 5
6 6
7 6  // here again

So the bug seems to occur on every multiple of 1000.

I showed this to my C teacher, and he said that it might be caused by some type conversion error during log division, but he didn't know why.

So my questions are

  • Why isn't (int) (Math.log(1000) / Math.log(10)) + 1 equal to (int) Math.log10(1000) + 1, even though it should be according to the math?
  • Why is it wrong only for multiples of 1000?

edit: It is not a rounding error, because

Math.floor(Math.log10(i)) + 1
Math.floor(Math.log(i) / Math.log(10)) + 1

produce the same wrong output

2 2
3 3
4 3
5 5
6 6
7 6

edit2: I have to round down, because I want to know the number of digits.

log10(999) + 1 = 3.9995654882259823
log10(1000) + 1 =  4.0

If I just round, I get the same result (4) for both, which is wrong for 999, because it has 3 digits.

Jakub Arnold
  • 2
    The unrounded output from these is: log10(1000) + 1 = 4.0, log(1000)/log(10) + 1 = 3.9999999999999996. What was the justification for using `floor` over using `round`? – pjp Sep 30 '09 at 11:10
  • 1
    Try log10(999) + 1: you get 3.9995654882259823, which must be floored, because I want the number of digits. If I round, I get 4, the same as for log10(1000) + 1, even though 1000 is one digit longer – Jakub Arnold Sep 30 '09 at 11:12
  • 2
    Better use Integer.toString(n).length() to get the length. – starblue Sep 30 '09 at 11:22
  • @starblue: Yep, sure, but my teacher wanted it implemented in a mathematical way – Jakub Arnold Sep 30 '09 at 11:24
  • You should use a library that supports decimal numbers. – Georg Schölly Sep 30 '09 at 11:32
  • 5
    You have a programming teacher who doesn't understand finite precision. Bad university. – erikkallen Sep 30 '09 at 11:54
  • Take a look at this question and the associated source for computing log using BigDecimals http://stackoverflow.com/questions/739532/logarithm-of-a-bigdecimal – pjp Sep 30 '09 at 13:39

8 Answers

23

You provided the code snippet

for (int i = 10; i < 10000000; i *= 10) {
   System.out.println(((int) (Math.log10(i)) + 1) + 
                " " + ((int) (Math.log(i) / Math.log(10)) + 1));
}

to illustrate your question. Just remove the casts to int and run the loop again. You will receive

2.0 2.0
3.0 3.0
4.0 3.9999999999999996
5.0 5.0
6.0 6.0
7.0 6.999999999999999

which immediately answers your question. As tliff already argued, the casts cut off the decimals instead of rounding properly.

EDIT: You updated your question to use floor(), but, like casting, floor() rounds down and therefore drops the decimals!
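As a rough illustration of the point above, here is a small Java sketch. The class and method names are made up for this example, and the literal 2.9999999999999996 merely stands in for the computed quotient Math.log(1000) / Math.log(10). It also shows why plain rounding does not help for a digit count, as the asker points out for 999.

```java
// Hypothetical demo class; the literal stands in for Math.log(1000) / Math.log(10).
class TruncationDemo {
    // Digit count via truncation, as in the question.
    static int digitsByCast(double log10Value) {
        return (int) log10Value + 1;
    }

    // Digit count via rounding to the nearest integer.
    static long digitsByRound(double log10Value) {
        return Math.round(log10Value) + 1;
    }

    public static void main(String[] args) {
        double almostThree = 2.9999999999999996; // the double just below 3.0
        System.out.println(digitsByCast(almostThree));      // 3: the cast drops the fraction
        System.out.println(Math.floor(almostThree) + 1);    // 3.0: floor() does the same
        System.out.println(digitsByRound(almostThree));     // 4: rounding hides the error here
        System.out.println(digitsByRound(Math.log10(999))); // 4: but that is wrong for 999
    }
}
```

So neither truncation nor rounding is safe on its own when the input to them is already slightly off.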

janko
  • If I don't cast and evaluate Math.log10(9) + 1, I get something like 1.954, which has to become 1, because 2 is the result for Math.log10(10) + 1 – Jakub Arnold Sep 30 '09 at 11:10
  • gs: Err, no, 4 has an exact floating point representation. The error arises from inexact values prior to the division. – caf Sep 30 '09 at 11:25
  • Why was this marked as the correct answer? Did you find out how to differentiate log10(999) from log10(1000) in your problem? – Aurelien Ribon Sep 30 '13 at 08:11
8

The log operation is a transcendental function. The best a computer can do is evaluate an algebraic function that approximates it. The accuracy of the result depends on the algorithm the computer uses (this could be the microcode in the FPU). On the Intel FPU, there are settings that affect the precision of the various transcendental functions (trig functions are also transcendental), and the FPU specifications detail the level of accuracy of the various algorithms used.

So, in addition to the rounding errors mentioned above, there is also an accuracy issue: the computed log(x) may not be equal to the actual log(x).
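One way to see how large that inaccuracy is in practice is to measure the result in ulps (units in the last place). The sketch below is only an illustration with made-up names (UlpDemo, errorInUlps). It relies on the Java documentation, which allows Math.log to be off by up to 1 ulp and states that Math.log10 returns exactly n when the argument equals 10^n.

```java
// Hypothetical demo class measuring how far a computed value is from the expected one.
class UlpDemo {
    // Signed distance from `expected`, in units of the spacing of doubles at `expected`.
    static double errorInUlps(double computed, double expected) {
        return (computed - expected) / Math.ulp(expected);
    }

    public static void main(String[] args) {
        double quotient = Math.log(1000) / Math.log(10); // two approximations, then a division
        double direct = Math.log10(1000);                // specified to be exact for powers of ten
        System.out.println("log(1000)/log(10) = " + quotient);
        System.out.println("error in ulps     = " + errorInUlps(quotient, 3.0));
        System.out.println("log10(1000)       = " + direct);
        System.out.println("error in ulps     = " + errorInUlps(direct, 3.0));
    }
}
```

An error of even one ulp below 3 is enough for a cast or floor() to produce 2.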

Skizz
5

This is due to precision and rounding issues. Math.log(1000) / Math.log(10) is not precisely equal to 3.

If you need exact precision, don't use floating point arithmetic - and give up on logarithms in general. Floating point numbers are inherently fuzzy. For a precise result, use integer arithmetic.

I really suggest you don't go down this path in general, but it sounds like you're taking the logarithm of whole numbers to determine some order of magnitude. If that's the case, then (int)(Math.log(x+0.5) / Math.log(10)) will be more stable - but realize that doubles have only 53 bits of precision, so around 10^15 doubles can no longer represent integers exactly, and this trick won't work then.
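For the integer-arithmetic route, a minimal sketch could look like the following. The class and method names are hypothetical, and it assumes the input is a non-negative long; it simply divides by 10 until a single digit is left, so no floating point is involved.

```java
// Hypothetical helper class; counts decimal digits without floating point.
class DigitCount {
    // Number of decimal digits of a non-negative long (0 counts as one digit).
    static int countDigits(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        int digits = 1;
        while (n >= 10) { // strip one decimal digit per iteration
            n /= 10;
            digits++;
        }
        return digits;
    }

    public static void main(String[] args) {
        // Check both sides of each boundary: 9 and 10, 99 and 100, ...
        for (long i = 9; i < 10_000_000L; i = (i + 1) * 10 - 1) {
            System.out.println(i + " -> " + countDigits(i)
                    + ", " + (i + 1) + " -> " + countDigits(i + 1));
        }
    }
}
```

The loop runs at most 18 times for a long, so it is unlikely to be slower in practice than two logarithm calls and a division.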

Eamon Nerbonne
4

Add a very small value to the numerator to bypass the accuracy issue pointed out by Skizz.

// say the number is 1000
(int)((log(1000)+1E-14)/log(10)) + 1

1E-14 should be enough to nudge the accuracy back on track.

Changed the small value from 1E-15, which would give incorrect results for some inputs.

I tested with 1E-14 for a random sample of unsigned long longs and all my numbers passed.

pmg
  • 1
    1E-15 is too small, doesn't work for "1000"... better use 0.5 as suggested above. – user85421 Sep 30 '09 at 12:35
  • You're right ... but 0.5 suggests you're trying to solve a rounding issue. 1E-14 (with IEEE 64-bit doubles) works for 1000 and, at least, for the few values I tested between 0 and 2^64-1. – pmg Sep 30 '09 at 13:43
2

Updated: it's due to precision and rounding errors

Mitch Wheat
0

If you want your result as an integer, you should probably round and not just cut off the part after the decimal point.

You are probably getting something like 6.999999 and rounding it down to 6.

tliff
  • I actually cannot round, because when you have log10(100) and log10(99), the second one is slightly below 2, so if I rounded, it would give me the same result as for 100, which is wrong. I could use the floor() function to round down, though – Jakub Arnold Sep 30 '09 at 11:01
0

With (int) casting, you're cutting off the necessary decimal part. Try printing them as doubles without casting (why are you casting anyway?), and you'll be fine.

Michael Foukarakis
0

Print out the intermediate results, i.e. log(1000), log(10), log(1000)/log(10) and log10(1000). This should give better hints than guessing.
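In Java, one possible way to do this is sketched below. The class and helper names are invented for illustration. It uses the BigDecimal(double) constructor, which converts the exact binary value of a double, so nothing is hidden by the shortened decimal form that println normally shows.

```java
import java.math.BigDecimal;

// Hypothetical demo class; prints the exact value stored in each intermediate double.
class IntermediateValues {
    // new BigDecimal(double) expands the stored binary value exactly.
    static String exact(double value) {
        return new BigDecimal(value).toPlainString();
    }

    public static void main(String[] args) {
        double logThousand = Math.log(1000);
        double logTen = Math.log(10);
        System.out.println("log(1000)         = " + exact(logThousand));
        System.out.println("log(10)           = " + exact(logTen));
        System.out.println("log(1000)/log(10) = " + exact(logThousand / logTen));
        System.out.println("log10(1000)       = " + exact(Math.log10(1000)));
    }
}
```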

Secure