        println(log(it.toDouble(), 10.0).toInt() + 1) // n1
        println(log10(it.toDouble()).toInt() + 1)     // n2

I had to count the "length" of a number in base n for needs unrelated to this question, and stumbled upon a bug (or rather unexpected behavior): for it == 1000 these two functions give different results, n1(1000) = 3 and n2(1000) = 4.

Checking the values before the conversion to Int resulted in: n1_double(1000) = 3.9999999999999996, n2_double(1000) = 4.0
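
Roughly what that check looked like (a minimal standalone sketch, with a plain `n` in place of the lambda's `it`):

    import kotlin.math.log
    import kotlin.math.log10

    fun main() {
        val n = 1000
        println(log(n.toDouble(), 10.0) + 1) // 3.9999999999999996
        println(log10(n.toDouble()) + 1)     // 4.0
    }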

I understand that some floating-point arithmetic magic is involved, but what is especially weird to me is that for 100, 10000 and other inputs I checked, n1 == n2. What is special about it == 1000? How do I ensure that log gives me the intended result (4, not 3.99...)? Right now I can't even figure out which cases I need to double-check, since the failures aren't simply all powers of 10; it is 1000 (and probably some other numbers) specifically.

I looked into the implementations of log() and log10(). log is implemented as

if (base <= 0.0 || base == 1.0) return Double.NaN
return nativeMath.log(x) / nativeMath.log(base) // log() here is a natural logarithm

while log10 is implemented as

return nativeMath.log10(x)
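
Those two lines can be reproduced with the public kotlin.math functions (a sketch; `logByDivision` is just my stand-in name for the quoted body):

    import kotlin.math.ln
    import kotlin.math.log10

    // Stand-in for the quoted log(x, base) body; ln() is the natural logarithm.
    fun logByDivision(x: Double, base: Double): Double {
        if (base <= 0.0 || base == 1.0) return Double.NaN
        return ln(x) / ln(base)
    }

    fun main() {
        println(logByDivision(1000.0, 10.0)) // 2.9999999999999996 here
        println(log10(1000.0))               // 3.0
    }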

I suspect the division in the first case is the reason for the error, but I can't figure out why it causes an error only in specific cases. I also found this question: Python math.log and math.log10 giving different results

But I already know that one is more precise than the other. However, there is no log10-style dedicated function for an arbitrary base n, so I'm curious about the reason WHY it is specifically 1000 that goes wrong.

PS: I understand there are ways of calculating the length of a number without floating-point arithmetic and base-n logarithms, but at this point it is a scientific curiosity.

Luoencz
  • Use `roundToInt()` instead of `toInt()`. – Tenfour04 Feb 16 '23 at 15:58
  • `log` reduces by factors of 2 first before applying the polynomial approximation for `log(1+x)`. This is easy as the exponent of 2 is part of the floating-point format. A factor 2^k gives addition by k*log(2) in the logarithm. It is likely that `log10` first reduces by powers of 10, perhaps using that 10^3~2^10. In the second route there are no floating-point operations that can cause floating-point noise. – Lutz Lehmann Feb 16 '23 at 16:07
  • 1
  • @LutzLehmann: The latter is unlikely. `log10` is more likely written to separate the exponent e and significand f and compute (using some form of extended precision) log10(f)+e•log[10](2). – Eric Postpischil Feb 16 '23 at 18:21
  • 2
  • Different results would be entirely explained by `log10` using a routine customized to compute the base-ten logarithm and providing a highly accurate result, whereas `log` performs the logarithms separately and divides, as you show. That introduces additional floating-point rounding errors due to having two logarithm evaluations and a division. The extra rounding errors cause a different result. The fact that the results are the same for some operands may be due to the roundings sometimes rounding up and sometimes down, so they sometimes happen to coincide. – Eric Postpischil Feb 16 '23 at 18:23
  • @EricPostpischil 2nd comment certainly good enough for an answer. – chux - Reinstate Monica Feb 16 '23 at 18:25
  • Re “WHY it is specifically 1000 that goes wrong”: There is likely nothing special about this. Examined at the point where rounding occurs, sometimes we see .2379…, which rounds down, sometimes .8912…, which rounds up, and so on. For 1000, the part that could not be represented simply happened to round in the direction that matched the `log10` result. It is likely mere happenstance, not the result of design or any profound pattern in the numbers. – Eric Postpischil Feb 16 '23 at 18:26
  • @chux: But then I would feel compelled to reproduce the results and report the exact numbers involved. I have reproduced it in C; `log(x) / log(10)` differs from `log10(x)` at 1,000 but not 1, 10, 10,000, or 100,000. (That is using Apple’s implementation, probably Steve Canon’s.) The problem is now I have a choice between firing up Maple and getting the ultra-high precision logarithms and quotients for display or going to make lemon bars. – Eric Postpischil Feb 16 '23 at 18:30
  • 1
  • @EricPostpischil Going to the bars is best. – chux - Reinstate Monica Feb 16 '23 at 18:32
  • 1
  • @Luoencz, Note that `3.9999999999999996` is the largest encodable value less than 4.0, so the result is only 1 [ULP](https://en.wikipedia.org/wiki/Unit_in_the_last_place) away (a quick check of this and of `roundToInt()` is sketched right after these comments). – chux - Reinstate Monica Feb 16 '23 at 18:41
  • [Here is a previous question about patterns in floating-point numbers with powers of 10.](https://stackoverflow.com/questions/61469054/why-floating-number-is-different-every-hundreds/61582647#61582647) – Eric Postpischil Feb 16 '23 at 19:38
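
A quick sketch checking the two suggestions from the comments (rounding instead of truncating, and the 1-ULP distance), using only kotlin.math:

    import kotlin.math.log
    import kotlin.math.nextUp
    import kotlin.math.roundToInt
    import kotlin.math.ulp

    fun main() {
        val v = log(1000.0, 10.0) + 1 // 3.9999999999999996
        println(v.toInt())            // 3 -- toInt() truncates toward zero
        println(v.roundToInt())       // 4 -- rounding gives the intended digit count
        println(v.nextUp() == 4.0)    // true: v is the very next Double below 4.0
        println(v.ulp)                // 4.440892098500626E-16 -- v is 1 ULP from 4.0
    }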

1 Answer


but I can't figure out why it causes an error only in specific cases.

return nativeMath.log(x) / nativeMath.log(base) 
//log() here is a natural logarithm

Consider x = 1000 and nativeMath.log(x). The natural logarithm is not exactly representable. It is near

6.90775527898213_681... (Double answer) 
6.90775527898213_705... (closer answer) 

Consider base = 10 and nativeMath.log(base). The natural logarithm is not exactly representable. It is near

2.302585092994045_901... (Double answer)
2.302585092994045_684... (closer answer)

The only exactly correct nativeMath.log(x) for a finite x is when x == 1.0.

The quotient 6.90775527898213681... / 2.302585092994045901... is not exactly representable either. It is near 2.9999999999999995559...
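
One way to see the exact values stored in the doubles above (a sketch; the BigDecimal(Double) constructor exposes a Double's exact value with no decimal rounding):

    import java.math.BigDecimal
    import kotlin.math.ln
    import kotlin.math.log10

    fun main() {
        val lnX = ln(1000.0)
        val lnBase = ln(10.0)
        println(BigDecimal(lnX))           // 6.90775527898213681...
        println(BigDecimal(lnBase))        // 2.302585092994045901...
        println(BigDecimal(lnX / lnBase))  // 2.9999999999999995559...
        println(BigDecimal(log10(1000.0))) // 3, exactly
    }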

The conversion of the quotient to text is not exact.

So we have 4 computation errors with the system giving us a close (rounded) result instead at each step.

Sometimes these rounding errors cancel out in a way we find acceptable and the value of "3.0" is reported. Sometimes not.


Performed with higher-precision math, it is easy to see that log(1000) came out less than the higher-precision answer while log(10) came out more. These two round-off errors in opposite directions both push the quotient down, leaving it 1 ULP lower than hoped.

When log(x, 10) is computed for another power-of-10 x and log(x) happens to come out slightly more than the higher-precision answer, I'd expect the quotient to end up 1 ULP off less often. Perhaps it is roughly 50/50 across all powers of 10.
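
A rough sketch to check that hunch (which powers of 10 actually disagree will depend on the platform's ln/log10 implementations):

    import kotlin.math.ln
    import kotlin.math.log10

    fun main() {
        var x = 1.0
        for (k in 0..15) {                     // 10^k is exact as a Double for k <= 15
            val viaDivision = ln(x) / ln(10.0) // the log(x, 10.0) route
            val direct = log10(x)              // the log10(x) route
            val note = if (viaDivision == direct) "match" else "differ"
            println("10^$k: $viaDivision vs $direct ($note)")
            x *= 10.0
        }
    }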


log10(x) is designed to compute the logarithm in a different fashion, exploiting that the base is 10.0, and it is certainly exact for powers of 10.

chux - Reinstate Monica