4

The core R engine has a serious flaw with the way it expresses output from the Modulus operation:

ceiling((1.99 %% 1) * 100)

Returns: 99 (correct)

ceiling((2.99 %% 1) * 100)

Returns: 100 (incorrect)

The same behavior shows up for other values of the form N + 0.99, such as 3.99. If this is tied to a floating point representation, the system is not expressing the full details of the difference. This is especially disturbing because:

Both (1.99 %% 1) and (2.99 %% 1) appear to return 0.99.

Both ((1.99 %% 1) * 100) and ((2.99 %% 1) * 100) appear to return 99.

However, if you do any rounding or similar mathematical operations, the invisible residual value for 2.99 flips things in an unexpected way.
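One way to check whether two values that print identically are really the same double is to compare them directly. The following is a minimal sketch using only base R (`==` and `all.equal()`); the variable names `a` and `b` are illustrative and not part of the original question.

```r
a <- 1.99 %% 1
b <- 2.99 %% 1

# Both print as 0.99 under the default number of significant digits
print(a)
print(b)

# An exact comparison reveals whether the stored doubles differ
a == b

# all.equal() compares within a tolerance (about 1.5e-8 by default),
# which is usually the safer test for floating-point values
isTRUE(all.equal(a, b))
```

If the exact comparison is FALSE while the tolerant one is TRUE, the two values differ only by a small representation error.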

While solving this problem for my current application is trivial:

floor((2.99 - floor(2.99)) * 100)

Returns: 99 (correct)

sprintf("%.22f", floor((2.99 - floor(2.99)) * 100))

Returns: 99.0000000000000000000000 (correct)
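Note that the workaround above changes two things at once: it replaces `%%` with a subtraction, and it replaces `ceiling()` with `floor()`. The sketch below separates the two, and also shows a possible alternative based on `round()`. The helper name `cents_part` is hypothetical, and the approach assumes the input is meant to have at most two decimal places.

```r
# Hypothetical helper (not from the original post): extract the two-digit
# fractional part of x, assuming x is meant to have at most two decimals.
cents_part <- function(x) {
  # round() absorbs a small representation error in either direction,
  # whereas floor() and ceiling() each go wrong when the error has the
  # unlucky sign
  round((x %% 1) * 100)
}

frac_mod <- 2.99 %% 1
frac_sub <- 2.99 - floor(2.99)

frac_mod == frac_sub        # do the two approaches yield the same double?
ceiling(frac_sub * 100)     # ceiling() applied to the subtraction variant
cents_part(c(1.99, 2.99, 3.99))
```

If the first comparison is TRUE, the difference in results comes from the choice of rounding function rather than from `%%` itself.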

... I wonder how many other cases there are in which Modulus returns bad values without the underlying detail to show the floating point delta. Is there a way to expose the underlying residual value which Modulus seems to attach? It's otherwise invisible.

EDIT: As per the generous example from andrew.punnett below, print(1.99, digits = 22) returns 1.99 (no float expansion), while print(1.99 %% 1, digits = 22) returns 0.98999999999999999. As per the astute eye of Aaron, this appears to be version and / or system dependent.

Thanks!

  • 1
    Hi, and welcome to StackOverflow! I fear you're about to have a poor first experience and so will suggest some edits to your question to head off some of this and help you hopefully get a better answer. 1) Some will assume you know nothing about floating point errors and explain condescendingly; so I suggest you edit to include that this may be an underlying reason and you want to know how to deal with this better 2) Some will take offense at pointing out an "error/flaw," again likely pointing to the floating-point issue; I suggest you rephrase that to refer to "behavior" instead. Best wishes!! – Aaron left Stack Overflow May 17 '18 at 01:18
  • I appreciate the feedback; I've made some edits. If this issue is tied to a floating point representation of a value, then the engine doesn't express it with the full set of digits -- it just shows 0.99, and that's it. In other words, (2.99 %% 1) and (2.99 - floor(2.99)) should return the same thing. They appear to return the same thing -- the engine doesn't show any difference. But the actual numbers returned are different, and R doesn't expand the float to show that difference. – TuringTester1912 May 17 '18 at 01:30
  • 1
    There also seem to be some system specific things going on; for `print(1.99, digits = 22)` I get `1.989999999999999991118`. – Aaron left Stack Overflow May 17 '18 at 02:00
  • 1
    of possible interest: [Converting non-integer decimal numbers to binary](https://stackoverflow.com/a/38844546/210673) – Aaron left Stack Overflow May 17 '18 at 02:03
  • 2
    Thanks Aaron! I'm *floored* (hah) that there's a difference in the expansion of 1.99 across versions -- that's really weird. But a big thanks for the binary representation link! I may use that in the very near future. – TuringTester1912 May 17 '18 at 02:13
  • 1
    Possible answers [here as well](https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal) – kangaroo_cliff May 17 '18 at 02:35

1 Answer

6

This isn't really a bug in R; it is a property of floating-point arithmetic.

The problem arises because neither 1.99 nor 2.99 can be represented exactly as a floating-point number. The closest value to 2.99 that can be stored in a double-precision (64-bit) floating-point number is 2.99000000000000021316282072803 (try the conversion here).

Therefore the expression evaluates as:

ceiling((2.99 %% 1) * 100) = ceiling(99.000000000000021316282072803)
                           = 100

In contrast, the nearest representation of 1.99 is 1.989999999999999991118215803, which happens to give the answer you expect:

ceiling((1.99 %% 1) * 100) = ceiling(98.9999999999999991118215803)
                           = 99

Both results are correct with respect to IEEE 754 floating-point arithmetic, but as you have seen only one agrees with the result you would get by applying the rules of real-number arithmetic.
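The two cases can be inspected side by side. This is a small sketch in base R; the variable names `x` and `scaled` are illustrative, and the exact digits printed by `sprintf()` may vary by platform, so no output is shown.

```r
x <- c(1.99, 2.99)

# Stored values, printed with more digits than the default
sprintf("%.20f", x)

# Fractional parts scaled by 100, before any rounding function is applied
scaled <- (x %% 1) * 100
sprintf("%.20f", scaled)

# ceiling() moves any value even slightly above 99 up to 100
ceiling(scaled)
```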

This problem is compounded by the fact that, by default, R shows only 7 significant digits when you print() a floating-point number. If you want to see more digits, then you must supply a digits parameter:

print(1.99, digits = 22)

However, even this doesn't give you the correct number of digits on all platforms, so a more reliable way to accurately view a floating-point number is:

cat(sprintf("%.22f\n", 1.99))
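
Two other `sprintf()` formats may also be useful here. The sketch below assumes that 17 significant digits (`%.17g`) are enough to tell any two distinct doubles apart, and uses the `%a` format, which prints the stored value in hexadecimal floating-point notation. The variable name `x` is illustrative.

```r
x <- 2.99 %% 1

# 17 significant digits are enough to distinguish two different doubles
sprintf("%.17g", x)
sprintf("%.17g", 0.99)

# Hexadecimal floating-point notation shows the stored bits exactly
sprintf("%a", x)
sprintf("%a", 0.99)
```

If the paired strings differ, the two values are different doubles even though both print as 0.99 by default.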
andypea
  • Thanks very much for the detailed information! The main issue I have is that R obfuscates that floating point representation -- when I call the value from the containing variable, it does not return 99.000000000000021316282072803 or 98.9999999999999991118215803 (both of which would be useful), it just returns 0.99. Seeing *just* 0.99 is misleading in that it appears as an equivalent to a Decimal Data Type. I appreciate the heads up for specifying the digits with print(), thx! – TuringTester1912 May 17 '18 at 01:45
  • 2
    I think what @andrew.punnett is saying, though, is that R *does* return that; it just doesn't *display* it. – Aaron left Stack Overflow May 17 '18 at 01:46
  • 1
    I also find that really annoying. In fact, if you try the command `print(1.99, digits = 22)` you'll see that R insists the number is really 1.99. I'm not sure why – andypea May 17 '18 at 01:47
  • 3
    PS. Nice answer, clear but not condescending. :) Thanks. – Aaron left Stack Overflow May 17 '18 at 01:54
  • 2
    There also seem to be some system specific things going on; for `print(1.99, digits = 22)` I get `1.989999999999999991118`. – Aaron left Stack Overflow May 17 '18 at 02:02
  • 1
    Thanks for the info. I get that result with sprintf("%.21e", 1.99), but not when using print() for some reason. – andypea May 17 '18 at 02:07
  • 1
    From the print.default help page: "Note that for large values of digits, currently for digits >= 16, the calculation of the number of significant digits will depend on the platform's internal (C library) implementation of sprintf() functionality.". It turns out that for my system (Windows 10) print() will never display more than 17 digits past the decimal point. – andypea May 17 '18 at 04:23
  • 1
    Great answer! @TuringTester1912 I just wanted to mention, that you can also set a default for the number of digits to print: i.e. if you always want to see the full number of digits that `print` can show, use `options(digits = 22)` to set that as default behaviour for your session. – Mikko Marttila May 18 '18 at 09:33
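
A minimal sketch of the session-wide setting mentioned in the last comment. It uses 17 rather than 22 digits, on the assumption that 17 significant digits are enough to identify a double and are less affected by the platform differences noted above. The previous setting is saved in `old` (an illustrative name) and restored afterwards.

```r
# options() returns the previous values, so they can be restored later
old <- options(digits = 17)

print(1.99 %% 1)
print(2.99 %% 1)

# Restore the earlier setting for the rest of the session
options(old)
```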