There is no problem extracting the fractional part of a floating-point number; this can always be done exactly. Whenever two `double` numbers close to each other (within a factor of two) are subtracted, the result is always exact. And whenever the modulus of two `double` numbers is not greater in magnitude than either operand,¹ the result is always exact.
Thus, in your examples, the results of `1.4 % 1` and `1.4 - 1` are each exact; there is no arithmetic error in the operation.
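The exactness of both operations can be checked with exact rational arithmetic. The following is a minimal sketch in Python rather than PowerShell, on the assumption that Python's `float` is the same IEEE-754 binary64 format as PowerShell's `double`. It uses `math.fmod` because Python's own `%` follows a different sign convention.

```python
from fractions import Fraction
import math

x = 1.4  # already rounded when the numeral was converted to binary64

# Fraction(x) is the exact rational value stored in x, so this
# difference is computed with no rounding at all.
exact_fraction_part = Fraction(x) - 1

# Compare each floating-point result against the exact value.
subtraction_is_exact = Fraction(x - 1) == exact_fraction_part
fmod_is_exact = Fraction(math.fmod(x, 1)) == exact_fraction_part

print(subtraction_is_exact, fmod_is_exact)  # True True
```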
The reason the results are not equal to `0.4` is that, before the operations, there are already rounding errors in `1.4` and `0.4`. The operation that converts the decimal numerals “1.4” and “0.4” to `double` must round its result, because 1.4 and 0.4 are not representable in the `double` format.
In the `double` format, all numbers are represented as a sign and a 53-bit integer multiplied by some power of two.² Using `1.4` in PowerShell results in +6305039478318694•2^−52, which equals 1.399999999999999911182158029987476766109466552734375. Using `0.4` results in +7205759403792794•2^−54, which equals 0.40000000000000002220446049250313080847263336181640625.
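These decompositions can be reproduced with a short sketch. It is written in Python on the assumption that its `float` is the same binary64 format as PowerShell's `double`; `decompose` is a hypothetical helper name, not a library function.

```python
from decimal import Decimal
import math

def decompose(x):
    """Return (n, e) with x == n * 2**e, n a 53-bit integer (nonzero normal x assumed)."""
    m, e = math.frexp(x)           # x == m * 2**e with 0.5 <= |m| < 1
    return int(m * 2**53), e - 53  # scaling by a power of two is exact

for value in (1.4, 0.4):
    n, e = decompose(value)
    # Decimal(value) prints the exact decimal expansion of the stored double.
    print(f"{n} * 2**{e} = {Decimal(value)}")
```

The loop prints the 53-bit integer, the power of two, and the exact decimal value for each of the two numerals.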
As you can see, subtracting 1 from 1.399999999999999911182158029987476766109466552734375 gives 0.399999999999999911182158029987476766109466552734375, which is clearly not equal to 0.40000000000000002220446049250313080847263336181640625. Thus `(1.4 - 1) -ne 0.4` is true. Similarly, `1.4 % 1` gives the same result, so `(1.4 % 1) -ne 0.4` is also true.
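The size of the mismatch can be measured exactly. This is a small Python sketch, again assuming Python's `float` matches PowerShell's `double`. It shows that the two values differ by exactly 2^−53.

```python
from fractions import Fraction

difference = 1.4 - 1  # exact result of subtracting 1 from the double 1.4
literal = 0.4         # the double nearest to the real number 0.4

print(difference == literal)                     # False
print(Fraction(literal) - Fraction(difference))  # 1/9007199254740992, i.e. 2**-53
```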
There is no operation on `1.4` that will “extract” just the fraction part and give 0.40000000000000002220446049250313080847263336181640625, because its fraction part is not 0.40000000000000002220446049250313080847263336181640625. The problem is not in the extraction operation; the problem is that `1.4` has already lost accuracy and no longer records the complete .4 part.
In contrast, because 0.4 is lower in magnitude, it can be approximated using a scale of 2^−54 instead of the 2^−52 required for 1.4. That smaller scale means its approximation can be finer, so it is more accurate: the `double` value `0.4` is closer to the real number 0.4 than the `double` value `1.4` is to the real number 1.4.
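The two representation errors can be compared directly with exact rationals. In this Python sketch (same binary64 assumption as above), the error in `1.4` comes out exactly four times the error in `0.4`, matching the factor of four between the scales 2^−52 and 2^−54.

```python
from fractions import Fraction

# Distance between each stored double and the real number it approximates.
error_in_1_4 = abs(Fraction(1.4) - Fraction(14, 10))
error_in_0_4 = abs(Fraction(0.4) - Fraction(4, 10))

print(float(error_in_1_4))          # about 8.9e-17
print(float(error_in_0_4))          # about 2.2e-17
print(error_in_1_4 / error_in_0_4)  # 4
```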
There is no general fix for this. Floating-point arithmetic is designed to approximate real-number arithmetic, and it generally should not be used in situations where you need exact real-number results. Some limited exact arithmetic can be done in certain situations, but those computations must be carefully designed.
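One such carefully designed situation is to keep values in a decimal type from the start, so that the numeral “1.4” is never rounded to binary. The sketch below uses Python's `decimal` module as an illustration. It is an assumption that a decimal type suits your use case, and a similar approach in PowerShell would rely on its `[decimal]` type.

```python
from decimal import Decimal

x = Decimal("1.4")  # built from the string, so no binary rounding occurs
fraction_part = x % 1

print(fraction_part)                    # 0.4
print(fraction_part == Decimal("0.4"))  # True
```

This only moves the boundary: values such as 1/3 have no finite decimal expansion and are still rounded.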
Footnotes
1 This is always true for a symmetric modulus, where `x % y` returns a value in [−|y/2|, +|y/2|]. I suspect Microsoft uses that in PowerShell, but their documentation does not say. An asymmetric modulus, such as one that returns a value in [0, |y|), can have a rounding error.
2 There are other descriptions of the `double` format in which a number is a sign, a bit, a radix point (written as a period), and 52 more bits multiplied by a power of two. These descriptions are mathematically equivalent because the bounds on the allowed powers of two are adjusted correspondingly. The integer-based description is generally easier for number-theoretic work, although the IEEE-754 standard uses the fraction-based description.
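The rounding error mentioned in footnote 1 can be seen in a language whose `%` is asymmetric. Python's `%` on floats is used here purely as an example of a modulus with results in [0, |y|); it says nothing about which convention PowerShell follows.

```python
import math

x, y = -1e-20, 1.0

truncated = math.fmod(x, y)       # sign follows x; magnitude at most |x|
symmetric = math.remainder(x, y)  # IEEE-754 remainder, in [-|y|/2, +|y|/2]
floored = x % y                   # Python's %, result intended for [0, |y|)

print(truncated == x, symmetric == x)  # True True: both are exact
print(floored)                         # 1.0: the true value 1 - 1e-20 was rounded
```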