I am dealing with floating-point values that are very small in magnitude, and I would like a value to round to 0 whenever it cannot be represented by a double because it is too small in magnitude.

This code:

#include <iostream>
#include <cmath>
#include <typeinfo>

int main() {
  double x{100.0};
  double y{(-1.0) * 1.7976931348623157e308};
  double z{std::pow(x, y)};
  std::cout << typeid(z).name() << ": " << z << std::endl;
  std::cout << (z == 0) << std::endl;
  return 0;
}

prints

$ ./a.out
d: 0
1

for me (clang version 11.0.0, -std=c++11), as desired.

Question: Do expressions whose result is too small in magnitude to be represented by a double always evaluate to 0 when assigned to a double (including with other C++ compilers)? If not, how can I achieve this behavior, or test whether such an expression is too small in magnitude?

Edit: As @Eljay pointed out, I could test if the expression results in a denormalized double.

The solution then would be to test std::fpclassify(z) == FP_SUBNORMAL and, if so, to set z equal to zero. This solves my problem.

I should point out that in my question I asked for setting z equal to zero when the expression assigned to it is not representable as a double due to floating-point underflow. A double that classifies as FP_SUBNORMAL is representable by a double, so technically the answer by @eerorika is correct.
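A minimal sketch of the check described in the edit, assuming that treating every subnormal value as zero is acceptable for the application. The helper name `flush_subnormal_to_zero` is hypothetical and not part of the standard library; only `std::fpclassify`, `FP_SUBNORMAL` and `std::copysign` come from `<cmath>`.

```cpp
#include <cmath>

// Hypothetical helper: returns a zero with the same sign as x when x is
// subnormal (denormalized), and returns x unchanged otherwise.
double flush_subnormal_to_zero(double x) {
  if (std::fpclassify(x) == FP_SUBNORMAL) {
    return std::copysign(0.0, x);  // keep the sign, drop the magnitude
  }
  return x;  // zero, normal, infinite and NaN values pass through untouched
}
```

Calling it as `z = flush_subnormal_to_zero(z);` leaves a result that already underflowed to zero, such as `z` in the example above, unchanged.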

Jasper Braun
  • Does this answer your question? [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – ChrisMM May 24 '20 at 16:45
  • Check if the number is denormalized, then set it to zero. – Eljay May 24 '20 at 16:49
  • I don't fully understand your question. Are you asking whether a `double` can hold a value that is non-zero but a small (yet unspecified) number? – geza May 24 '20 at 16:57
  • @ChrisMM I can't figure out how it would. – Jasper Braun May 24 '20 at 17:03
  • [What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) – Jesper Juhl May 24 '20 at 17:04
  • @geza I am asking if an expression that is assigned to a `double` but evaluates to a floating point too small to be represented as such automatically evaluates to `0`. – Jasper Braun May 24 '20 at 17:05
  • @JesperJuhl This is part of one of the answers of the question ChrisMM suggested. That article seems to go into how floating point arithmetic works in general and the challenges associated with it in computer science, but my question relates to a specific challenge in C++. – Jasper Braun May 24 '20 at 17:08
  • @Eljay I googled "how to denormalize a double" and ended up finding out about `std::fpclassify` defined in `<cmath>`. That function tells me that `z` in the above example is `FP_ZERO`. Is this what you had in mind when you said "check if the number is denormalized"? Does `FP_ZERO` mean I still have to set it to zero? – Jasper Braun May 24 '20 at 17:15
  • Most floating point numbers cannot accurately be stored in a computer, so you get rounding errors. Something which might mathematically be close to zero, but not actually zero, can be represented as `0` in a computer, or as `0.0000...1`. – ChrisMM May 24 '20 at 17:15
  • You'll want to check [`std::fpclassify`](https://en.cppreference.com/w/cpp/numeric/math/fpclassify) against `FP_SUBNORMAL` to see if the variable is denormalized. If the variable is classified as `FP_ZERO`, it's either `+0.0` or `-0.0`. – Eljay May 24 '20 at 17:24
  • @JasperBraun Each expression has a type already, there is no general floating point result. `pow` already returns a `double`. If it is zero, then it is zero. If not, it's not. If mathematically the result of exponentiation is non-zero, but a very small number, which is too small for `double`, then yes, `pow` will return zero. And it may set `errno` to `ERANGE` as well, so you'll know that it has underflowed. – geza May 24 '20 at 17:31
  • @Eljay In the above example `std::fpclassify(z) == FP_SUBNORMAL` evaluates to `false` though. That would not achieve the desired behavior then. Should an expression resulting in a floating point too small in magnitude to be represented by a `double` result in a `double` classified as `FP_SUBNORMAL`? – Jasper Braun May 24 '20 at 17:33
  • @geza Are you sure it will return 0? Isn't default rounding mode implementation-defined? – Daniel Langr May 24 '20 at 17:35
  • @DanielLangr: according to the C++ standard, I think it can return a non-zero value. It doesn't make much sense though, because it is not hard to handle this case correctly. Maybe in edge cases (where the result is halfway between 0 and the smallest subnormal number) we get a non-zero result. But for really small magnitude numbers, I'd be very surprised if an implementation gave a non-zero result instead of zero. – geza May 24 '20 at 17:45
  • @geza Agree. Just out of curiosity, this situation can be simulated. Check this [live demo](https://godbolt.org/z/vLX8mN). (Note that the exponent can be represented exactly.) – Daniel Langr May 24 '20 at 17:50
  • If the example (on your architecture) results in the denormalized float underflowing to zero, then the number is already zero. No further effort to make it zero is necessary. – Eljay May 24 '20 at 18:53
  • @ChrisMM: When somebody asks a specific question about floating-point, please do not mark it as a duplicate of that general question about floating-point. – Eric Postpischil May 24 '20 at 21:23
  • @ChrisMM: “Most floating point numbers cannot accurately be stored in a computer…” A floating-point number is one represented with a sign and some digits multiplied by a base exponentiated by some power. This inherently can be stored in a computer, simply by storing the sign, the digits, and the power. You may mean most real numbers cannot be represented in a computer. – Eric Postpischil May 24 '20 at 21:25
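
The `errno` check mentioned in the comments can be sketched as follows. This is only an illustration under assumptions: the helper name `pow_checked` is hypothetical, and whether `std::pow` reports underflow through `errno`, through the `FE_UNDERFLOW` floating-point exception flag, or through both depends on the implementation (see `math_errhandling`), so the sketch looks at both. The `volatile` variables are there to discourage the compiler from evaluating the call at compile time.

```cpp
#include <cerrno>
#include <cfenv>
#include <cmath>

// Hypothetical helper: computes std::pow(base, exponent) and reports through
// `underflowed` whether the library signalled a range error due to underflow.
double pow_checked(double base, double exponent, bool& underflowed) {
  volatile double b = base;      // volatile: discourage compile-time folding
  volatile double e = exponent;

  errno = 0;                          // reset both reporting channels
  std::feclearexcept(FE_ALL_EXCEPT);

  volatile double result = std::pow(b, e);
  const double value = result;

  const bool flag_raised = std::fetestexcept(FE_UNDERFLOW) != 0;
  // ERANGE is also used for overflow, so exclude infinite results.
  const bool errno_set = (errno == ERANGE) && !std::isinf(value);

  underflowed = flag_raised || errno_set;
  return value;
}
```

With the values from the question, `pow_checked(100.0, -1.7976931348623157e308, underflowed)` would be expected to return 0 and set `underflowed` to `true` on an implementation that reports underflow through either channel.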

1 Answer

> Do expressions that result in a floating point too small in magnitude to be represented by a double always evaluate to 0

Not necessarily. The result depends on the rounding mode. For example, with `FE_UPWARD` a positive result that is too small rounds up to a very small non-zero value instead of 0, assuming rounding modes are supported by the language implementation.

> If not, how can I achieve this behavior

The `FE_TOWARDZERO` rounding mode should behave like that, again assuming rounding modes are supported.
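
The effect of the rounding mode on an underflowing operation can be sketched like this. The helper name `multiply_in_mode` is hypothetical, and the sketch assumes an implementation where `std::fesetround` is supported and the hardware arithmetic honors it. The `volatile` variables are an assumption-driven precaution: they keep the compiler from folding the multiplication at compile time or moving it across the `fesetround` calls.

```cpp
#include <cfenv>

// Hypothetical helper: computes a * b while the given rounding mode is
// active, then restores the rounding mode that was in effect before.
double multiply_in_mode(double a, double b, int mode) {
  volatile double lhs = a;  // volatile: force the multiply to happen at run time
  volatile double rhs = b;

  const int saved_mode = std::fegetround();
  std::fesetround(mode);  // returns nonzero if unsupported; ignored in this sketch

  volatile double product = lhs * rhs;  // rounded according to `mode`

  std::fesetround(saved_mode);
  return product;
}
```

On an IEEE 754 platform, `multiply_in_mode(1e-200, 1e-200, FE_TOWARDZERO)` and the same call with `FE_TONEAREST` would be expected to give 0, while `FE_UPWARD` would be expected to give the smallest positive subnormal value, because the exact product 1e-400 lies below everything a double can represent.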

eerorika
  • I can't observe a difference. After executing `std::fesetround(FE_TOWARDZERO)`, I still get `std::numeric_limits<double>::min() / 10` to evaluate to something other than `0`. It appears that setting rounding mode to `FE_TOWARDZERO` only achieves rounding to an ever-so-slightly smaller very small non-zero result. – Jasper Braun May 24 '20 at 18:13
  • @JasperBraun The problem there is that the result of the division is in fact not less than smallest representable value. `std::numeric_limits<double>::min` does not return smallest representable value, but smallest **normalised** value. Given a floating point representation with subnormals, that division would simply result in a subnormal value. – eerorika May 24 '20 at 18:39
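
The distinction drawn in the last comment can be sketched as follows, assuming a platform whose `double` supports subnormal values and uses the default round-to-nearest mode. The names `smallest_normal`, `smallest_subnormal` and `classify` are hypothetical; `std::numeric_limits<double>::min()`, `std::numeric_limits<double>::denorm_min()` and `std::fpclassify` are the standard facilities involved.

```cpp
#include <cmath>
#include <limits>

// Smallest positive normalised double; this is what min() returns.
constexpr double smallest_normal = std::numeric_limits<double>::min();
// Smallest positive double overall, which is a subnormal value.
constexpr double smallest_subnormal = std::numeric_limits<double>::denorm_min();

// Hypothetical helper: names the floating-point category of x.
const char* classify(double x) {
  switch (std::fpclassify(x)) {
    case FP_ZERO:      return "zero";
    case FP_SUBNORMAL: return "subnormal";
    case FP_NORMAL:    return "normal";
    default:           return "other";  // infinite or NaN
  }
}
```

Under those assumptions, `classify(smallest_normal / 10)` would be expected to report a subnormal value rather than zero, which matches the observation in the comment above. Only results below `smallest_subnormal` can round to zero.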