3

I have a floating point division of the form 1.0f / x with x as a float. How would I check beforehand whether the x is so close to 0.0f that the result would be +-inf / undefined? I'm not sure if the epsilon from std limits is enough.

Regards.

phuclv
  • 37,963
  • 15
  • 156
  • 475
Desperado17
  • 835
  • 6
  • 12
  • You can convert your floating point number in the string and then you can check the number after decimal. By checking number of 0s after decimal, you can decide how close this number is to '0.0f'. – krpra Sep 24 '21 at 10:22
  • 4
    @krpra You can do that with small floating point constants without having to convert to a string. – Rup Sep 24 '21 at 10:24
  • 2
    This doesn't sound like what I want. What I basically want is an epsilon that is large enough so I can be sure that for every x larger than it 1.0f / x will not yield inf or undefined. – Desperado17 Sep 24 '21 at 10:27
  • I don't understand these questions that try to predict the future. Why wouldn't you just do the division and examine the result? – user207421 Sep 25 '21 at 03:57
  • @user207421 Presumably to avoid UB if `x` is `0`..? – Jesper Juhl May 09 '23 at 21:20

3 Answers3

7

Prerequisites

C++ does not mandate IEEE-754 or a particular rounding method. For this answer, I will assume IEEE-754 is used with a binary format and round-to-nearest, ties-to-even.

Conclusion

1/x overflows iff fabs(x) <= std::ldexp(1, -std::numeric_limits<float>::max_exponent). For a constant expression, you can use std::numeric_limits<float>::min()/4.

Discussion

The choice of rounding direction at the end of the finite range is made as if the exponent range kept going. E.g., using decimal for illustration, if the highest representable finite number were 9.99•1017, the next representable number, if the exponent were not limited, would be 1.00•1018. The midpoint between those two is 9.995•1017, so numbers under that are rounded down and numbers above that are rounded up. With ties-to-even, 9.995•1017 is rounded up.

For a binary format, the greatest representable value is (2−ε)•2q, where ε is the “machine epsilon” (the ULP of 1, so 2-ε is the greatest representable significand) and q is the maximum exponent. Then the point where rounding occurs is (2−½ε)•2q.

For positive x, if 1/x < (2−½ε)•2q, the result is rounded downward. Otherwise, it is rounded upward, to ∞. Thus, the result is less than ∞ iff x > 1/((2−½ε)•2q) = 2q/(2-½ε).

1/(2-½ε) is slightly greater than ½, by less than ½ε, so the greatest representable value less than or equal to it is ½. Thus, for positive x, the result of 1/x is less than ∞ iff x > 2q/2 = 2q−1, and the situation is symmetric for negative x.

C++ tells us the maximum exponent with std::numeric_limits<double>::max_exponent (defined in the header <limits>). However, C++ calibrates this exponent for a significand range of [½, 1) instead of IEEE-754’s [1, 2), so it is one greater than q. Thus the −q−1 we want is simply -std::numeric_limits<double>::max_exponent.

We can calculate 2q−1 with the ldexp function (declared in <cmath>): std::ldexp(1, -std::numeric_limits<float>::max_exponent).

With Apple Clang 11, this program:

#include <cmath>
#include <iomanip>
#include <iostream>
#include <limits>


int main(void)
{
    float x = std::ldexp(1, -std::numeric_limits<float>::max_exponent);

    std::cout << std::setprecision(20) << x << " is too small, result will overflow:\n";
    std::cout << "\t" << 1/x << ".\n";

    x = std::nexttoward(x, INFINITY);

    std::cout << std::setprecision(20) << x << " is just big enough, result will not overflow:\n";
    std::cout << "\t" << 1/x << ".\n";
}

produces:

2.9387358770557187699e-39 is too small, result will overflow:
    inf.
2.9387372783541830947e-39 is just big enough, result will not overflow:
    3.4028220466166163425e+38.

Accounting for negative numbers as well, 1/x overflows iff fabs(x) <= std::ldexp(1, -std::numeric_limits<float>::max_exponent).

Because of the way IEEE-754 specifies the exponent range, std::ldexp(1, -std::numeric_limits<float>::max_exponent) equals std::numeric_limits<float>::min()/4. (IEEE-754 specifies the minimum normal exponent to be 1−q, so the −q−1 we desire is (1-q)-2.)

Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312
  • Is there a reliable way to check for IEEE 754 support nowadays in C++? Some claim even std::numeric_limits::is_iec559; isn't 100% reliable. – Desperado17 Sep 24 '21 at 14:44
  • 1
    That's curious, do you know on what basis those claims of unreliability are made? – saxbophone Sep 24 '21 at 15:51
  • 1
    @saxbophone https://stackoverflow.com/questions/5777484/how-to-check-if-c-compiler-uses-ieee-754-floating-point-standard The discussion after the answer. – Desperado17 Sep 24 '21 at 16:30
6

We could search for the limit using trial and error:

#include <iostream>
#include <limits>

#include <cmath>

int main() {
    float limit = 0.0f;
    float result = 1.0f / limit;
    while (
        result == std::numeric_limits<float>::infinity()
        or std::isnan(result)
    ) {
        limit = std::nextafter(limit, 1.0f);
        result = 1.0f / limit;
    }
    std::cout << "Limit = " << limit << std::endl;
    std::cout << "1.0f / Limit = " << 1.0f / limit << std::endl;
}

This outputs on my system:

Limit = 2.93874e-39
1.0f / Limit = 3.40282e+38

This is however, not a very efficient solution. If we could make this algorithm constexpr, this would mitigate the issue but unfortunately std::nextafter() isn't constexpr.

If you know your environment is using IEEE-754 airthmetic, these limits are probably constant, but as you ask for portability, we can't always assume that.

saxbophone
  • 779
  • 1
  • 6
  • 22
  • 1
    this can be made `constexpr` so that the constant is only calculated once at compile time – phuclv Sep 24 '21 at 11:22
  • 1
    I also did some similar tests and arrived at a number *very* similar to `2.93874e-39`. But what surprises me is that that value is ***less than*** `std::numeric_limits::min()` (which is `1.17549e-38`). Did I miss something? – Adrian Mole Sep 24 '21 at 11:24
  • I think `min()` might be smallest non subnormal or denormal float value. Perhaps the quoted figure is one of the subnormals or denormals? is the value smaller than `std::numeric_limits::denorm_min()` ? – saxbophone Sep 24 '21 at 11:25
  • Alas, `nextafter()` does not appear to be `constexpr`, so I'm not sure this implementation can be made `constexpr`. – saxbophone Sep 24 '21 at 11:39
0

Since you're looking for constant values, we can actually use an SMT solver to find the minimum/maximum values of x where the division 1/x will produce infinity. I did this using Microsoft's z3 SMT solver (https://github.com/Z3Prover/z3), scripting it from Haskell SBV bindings (http://leventerkok.github.io/sbv/). I get:

Prelude Data.SBV> optimize Lexicographic $ do x <- sFloat "x"; constrain (fpIsInfinite (1/x)); minimize "x" x
Optimal model:
  x   = -2.938736e-39 :: Float
  x_0 =    2145386495 :: Word32
Prelude Data.SBV> optimize Lexicographic $ do x <- sFloat "x"; constrain (fpIsInfinite (1/x)); maximize "x" x
Optimal model:
  x   = 2.938736e-39 :: Float
  x_0 =   2149580800 :: Word32

If you squint at this, you'll find that the values it's suggesting are between -2.938736e-39 and 2.938736e-39 where 1/x becomes infinity.

If you want to write this "portably" without any rounding issues, you should use the hexadecimal notation, i.e., -0x1p-128 and 0x1p-128.

I believe these numbers match @Eric Postpischill's values; and his analysis is very useful of course, but this is another way of finding such values using automated theorem-proving techniques.

alias
  • 28,120
  • 2
  • 23
  • 40