I'm having issues with the following line of code:

float temp = std::stof("-8.34416e-46");

My program aborts as soon as it reaches this line. It's clear that this value is smaller in magnitude than FLT_MIN, but such floats are allowed to exist. (For example, float temp = -8.34416e-46; works fine.) Is the stof method supposed to only work for values between FLT_MIN and FLT_MAX?

If so, what would be a good alternative to get a string ("-8.34416e-46") into a float?

  • "float temp = -8.34416e-46; works fine" [are you sure?](https://godbolt.org/z/h8653f) – 463035818_is_not_an_ai Dec 05 '20 at 19:09
  • Please clarify "such floats are allowed to exist." They are simply not representable, i.e. they cannot exist. You might be misunderstanding the implicit narrowing conversion going on in `float temp = -8.34416e-46`. [The `std::stof` documentation](https://en.cppreference.com/w/cpp/string/basic_string/stof) is quite clear about throwing exceptions when the representable range is exceeded. – alter_igel Dec 05 '20 at 19:11
  • Re "I am having issues": What kind of issues specifically? I would suggest clarifying the question. – njuffa Dec 05 '20 at 19:24
  • @alterigel: `float` may have representable values that are greater than zero and less than `FLT_MIN`; it is incorrect to say they cannot exist. `FLT_MIN` is the minimum **normal** `float` value and is commonly 2^−126, but `float` may have subnormal values, and the minimum positive value is commonly 2^−149. – Eric Postpischil Dec 05 '20 at 19:35
  • Duplicate of [this question](https://stackoverflow.com/questions/48086830/stdstod-throws-out-of-range-error-for-a-string-that-should-be-valid) except for `float` versus `double` and the second question requesting an alternative. – Eric Postpischil Dec 05 '20 at 19:39
  • @alterigel I had read this question which led me to believe smaller values could exist, though I may have misunderstood it! Thanks for the comment. [link](https://stackoverflow.com/questions/25705287/floats-smaller-than-flt-min-why-flt-true-min) – Jonah Bader Dec 05 '20 at 22:10
  • @largest_prime_is_463035818 I should have clarified. My goal at the time was just to store the value as a float; I never tried printing it. Using "float temp = -8.34416e-46" does not cause my program to crash, while the stof call does. – Jonah Bader Dec 05 '20 at 22:15

1 Answer

An Alternative

Convert to double with std::stod and then assign to float. Add bounds checking if desired.
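A minimal sketch of that approach follows. The helper name `string_to_float` is hypothetical, not a standard function, and the bounds check is only one possible policy (it rejects finite values whose magnitude exceeds the largest float). Note that rounding twice, first to double and then to float, can in rare cases give a result that differs in the last bit from a single correctly rounded conversion.

```cpp
#include <cmath>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the standard library):
// parse the text as a double, then narrow the result to float.
float string_to_float(const std::string& text)
{
    // std::stod still throws std::invalid_argument for non-numeric text and
    // may throw std::out_of_range for values outside the range of double.
    const double wide = std::stod(text);

    // Optional bounds check: reject finite values too large for a float,
    // because narrowing such a value is not well defined.
    if (std::isfinite(wide) &&
        std::fabs(wide) > std::numeric_limits<float>::max()) {
        throw std::out_of_range("string_to_float: magnitude exceeds FLT_MAX");
    }

    // Narrowing picks a neighboring float; very small magnitudes end up
    // as a subnormal value or as zero instead of raising an error.
    return static_cast<float>(wide);
}
```

With this helper, `string_to_float("-8.34416e-46")` returns a value instead of throwing. On an implementation that uses IEEE-754 binary32 with round-to-nearest, the result should be the negative float of smallest magnitude.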

Converting “-8.34416e-46”

The C++ standard allows conversions of strings to float to report underflow if the result is in the subnormal range even though it is representable.

When rounding to the nearest representable value is used, −8.34416•10^−46 is within the range of float (in C++ implementations that use IEEE-754 binary32 for float, which is common), but it is in the subnormal range. The C++ standard says stof calls strtof and then defers to the C standard to define strtof. The C standard indicates that strtof may underflow, about which it says “The result underflows if the magnitude of the mathematical result is so small that the mathematical result cannot be represented, without extraordinary roundoff error, in an object of the specified type.” That is awkward phrasing, but it refers to the rounding errors that occur when subnormal values are encountered. (Subnormal values are subject to larger relative errors than normal values, so their rounding errors might be said to be extraordinary.)

Thus, a C++ implementation is allowed by the C++ standard to underflow for subnormal values even though they are representable.

The smallest positive magnitude in the binary32 format is 2^−149, about 1.4•10^−45. 8.34416•10^−46 is smaller than this, but it is greater than half of 2^−149 (about 7.0•10^−46). That means, between 0 and 2^−149, it is closer to the latter, so conversion with rounding to the nearest representable value will produce 2^−149 rather than zero. Unfortunately, your strtof implementation chooses to report underflow rather than completing a conversion to the nearest representable value.
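To see what a particular strtof implementation does with such input, one option is to call std::strtof directly and inspect errno yourself, since that call does not throw. The sketch below assumes nothing beyond the standard library; the `ParsedFloat` struct and `parse_with_strtof` function are hypothetical names used only for illustration. Whether errno is set to ERANGE for a subnormal result, and which value is returned in that case, is left to the implementation.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical result type, for illustration only.
struct ParsedFloat {
    float value;        // whatever std::strtof returned
    bool range_error;   // true if std::strtof set errno to ERANGE
    bool consumed_all;  // true if the whole string was parsed as a number
};

// Hypothetical helper: wraps std::strtof and reports errors without throwing.
ParsedFloat parse_with_strtof(const std::string& text)
{
    const char* begin = text.c_str();
    char* end = nullptr;

    errno = 0;  // strtof only sets errno on error, so clear it first
    const float value = std::strtof(begin, &end);
    const bool range_error = (errno == ERANGE);

    // The parse is complete if something was consumed and nothing is left.
    const bool consumed_all = (end != begin) && (*end == '\0');

    return ParsedFloat{value, range_error, consumed_all};
}
```

For "-8.34416e-46", a caller can then decide whether a reported range error is acceptable and keep the returned value, which the C standard only requires to be no larger in magnitude than the smallest normal float.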

Normal and subnormal values

For IEEE-754 32-bit binary floating-point, the normal range is from 2^−126 to 2^128−2^104. Within this range, every number is represented with a significand (the fraction portion of the floating-point representation) that has a leading 1 bit followed by 23 additional bits, and so the error that occurs when rounding any real number in this range to the nearest representable value is at most 2^−24 times the position value of the leading bit.

In addition to this normal range, there is a subnormal range from 2^−149 to 2^−126−2^−149. In this interval, the exponent part of the floating-point format has reached its smallest value and cannot be decreased any more. To represent smaller and smaller numbers in this interval, the significand is reduced below the normal minimum of 1. It starts with a 0 and is followed by 23 additional bits. In this interval, the error that occurs when rounding a real number to the nearest representable value may be larger than 2^−24 times the position value of the leading bit. Since the exponent cannot be decreased any further, numbers in this interval have increasing numbers of leading 0 bits as they get smaller and smaller. Thus the relative errors involved in using these numbers grow.
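The boundary between the two ranges can be checked in code. The sketch below uses std::fpclassify from `<cmath>`; the function name `float_category` is hypothetical. It assumes a float type that supports subnormal values and a floating-point environment that does not flush them to zero.

```cpp
#include <cmath>
#include <limits>
#include <string>

// Hypothetical helper: names the category std::fpclassify reports for x.
std::string float_category(float x)
{
    switch (std::fpclassify(x)) {
        case FP_ZERO:      return "zero";
        case FP_SUBNORMAL: return "subnormal";
        case FP_NORMAL:    return "normal";
        case FP_INFINITE:  return "infinite";
        case FP_NAN:       return "nan";
        default:           return "other";  // implementation-defined category
    }
}
```

On a binary32 implementation, `std::numeric_limits<float>::min()` is 2^−126 (the same value as FLT_MIN) and is classified as normal, while `std::numeric_limits<float>::denorm_min()` is 2^−149 and is classified as subnormal.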

For whatever reason, the C++ standard has said that implementations may report underflow in this interval. (The IEEE-754 standard defines underflow in complicated ways and also allows implementations some choices.)

Eric Postpischil