An Alternative
Convert to double with std::stod and then assign to float. Add bounds checking if desired.
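A minimal sketch of that alternative follows. The helper name parse_float_via_double is hypothetical (it is not a standard function), and the sketch assumes float is IEEE-754 binary32 with the default round-to-nearest mode.

```cpp
#include <cmath>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the standard library): parse the text as a
// double, then narrow to float with an explicit upper-bound check.
float parse_float_via_double(const std::string& text)
{
    // std::stod throws std::invalid_argument if no conversion can be performed
    // and std::out_of_range if the value is out of range for double.
    const double wide = std::stod(text);

    // Optional bounds check: reject finite values whose magnitude exceeds the
    // largest finite float, since narrowing such a value is not well defined.
    if (std::isfinite(wide) &&
        std::fabs(wide) > std::numeric_limits<float>::max())
        throw std::out_of_range("value out of range for float: " + text);

    // Narrowing conversion. Values in float's subnormal range are rounded to a
    // representable float here instead of being reported as an error.
    return static_cast<float>(wide);
}
```

Because −8.34416•10^−46 is an ordinary normal value for double, std::stod has no reason to report underflow for it, and under the stated assumptions the narrowing step rounds it to −2^−149.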
Converting “-8.34416e-46”
The C++ standard allows conversions of strings to float to report underflow if the result is in the subnormal range even though it is representable.
When rounding to the nearest representable value is used, −8.34416•10^−46 is within the range of float (in C++ implementations that use IEEE-754 binary32 for float, which is common), but it is in the subnormal range. The C++ standard says stof calls strtof and then defers to the C standard to define strtof. The C standard indicates that strtof may underflow, about which it says “The result underflows if the magnitude of the mathematical result is so small that the mathematical result cannot be represented, without extraordinary roundoff error, in an object of the specified type.” That is awkward phrasing, but it refers to the rounding errors that occur when subnormal values are encountered. (Subnormal values are subject to larger relative errors than normal values, so their rounding errors might be said to be extraordinary.)
Thus, a C++ implementation is allowed by the C++ standard to underflow for subnormal values even though they are representable.
The smallest positive magnitude in the binary32 format is 2^−149, about 1.4•10^−45. 8.34416•10^−46 is smaller than this, but it is greater than half of 2^−149 (about 7.0•10^−46). That means, between 0 and 2^−149, it is closer to the latter, so conversion with rounding to the nearest representable value will produce 2^−149 rather than zero. Unfortunately, your strtof implementation chooses to report underflow rather than completing a conversion to the nearest representable value.
Normal and subnormal values
For IEEE-754 32-bit binary floating-point, the normal range is from 2^−126 to 2^128−2^104. Within this range, every number is represented with a significand (the fraction portion of the floating-point representation) that has a leading 1 bit followed by 23 additional bits, and so the error that occurs when rounding any real number in this range to the nearest representable value is at most 2^−24 times the position value of the leading bit.
In addition to this normal range, there is a subnormal range from 2^−149 to 2^−126−2^−149. In this interval, the exponent part of the floating-point format has reached its smallest value and cannot be decreased any more. To represent smaller and smaller numbers in this interval, the significand is reduced below the normal minimum of 1. It starts with a 0 and is followed by 23 additional bits. In this interval, the error that occurs when rounding a real number to the nearest representable value may be larger than 2^−24 times the position value of the leading bit. Since the exponent cannot be decreased any further, numbers in this interval have increasing numbers of leading 0 bits as they get smaller and smaller. Thus the relative errors involved with using these numbers grow.
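The growth in relative error can be illustrated by measuring the gap between a float and its next larger neighbor, relative to the value itself. The maximum rounding error is half of that gap. The function relative_spacing below is a hypothetical helper, and the sketch assumes IEEE-754 binary32 with subnormals enabled (no flush-to-zero mode).

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: gap between positive x and the next larger float,
// divided by x. The worst-case relative rounding error near x is half of this.
double relative_spacing(float x)
{
    const float next =
        std::nextafter(x, std::numeric_limits<float>::infinity());
    // Compute in double so the small difference is not itself rounded.
    return (static_cast<double>(next) - static_cast<double>(x)) /
           static_cast<double>(x);
}
```

For values at a power of two in the normal range, such as 1.0f or 2^−126, this gives 2^−23. In the subnormal range the absolute gap stays fixed at 2^−149, so the ratio climbs: 2^−9 at 2^−140, and 1 at 2^−149 itself.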
For whatever reasons, the C++ standard says that implementations may report underflow in this interval. (The IEEE-754 standard defines underflow in complicated ways and also allows implementations some choices.)