conversion of double to string to double throws exception

Question

The following code throws a std::out_of_range exception in Visual Studio 2013, where in my opinion it shouldn't:

#include <string>
#include <limits>

int main(int argc, char ** argv)
{
    double maxDbl = std::stod(std::to_string(std::numeric_limits<double>::max()));

    return 0;
}

I tested the code also with gcc 4.9.2 and there it does not throw an exception. The issue seems to be caused by an inaccurate string representation after the conversion to string. In Visual Studio std::to_string(std::numeric_limits<double>::max()) yields

179769313486231610000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000

which indeed seems too large. In gcc, however, it yields

179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000

which seems to be smaller than the passed value.

However, isn't std::numeric_limits<double>::max() supposed to return the "maximum finite representable floating-point number"?

So why are the string representations off? What am I missing here?

[to_string](http://en.cppreference.com/w/cpp/string/basic_string/to_string) is equivalent to some form of `sprintf` and so seems like it would be subject to the same issues as in [this question](http://stackoverflow.com/q/31142600/1708801) — Shafik Yaghmour, Jul 27 '15 at 12:32
But `std::numeric_limits::max()` is supposed to be exactly representable in binary so that no rounding should be necessary, isn't it? — sigy, Jul 27 '15 at 12:37
@sigy: That doesn't automatically mean it's exactly representable in decimal, of course. — Lightness Races in Orbit, Jul 27 '15 at 12:39
Ok, true. But in case of Visual Studio and my current platform it returns 1.7976931348623158e+308 which should be exactly representable in decimal as well. — sigy, Jul 27 '15 at 12:44
What VS are you using? In 2015 I get `1.7976931348623157e+308` for the double value and `179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000` for the string value — NathanOliver, Jul 27 '15 at 12:46
@lightness: it's an integer, albeit a very large one, so it must be exactly representable in any base. — rici, Jul 27 '15 at 13:26
@NathanOliver As stated I am using VS 2013. Interesting that it changed in VS 2015. However, even if VS 2015 behaves like GCC 4.9.2, "rounds" down and thus does not throw an exception, the question why any rounding takes place remains. — sigy, Jul 27 '15 at 13:29
@rici : unless you can split bits, you will never be able to represent exactly a number of 308 decimal digits with 64 bits ! — Serge Ballesta, Jul 27 '15 at 16:18
@SergeBallesta: That isn't the question. The question is whether you can represent it precisely as an ascii string of decimal digits. No conversion to a 64-bit integer is implied by the code presented. — rici, Jul 27 '15 at 16:21
@SergeBallesta: To be fair, the standard doesn't guarantee the accuracy of `std::to_string(double)` and even the recommended practice doesn't require it to produce more than `DECIMAL_DIG` leading significant digits. But that's independent of the question of whether it can be "precisely representable in decimal". It can be, and the Gnu standard library (for example) does so. — rici, Jul 27 '15 at 16:30
@rici : when you write `double d = std::stod("...")` you do convert the value to a 64 bit IEEE754 floating point on any common compiler. And even if the standard does not require IEEE754, floating point values do not need to exactly represent all **integer** values up to `std::numeric_limits::max()` — Serge Ballesta, Jul 27 '15 at 17:00
@SergeBallesta: The C floating point model (see 5.2.4.2.2) requires that floating point numbers have the form f·b^p where f is a fixed-length integer in base b and p is a possibly negative integer; that's sufficient to show that all floating point numbers greater than some value must be integers. C++ doubles don't have to be the same as C doubles, but it would be surprising if they weren't. — rici, Jul 27 '15 at 17:49
@SergeBallesta: The claim of LRIO that I was reacting to was that there is no guarantee that `std::numeric_limits::max()` is "exactly representable in decimal". My only claim is that std::numeric_limits::max() *is* exactly representable in decimal. I'm not claiming anything about every decimal integer. All I'm claiming is that for each floating point type, there is some representable value _n_ such that every representable value _v_ >= _n_ is an integer, and consequently is "exactly representable" in base b for any integer value b > 1. — rici, Jul 27 '15 at 17:53
@rici : I've got your point. My remark was just that it was not a bijection. IEEE754 does give one (big) integer value for `std::numeric_limits::max()`, but many integers close to it will also be represented by the same double value. — Serge Ballesta, Jul 27 '15 at 21:24
VC++ used to only look at 16 or 17 sig figs, which isn't quite standards compliant. They attempted to fix this in VS 2015. http://www.exploringbinary.com/visual-c-plus-plus-strtod-still-broken/ — Adrian McCarthy, Jul 27 '15 at 22:36
It should be noted that in VS2008, `printf("%d\n", max)` gives a correct string (even if rounded at 16 digits), but `cout << setiosflags(ios_base::fixed) << max << endl;` gives another (wrong) value ... buggy ? — Serge Ballesta, Jul 28 '15 at 09:20
@SergeBallesta: It's fairly trivial to represent a 308 digit number in 64 bits. It's impossible to represent **all** 308 digit numbers as 64 bits. Here, we know a priori that the 308 digit number fits in 64 bits, since the 308 digit number was the decimal expansion of a 64 bit value to start with. — MSalters, Jul 28 '15 at 12:00

score 1 · Answer 1 · edited Jun 20 '20 at 09:12

Direct answer

GCC (and Clang and VS2015) correctly return the integer value (2¹⁰²⁴ - 1) - (2⁹⁷¹ - 1) = 2¹⁰²⁴ - 2⁹⁷¹, which is what a significand of all one bits (52 stored bits plus the implicit leading bit) and an unbiased exponent of 1023 represent. 2¹⁰²⁴ - 1 is the integer with 1024 one bits, and I just subtract all the bits below the 53 significant bits of the IEEE754 format (971 = 1024 - 53).

I can confirm that a large integer library gives 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368L

The previous exactly representable floating-point value is 2⁹⁷¹ smaller (971 = 1023 - 52), that is: 179769313486231550856124328384506240234343437157459335924404872448581845754556114388470639943126220321960804027157371570809852884964511743044087662767600909594331927728237078876188760579532563768698654064825262115771015791463983014857704008123419459386245141723703148097529108423358883457665451722744025579520L

The next value, which is not representable, would be 2⁹⁷¹ greater (that is, 2¹⁰²⁴): 179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216L

But the value used by MSVC2013 and earlier is close to 2¹⁰²⁴ + 2⁹⁷¹, that is: 179769313486231610731333614426100589925524828262616317947942685512308090830973387504827396012048193870699768806228404251083258210739369062217227314575410731769485876273179688476358949112102859294830297395714877595371718127781702814782017661749531126051903195165027873311156314696040132728420308633064323416064L. As it is greater than any value representable in IEEE754 double precision, it cannot be decoded to a double.

At most, one could argue that any value between 2¹⁰²⁴ - 2⁹⁷¹ (std::numeric_limits<double>::max()) and 2¹⁰²⁴ may be rounded to std::numeric_limits<double>::max(), but values greater than 2¹⁰²⁴ are clearly an overflow.
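The neighbours of the maximum value can be checked without a big-integer library. The following is only a sketch, assuming an IEEE754 binary64 double and the default round-to-nearest mode; the function names are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Largest finite double built from its definition: 53 one bits scaled so that
// the top bit has weight 2^1023, i.e. (2 - 2^-52) * 2^1023 = 2^1024 - 2^971.
double max_from_definition() {
    return std::ldexp(2.0 - std::ldexp(1.0, -52), 1023);
}

// Distance between max() and the previous representable double.
double gap_below_max() {
    const double maxDbl = std::numeric_limits<double>::max();
    return maxDbl - std::nextafter(maxDbl, 0.0);
}

// Self-check of the three claims (hypothetical helper, call it from main()).
void check_neighbours_of_max() {
    const double maxDbl = std::numeric_limits<double>::max();
    assert(max_from_definition() == maxDbl);             // max is 2^1024 - 2^971
    assert(gap_below_max() == std::ldexp(1.0, 971));     // spacing is 2^971
    assert(std::isinf(maxDbl + std::ldexp(1.0, 971)));   // 2^1024 overflows
}
```

Calling check_neighbours_of_max() from main() should pass silently on such a platform. A build with NDEBUG defined disables the assertions.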

Discussion on accuracy

Only about 16 decimal digits are significant in a double; all further digits can be seen as garbage or random values, since they do not depend on the value itself but only on the way you choose to calculate them. Just try to subtract 1e+288 (already a big value) from maxDbl and look at what happens:

double maxLess = maxDbl - 1.e+288;
if (maxLess == maxDbl) {
   std::cout << "Unchanged" << std::endl;
}
else std::cout << "Changed" << std::endl;

You should see ... Unchanged.

It just looks like VS 2013 is a little inconsistent in the way it rounds floating-point values: it rounded maxDbl up to a value above the maximum that is actually representable, and could not decode it later.

The problem is that the standard chose to use the %f format, which gives a false sense of accuracy. If you want to see an equivalent problem in gcc, just use:

#include <iostream>
#include <string>
#include <limits>
#include <iomanip>
#include <sstream>

int main() {
    double max = std::numeric_limits<double>::max();
    std::ostringstream ostr;
    ostr << std::setprecision(16) << max;
    std::string smax = ostr.str();
    std::cout << smax << std::endl;
    double m2 = std::stod(smax);
    std::cout << m2 << std::endl;

    return 0;
}

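Rounded to 16 digits, maxDbl is (correctly) written as 1.797693134862316e+308, but that string can no longer be decoded back: it lies above the halfway point between max() and 2¹⁰²⁴, so std::stod reports an overflow with std::out_of_range.

A second program shows that a much shorter decimal string decodes to the same double: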
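For comparison, printing with std::numeric_limits<double>::max_digits10 significant digits (17 for an IEEE754 double) instead of 16 should give a string that decodes back to the same value. This is a sketch under that assumption, with helper names invented for the example. It is only meant for normal values, since some libraries also throw std::out_of_range from std::stod for subnormal results.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Format a double with max_digits10 significant digits (17 for binary64),
// which is enough to identify the value uniquely.
std::string to_round_trip_string(double value) {
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10) << value;
    return out.str();
}

// True if the text form decodes back to exactly the same double.
bool round_trips(double value) {
    return std::stod(to_round_trip_string(value)) == value;
}

// Self-check (hypothetical helper, call it from main()).
void check_round_trip() {
    const double maxDbl = std::numeric_limits<double>::max();
    assert(to_round_trip_string(maxDbl) == "1.7976931348623157e+308");
    assert(round_trips(maxDbl));
}
```

With 17 digits the last digit is 7 rather than the rounded-up 6 of the 16-digit form, so the string stays below the overflow threshold.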

#include <iostream>
#include <string>
#include <limits>

int main() {
    double maxDbl = std::numeric_limits<double>::max();
    std::string smax = std::to_string(maxDbl);
    std::cout << smax << std::endl;
    
    std::string smax2 = "179769313486231570800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000";

    double max2 = std::stod(smax2);
    if (max2 == maxDbl) {
       std::cout << smax2 << " is same double as " << smax << std::endl;
    }

    return 0;
}

Displays :

179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
179769313486231570800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.000000 is same double as 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000

TL;DR: What I mean is that a big enough double value can of course be represented by an exact integer (per IEEE754). But it also represents every integer between halfway to the previous representable value and halfway to the next one. So any integer in that range could be an acceptable representation for the double, and a value rounded to 16 decimal digits should be acceptable, but current standard libraries only accept the max floating-point value when it is truncated at 16 decimal digits. VS2013, however, gave a number above the top of that range, which was an error in any case.
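The range of decimal strings that decode to the maximum can be probed directly. The sketch below assumes an IEEE754 double and a correctly rounding std::stod (as in current GCC and Clang standard libraries); decodes_to_max is a name chosen for this example.

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>
#include <string>

// True if `text` decodes to exactly numeric_limits<double>::max().
// std::stod signals overflow by throwing std::out_of_range.
bool decodes_to_max(const std::string& text) {
    try {
        return std::stod(text) == std::numeric_limits<double>::max();
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Self-check (hypothetical helper, call it from main()).
void check_decimal_neighbourhood() {
    assert(decodes_to_max("1.7976931348623157e308"));   // 17 significant digits
    assert(decodes_to_max("1.7976931348623158e308"));   // still below the halfway point
    assert(!decodes_to_max("1.797693134862316e308"));   // 16 digits, above the halfway point
}
```

The halfway point between max() and 2¹⁰²⁴ is 2¹⁰²⁴ - 2⁹⁷⁰, roughly 1.79769313486231580794e+308, which is why the second string is accepted and the third is not.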

Reference

IEEE floating point on Wikipedia

I don't get your point. Of course subtracting 1e+288 does not change the value because rounding will be involved. The next smallest number which is exactly representable should be `1.7976931348623155e+308` which is `DBL_MAX - 2e+292`. Also DBL_MAX is `1.7976931348623158e+308` in my case. It has exactly the 16 digits you mentioned. Represented in IEEE754 this should be `0111111111101111111111111111111111111111111111111111111111111111` so it should be representable by a double precision floating point number. Why should there be any rounding involved? — sigy, Jul 27 '15 at 17:39
I would agree that `1.7976931348623158e+308` is a valid floating point representation of `1.7976931348623158e+308 - 1e+288`. However, the string representations returned by `to_string` in VS2013 and GCC 4.9.2 are, in my opinion, not. And I don't see any reason why such inaccuracy should happen. The return values truncated after 16 digits would be still off from what was passed to the method. Does the C++ standard allow such inaccuracies? If so, why? — sigy, Jul 27 '15 at 22:11
