
I have a puzzling error on my hands. I am sure this code worked fine in an earlier version of Boost, but now (Boost 1.72.0) it throws an exception:

string problemStr = "1.03964e-312";
double problemChild = boost::lexical_cast<double>(problemStr);

Setting a breakpoint in boost's code:

namespace boost 
{
    template <typename Target, typename Source>
    inline Target lexical_cast(const Source &arg)
    {
        Target result = Target();

        if (!boost::conversion::detail::try_lexical_convert(arg, result)) {
            boost::conversion::detail::throw_bad_cast<Source, Target>();
        }

        return result;
    }

at the line boost::conversion::detail::throw_bad_cast<Source, Target>(); reveals that, while the value is actually converted to double (result=1.0396399999979624E-312), the test boost::conversion::detail::try_lexical_convert(arg, result) failed! This then results in the exception:

  boost::wrapexcept<boost::bad_lexical_cast>: bad lexical cast: source type value could not be interpreted as target

I'm confused. It seems to do the conversion but still throws the exception? What am I overlooking? Or is this actually a bug?

Jürgen Simon
1 Answer


That's confusing.

Couldn't repro it at first: https://wandbox.org/permlink/MWJ3Ys7iUhNIaBek (you can change the compiler and Boost versions there)

However, changing the compiler to clang did the trick: https://wandbox.org/permlink/Ml8lQWESprfEplBi (even with boost 1.73)

Things get weirder: on my box, clang++-9 is fine even with asan/ubsan.

So I took to installing a few docker distributions.

It turns out that things break when using clang++ -stdlib=libc++.

Conclusion

It's not that complicated after a long chase down debuggers and standard library implementations. Here's the low-down:

#include <sstream>
#include <cassert>
#include <iostream>
#include <limits>
int main() {
    double v;
    std::cout << std::numeric_limits<double>::min_exponent10 << std::endl;
    std::cout << std::numeric_limits<double>::max_exponent10 << std::endl;
    assert(std::istringstream("1e308") >> v);
    assert(std::istringstream("1.03964e-312") >> v); // line 10
    assert(std::istringstream("1e309") >> v); // line 11
}

On libstdc++, this prints:

-307
308
sotest: /home/sehe/Projects/stackoverflow/test.cpp:11: int main(): Assertion `std::istringstream("1e309") >> v' failed.

On libc++:

-307
308
sotest: /home/sehe/Projects/stackoverflow/test.cpp:10: int main(): Assertion `std::istringstream("1.03964e-312") >> v' failed.

In summary, libstdc++ accepts input that is only representable as a subnormal, whereas libc++ rejects it:

The 11 bit width of the exponent allows the representation of numbers between 10^-308 and 10^308, with full 15–17 decimal digits precision. By compromising precision, the subnormal representation allows even smaller values up to about 5 × 10^-324.

It is likely that the library does do some checks to find whether there is acceptable loss of precision, but it could also be leaving this entirely to your own judgment.

Suggestions

If you need that kind of range, I'd suggest using a multiprecision library (GMP, MPFR, or indeed Boost).

For full fidelity with decimal input formats, consider e.g. cpp_dec_float:

#include <boost/multiprecision/cpp_dec_float.hpp>
#include <iostream>
using Decimal = boost::multiprecision::cpp_dec_float_50;

int main() {
    Decimal v("1.03964e-312");
    std::cout << v << std::endl;
}

Prints

1.03964e-312
sehe
  • Thanks a lot for this exhaustive answer. I find the wandbox tool immensely useful, also thank you for this. I am using clang 10.0.0 indeed (something I failed to mention). About your proposal to use Decimal: a good idea. Unfortunately it is not really an option for me. My code is heavy on templates and should work with all sorts of numerical types. For now I am working around the problem by setting values like those to 0. They are themselves no doubt the result of a rounding error; I will pursue the cause and try to eliminate that problem there. – Jürgen Simon Jun 24 '20 at 15:39
  • "All sorts of numerical types" They work! However, I recommend [turning off expression templates at the cost of some raw performance](https://stackoverflow.com/a/42093422/85371), because they do tend to break ("surprise") generic code. Indeed, it does sound like accuracy is not really a goal since your input is already inexact, so it's probably fine for you to keep using doubles. You could mix/match if you find that you get "boundary condition inputs" like these and want to be able to diagnose them better. – sehe Jun 24 '20 at 15:57