4

I'm working with a library that unfortunately uses boost::lexical_cast to convert from a double to a string.

I need to be able to definitively mirror that behavior on my side, but I was hoping to do so without propagating Boost.

Could I be guaranteed identical behavior using to_string, sprintf, or some other function contained within the standard?

sehe
Jonathan Mee
  • Why is it 'unfortunate'? – SergeyA Jan 03 '18 at 16:01
  • @SergeyA It's unfortunate cause I can't find any documentation on the exact behavior of `boost::lexical_cast` and because I don't want to have to further clutter by continuing to pull in `boost`. – Jonathan Mee Jan 03 '18 at 16:39
  • What is exactly the problem with using boost? Obviously project already uses it? Apparently boost lexical cast doesn't commit to a certain way of string representation, and as such, is prone to be changed across versions. – SergeyA Jan 03 '18 at 16:43
  • @SergeyA One of the *libraries* I use pulls in Boost. (Obviously I can see into the source of this library as I know it uses `boost::lexical_cast` to do this conversion.) There's an optimization opportunity for me to do this `double` to `string` conversion on my side but the resulting `string` must be consistent with the `string` that the library would have generated. – Jonathan Mee Jan 03 '18 at 16:48
  • (Depending on the nature of the library, perhaps you can get away with formatting the numbers to string before sharing them. Of course, in many cases that wouldn't be an option) – sehe Jan 03 '18 at 21:31

2 Answers

5

The boost code ends up here:

            bool shl_real_type(double val, char* begin) {
                using namespace std;
                finish = start +
#if defined(_MSC_VER) && (_MSC_VER >= 1400) && !defined(__SGI_STL_PORT) && !defined(_STLPORT_VERSION)
                    sprintf_s(begin, CharacterBufferSize,
#else
                    sprintf(begin, 
#endif
                    "%.*g", static_cast<int>(boost::detail::lcast_get_precision<double>()), val);
                return finish > start;
            }

You're in luck, since the precision is usually a compile-time constant (unless Boost is configured with BOOST_LCAST_NO_COMPILE_TIME_PRECISION).

Simplifying a bit and allowing for conforming, modern standard libraries:

Mimicking Boost lexical_cast

#include <cstdio>
#include <limits>
#include <string>

namespace {
    template <class T> struct lcast_precision {
        typedef std::numeric_limits<T> limits;

        static constexpr bool use_default_precision  = !limits::is_specialized || limits::is_exact;
        static constexpr bool is_specialized_bin     = !use_default_precision && limits::radix == 2 && limits::digits > 0;

        static constexpr bool is_specialized_dec     = !use_default_precision && limits::radix == 10 && limits::digits10 > 0;
        static constexpr unsigned int precision_dec  = limits::digits10 + 1U;
        static constexpr unsigned long precision_bin = 2UL + limits::digits * 30103UL / 100000UL;

        static constexpr unsigned value = is_specialized_bin 
            ? precision_bin 
            : is_specialized_dec? precision_dec : 6;
    };

    std::string mimicked(double v) {
        constexpr int prec = static_cast<int>(lcast_precision<double>::value);

        std::string buf(prec+10, ' ');
        buf.resize(std::sprintf(&buf[0], "%.*g", prec, v));
        return buf;
    }
}

Regression Tests

To compare the results and check the assumptions:

#include <cstdio>
#include <limits>
#include <string>

namespace {
    template <class T> struct lcast_precision {
        typedef std::numeric_limits<T> limits;

        static constexpr bool use_default_precision  = !limits::is_specialized || limits::is_exact;
        static constexpr bool is_specialized_bin     = !use_default_precision && limits::radix == 2 && limits::digits > 0;

        static constexpr bool is_specialized_dec     = !use_default_precision && limits::radix == 10 && limits::digits10 > 0;
        static constexpr unsigned int precision_dec  = limits::digits10 + 1U;
        static constexpr unsigned long precision_bin = 2UL + limits::digits * 30103UL / 100000UL;

        static constexpr unsigned value = is_specialized_bin 
            ? precision_bin 
            : is_specialized_dec? precision_dec : 6;
    };

    std::string mimicked(double v) {
        constexpr int prec = static_cast<int>(lcast_precision<double>::value);

        std::string buf(prec+10, ' ');
        buf.resize(std::sprintf(&buf[0], "%.*g", prec, v));
        return buf;
    }
}

#include <cassert>
#include <cmath>
#include <iomanip>
#include <iostream>
#include <string>

#include <boost/lexical_cast.hpp>

#ifdef BOOST_LCAST_NO_COMPILE_TIME_PRECISION
#error BOOM
#endif

#define TEST(x)                                                                                                        \
    do {                                                                                                               \
        std::cout << std::setw(45) << #x << ":\t" << (x) << "\n";                                                      \
    } while (0)

std::string use_sprintf(double v) {
    std::string buf(32, ' ');
    buf.resize(std::sprintf(&buf[0], "%f", v));
    return buf;
}

void tests() {
    for (double v : {
            std::numeric_limits<double>::quiet_NaN(),
            std::numeric_limits<double>::infinity(),
           -std::numeric_limits<double>::infinity(),
            0.0,
           -0.0,
            std::numeric_limits<double>::epsilon(),
            M_PI })
    {
        TEST(v);
        TEST(std::to_string(v));
        TEST(use_sprintf(v));
        TEST(boost::lexical_cast<std::string>(v));
        TEST(mimicked(v));

        assert(mimicked(v) == boost::lexical_cast<std::string>(v));
    }
}

static std::locale DE("de_DE.utf8");

int main() {

    tests();

    std::cout << "==== imbue std::cout\n";
    std::cout.imbue(DE);

    tests();

    std::cout << "==== override global locale\n";
    std::locale::global(DE);

    tests();
}

Prints

                                        v:  nan
                        std::to_string(v):  nan
                           use_sprintf(v):  nan
      boost::lexical_cast<std::string>(v):  nan
                              mimicked(v):  nan
                                        v:  inf
                        std::to_string(v):  inf
                           use_sprintf(v):  inf
      boost::lexical_cast<std::string>(v):  inf
                              mimicked(v):  inf
                                        v:  -inf
                        std::to_string(v):  -inf
                           use_sprintf(v):  -inf
      boost::lexical_cast<std::string>(v):  -inf
                              mimicked(v):  -inf
                                        v:  0
                        std::to_string(v):  0.000000
                           use_sprintf(v):  0.000000
      boost::lexical_cast<std::string>(v):  0
                              mimicked(v):  0
                                        v:  -0
                        std::to_string(v):  -0.000000
                           use_sprintf(v):  -0.000000
      boost::lexical_cast<std::string>(v):  -0
                              mimicked(v):  -0
                                        v:  2.22045e-16
                        std::to_string(v):  0.000000
                           use_sprintf(v):  0.000000
      boost::lexical_cast<std::string>(v):  2.2204460492503131e-16
                              mimicked(v):  2.2204460492503131e-16
                                        v:  3.14159
                        std::to_string(v):  3.141593
                           use_sprintf(v):  3.141593
      boost::lexical_cast<std::string>(v):  3.1415926535897931
                              mimicked(v):  3.1415926535897931
==== imbue std::cout
                                        v:  nan
                        std::to_string(v):  nan
                           use_sprintf(v):  nan
      boost::lexical_cast<std::string>(v):  nan
                              mimicked(v):  nan
                                        v:  inf
                        std::to_string(v):  inf
                           use_sprintf(v):  inf
      boost::lexical_cast<std::string>(v):  inf
                              mimicked(v):  inf
                                        v:  -inf
                        std::to_string(v):  -inf
                           use_sprintf(v):  -inf
      boost::lexical_cast<std::string>(v):  -inf
                              mimicked(v):  -inf
                                        v:  0
                        std::to_string(v):  0.000000
                           use_sprintf(v):  0.000000
      boost::lexical_cast<std::string>(v):  0
                              mimicked(v):  0
                                        v:  -0
                        std::to_string(v):  -0.000000
                           use_sprintf(v):  -0.000000
      boost::lexical_cast<std::string>(v):  -0
                              mimicked(v):  -0
                                        v:  2,22045e-16
                        std::to_string(v):  0.000000
                           use_sprintf(v):  0.000000
      boost::lexical_cast<std::string>(v):  2.2204460492503131e-16
                              mimicked(v):  2.2204460492503131e-16
                                        v:  3,14159
                        std::to_string(v):  3.141593
                           use_sprintf(v):  3.141593
      boost::lexical_cast<std::string>(v):  3.1415926535897931
                              mimicked(v):  3.1415926535897931
==== override global locale
                                        v:  nan
                        std::to_string(v):  nan
                           use_sprintf(v):  nan
      boost::lexical_cast<std::string>(v):  nan
                              mimicked(v):  nan
                                        v:  inf
                        std::to_string(v):  inf
                           use_sprintf(v):  inf
      boost::lexical_cast<std::string>(v):  inf
                              mimicked(v):  inf
                                        v:  -inf
                        std::to_string(v):  -inf
                           use_sprintf(v):  -inf
      boost::lexical_cast<std::string>(v):  -inf
                              mimicked(v):  -inf
                                        v:  0
                        std::to_string(v):  0,000000
                           use_sprintf(v):  0,000000
      boost::lexical_cast<std::string>(v):  0
                              mimicked(v):  0
                                        v:  -0
                        std::to_string(v):  -0,000000
                           use_sprintf(v):  -0,000000
      boost::lexical_cast<std::string>(v):  -0
                              mimicked(v):  -0
                                        v:  2,22045e-16
                        std::to_string(v):  0,000000
                           use_sprintf(v):  0,000000
      boost::lexical_cast<std::string>(v):  2,2204460492503131e-16
                              mimicked(v):  2,2204460492503131e-16
                                        v:  3,14159
                        std::to_string(v):  3,141593
                           use_sprintf(v):  3,141593
      boost::lexical_cast<std::string>(v):  3,1415926535897931
                              mimicked(v):  3,1415926535897931

Note that mimicked and boost::lexical_cast<std::string>(double) result in exactly the same output each time.

sehe
  • In the interest of complete overkill, added test cases including NaN, +Inf, -Inf, +0, -0, epsilon and M_PI that `assert`s that output of `mimicked(v)` is always identical to `boost::lexical_cast(v)` under all locale combinations. – sehe Jan 03 '18 at 21:30
  • Ugh, and it took me that long to even figure out the first thing that you learned >:( – Jonathan Mee Jan 03 '18 at 21:59
  • I made only the assumption about compile-time precision. I like my version as it clocks in <30 LoC and will do exactly the same on **all** platforms and for **all** real types. – sehe Jan 03 '18 at 22:45
  • I've accepted your answer cause it's fantastic and I wish I could give it more than one upvote. But just for my own understanding what does "<30 LoC" stand for? – Jonathan Mee Jan 04 '18 at 12:56
  • So, "less than 30 lines of code" (now that I'm at home at a real keyboard). Cheers! – sehe Jan 05 '18 at 00:02
4

So after a few hours of digging around in Boost templates here's what I've learned:

  1. The actual call that does the stringifying is: lexical_cast_do_cast<std::string, double>::lexical_cast_impl
  2. This uses std::sprintf within boost::detail::lexical_stream_limited_src<char, std::char_traits<char>, false>
  3. boost::detail::lexical_stream_limited_src<char, std::char_traits<char>, true>::operator<< is used to insert the double. Passing begin, a pointer to an allocated std::string's buffer, and val, the input double, yields this call: std::sprintf(begin, "%.*g", static_cast<int>(boost::detail::lcast_get_precision<double>()), val)
  4. The precision field here comes from boost::detail::lcast_precision<double>::value, which uses std::numeric_limits<double>: if is_specialized is true, is_exact is false, radix is 2, and digits is greater than 0, then boost::detail::lcast_precision<double>::value evaluates to: 2UL + std::numeric_limits<double>::digits * 30103UL / 100000UL

Thus, where begin is the allocated string's buffer and val is the input double, boost::lexical_cast<std::string>(val) yields a final result equivalent to:

std::sprintf(begin, "%.*g", static_cast<int>(2UL + std::numeric_limits<double>::digits * 30103UL / 100000UL), val)

This is obviously heavily implementation dependent. But on my system this will yield the exact equivalent.
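
To make the arithmetic concrete, here is a minimal sketch of that expression evaluated at compile time, assuming an IEEE-754 binary64 double (53 mantissa bits). The names lcast_like_precision and format_like_lexical_cast are hypothetical and not part of Boost; the helper only mirrors the sprintf call described above and is not a drop-in replacement for every platform.

```cpp
#include <cassert>
#include <cstdio>
#include <limits>
#include <string>

// Same formula as the Boost precision for binary floating-point types:
// 2 + digits * log10(2), with log10(2) approximated as 30103/100000.
constexpr int lcast_like_precision =
    static_cast<int>(2UL + std::numeric_limits<double>::digits * 30103UL / 100000UL);

// With digits == 53: 53 * 30103 = 1595459; 1595459 / 100000 = 15; 2 + 15 = 17.
static_assert(std::numeric_limits<double>::digits != 53 || lcast_like_precision == 17,
              "expected 17 significant digits for a 53-bit mantissa");

// Hypothetical helper: formats val with "%.*g" and the precision computed above.
// Like sprintf, it follows the current C locale for the decimal point.
inline std::string format_like_lexical_cast(double val) {
    char buf[64]; // assumed generous: "%.17g" output is roughly 24 characters at most
    const int n = std::snprintf(buf, sizeof buf, "%.*g", lcast_like_precision, val);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Calling format_like_lexical_cast(val) should then produce the same characters as the sprintf call above, provided the library was built with the compile-time precision and the same C locale is in effect.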

Jonathan Mee
  • Lessons learned, never use Boost, always use C++ standard functionality. – Jonathan Mee Jan 03 '18 at 21:59
  • Yes this is a nice analysis. Have an upvote. and yes, I’ll delete my answer. Although I do believe Boost has its uses: spirit is excellent, as is BLAS, and the date libraries. But yes, adopt stuff as it appears in the standard. – Bathsheba Jan 03 '18 at 22:15
  • My take: Always use Boost, but don't treat it as a silver bullet. If you need to know what you're doing, make sure you know how you do it. It's always like that: companies should never outsource their core business, e.g. – sehe Jan 03 '18 at 22:42
  • @Bathsheba I regret to say that I agree that spirit is an exception, and apparently I need to read up on whatever BLAS is. As far as dates I'd strongly suggest Howard Hinnant's library which has been proposed for addition to the standard: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0355r4.html – Jonathan Mee Jan 04 '18 at 12:52