
I want to replace the calls to boost::lexical_cast<std::string>(d) with a solution that:

  1. Does not use locales (I suspect locales to be the cause of the slowdown in multi-threaded apps),
  2. Preserves the same output as lexical_cast.

I am using a generator written in Boost.Spirit.Karma (because it is faster). But after the change the results are different, because Karma has its own way of displaying fractional parts of doubles.

I know that to some extent one can control the double generator with policies, but my attempts fail. The best I can come up with is this generator customization and this test program:

#include <limits>
#include <string>
#include <iostream>
#include <boost/lexical_cast.hpp>
#include <boost/spirit/include/karma.hpp>

namespace karma = boost::spirit::karma;

template <typename Num>
struct fp_policy : karma::real_policies<Num>
{
  template <typename OutputIterator>
  static bool dot (OutputIterator& sink, Num n, unsigned /*precision*/)
  {
      if (n)
        return karma::char_inserter<>::call(sink, '.');  // emit the dot only when the fractional part is non-zero
      else
        return false;                                    // no fractional part: don't emit the dot
  }

  static unsigned precision(Num)
  {
      return 15;
  }
};

karma::real_generator<double, fp_policy<double>> const fp_generator {};

std::string karma_to_string(double const& v)
{
    std::string ans;
    std::back_insert_iterator<std::string> sink {ans};
    (void)karma::generate(sink, fp_generator, v);
    return ans;
}

void test_number (double x)
{
  std::cout << "stream:        " << x << "\n";
  std::cout << "lexiical_cast: " << boost::lexical_cast<std::string>(x) << "\n";
  std::cout << "spirit:        " << karma_to_string(x) << "\n";
  std::cout << "--------------------------------------------" << std::endl;
}

int main()
{
  test_number(0.45359237);
  test_number(111.11);
  test_number(1.0);
  test_number(3.25);
}

And it gives the following output:

stream:        0.453592
lexical_cast:  0.45359237000000002
spirit:        0.45359237
--------------------------------------------
stream:        111.11
lexical_cast:  111.11
spirit:        111.109999999999999
--------------------------------------------
stream:        1
lexical_cast:  1
spirit:        1
--------------------------------------------
stream:        3.25
lexical_cast:  3.25
spirit:        3.25
--------------------------------------------

And as you can see, there are obvious differences. If I go with the default generator for doubles (karma::double_; see the sketch after the output below), the results still differ, but in different places:

stream:        0.453592
lexical_cast:  0.45359237000000002
spirit:        0.454
--------------------------------------------
stream:        111.11
lexical_cast:  111.11
spirit:        111.11
--------------------------------------------
stream:        1
lexical_cast:  1
spirit:        1.0
--------------------------------------------
stream:        3.25
lexical_cast:  3.25
spirit:        3.25
--------------------------------------------
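
For reference, the karma::double_ variant mentioned above is the same conversion function with the stock generator; here is a minimal sketch (only the generator changes; the name karma_to_string_default is just for this illustration):

std::string karma_to_string_default(double const& v)
{
    std::string ans;
    std::back_insert_iterator<std::string> sink {ans};
    (void)karma::generate(sink, karma::double_, v);  // stock real_policies<double>: 3 digits of fractional precision
    return ans;
}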

My question: how can I configure the generator for doubles (if it is possible at all) so that the output is closer to the stream-based converters?

Andrzej
  • this seems like an XY problem. I think that in general it's a bad idea to rely on the user-friendly representation of floating points. For serialization look into `frexp` and `ldexp` or maybe hex float format. – bolov Mar 22 '18 at 15:56
  • @bolov: maybe it is an XY. My goal is to replace `lexical_cast` with something that is faster and does not use locale, but preserves the same output. – Andrzej Mar 23 '18 at 09:31

1 Answer


I'd create a generator based on sprintf, like I reverse-engineered here:

That answer comes with a full test-suite to verify "lexical-cast compliance" for values including NaN, positive/negative zero and infinities.

If you want I can think up a demo. By far the simplest thing to do would seem to be to wrap the value in a type with an overloaded output streaming operator, so that the boost::spirit::karma::stream generator can be used; a sketch of that idea follows.
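
A minimal sketch of that wrapper idea (the lc_double type, the snprintf format string and the choice of 17 digits are illustrative assumptions, not something taken from the linked answer):

#include <cstdio>
#include <ostream>
#include <string>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/karma_stream.hpp>

namespace karma = boost::spirit::karma;

struct lc_double { double value; };  // thin wrapper that only exists to carry operator<<

std::ostream& operator<<(std::ostream& os, lc_double const& d)
{
    char buf[32];
    // %.17g keeps enough digits to round-trip a double; it will not match
    // boost::lexical_cast output in every case
    int len = std::snprintf(buf, sizeof buf, "%.17g", d.value);
    return os.write(buf, len);
}

std::string karma_stream_to_string(double v)
{
    std::string ans;
    std::back_insert_iterator<std::string> sink {ans};
    karma::generate(sink, karma::stream, lc_double{v});  // formats via the operator<< above
    return ans;
}

Note that snprintf itself still honours the global C locale's decimal point, so this only sidesteps locale effects under the default "C" locale, and the linked test-suite would be the place to verify how close it gets to lexical_cast.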

Alternatively, you can create a custom generator type (satisfying the concept requirements).

sehe
  • I actually didn't say it in my question, but my goal is to avoid a solution that is using locales (as I suspect this to be the source of slowdown in multi-threaded apps). – Andrzej Mar 23 '18 at 09:33
  • After the edit in your question, I'm confused. Which is it? Do you not want to use locales (1.) **or** do you want to have the same behaviour as `boost::lexical_cast` (2.)? You can't have both, as the test cases in the answer I linked clearly show. I can /imagine/ ways to resolve the conflicts, but I'd rather wait for you to clarify the requirements before I invest more time. – sehe Mar 23 '18 at 18:30
  • thank you for asking the right questions. They make me think more clearly. I want to drop `lexical_cast` because it is too slow, most probably due to locales. But I cannot risk changing the behavior. You are right, locales configure the behavior, but somehow the default locale on all the Unix machines we tested was always the same. I thought that the default locale does something quite common, and that someone has already made the output identical to using the default locale. Does that make sense? – Andrzej Mar 24 '18 at 20:52
  • The "default locale" will depend on the environment. If we replace _that_ with "C locale" (aka classic), that's a good constraint I think. Thinking about it – sehe Mar 26 '18 at 23:08