29

Before C++17, there were already several ways to convert integers, floats, and doubles to and from strings: std::stringstream, std::to_string, std::atoi, std::stoi, and others. There are plenty of posts discussing the differences between those methods.

However, C++17 introduced std::from_chars and std::to_chars. I'd like to know the reasons for introducing yet another means of converting to and from strings.

For one, what advantages and functionality do these new functions provide over the previous methods?

Not only that, but are there any notable disadvantages for this new method of string conversion?

Rabster

3 Answers

37

std::stringstream is the heavyweight champion. It takes into consideration things like the stream's imbued locale, and its functionality involves things like constructing a sentry object for the duration of the formatted operation, in order to deal with exception-related issues. Formatted input and output operations in the C++ library have a reputation for being heavyweight and slow.

std::to_string is less intensive than std::stringstream, but it still returns a std::string, whose construction likely involves dynamic allocation (less likely with modern short string optimization techniques, but still likely). And in most cases the compiler still needs to generate all the code, at the call site, to support a std::string object, including its destructor.

std::to_chars is designed to have as small a footprint as possible. You provide the buffer, and std::to_chars does very little beyond formatting the numeric value into that buffer, in a specific format, without any locale-specific considerations. The only overhead is checking that the buffer is big enough. Code that uses std::to_chars does not need to do any dynamic allocation.
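As a rough illustration of the "you provide the buffer" model, here is a minimal sketch. The helper name `format_int` is made up for this example; only `std::to_chars` and the `ptr`/`ec` members of its result come from `<charconv>`.

```cpp
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of the standard library): formats `value`
// into a caller-supplied array and returns a view of the characters written.
// Returns an empty view if the buffer is too small.
template <std::size_t N>
std::string_view format_int(char (&buffer)[N], int value) {
    // to_chars writes into [buffer, buffer + N) and does not null-terminate.
    const std::to_chars_result result = std::to_chars(buffer, buffer + N, value);
    if (result.ec != std::errc{}) {
        return {};  // std::errc::value_too_large: the digits did not fit
    }
    return std::string_view(buffer, static_cast<std::size_t>(result.ptr - buffer));
}
```

With a `char buf[16]`, `format_int(buf, -42)` yields a view of `-42`. With a two-character buffer and the value 12345, it returns an empty view rather than throwing or allocating.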

std::to_chars is also a bit more flexible in terms of formatting options, especially with floating point values. std::to_string has no formatting options.
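To make the formatting options concrete, the sketch below uses the integer `base` parameter and the floating-point `std::chars_format` plus precision overload. The helper names `to_hex` and `to_fixed` are invented for illustration, and they return `std::string` only for convenience (which does allocate). The floating-point overload assumes a standard library that already implements it; some older toolchains shipped `<charconv>` with integer support only.

```cpp
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: formats an integer in base 16.
// to_chars emits lowercase digits and no "0x" prefix.
std::string to_hex(unsigned value) {
    char buffer[16];
    auto [end, ec] = std::to_chars(buffer, buffer + sizeof buffer, value, 16);
    return ec == std::errc{} ? std::string(buffer, end) : std::string{};
}

// Hypothetical helper: fixed notation with an explicit number of decimals.
// Returns an empty string if the result does not fit in the 64-byte buffer
// (which can happen for very large magnitudes in fixed notation).
std::string to_fixed(double value, int decimals) {
    char buffer[64];
    auto [end, ec] = std::to_chars(buffer, buffer + sizeof buffer, value,
                                   std::chars_format::fixed, decimals);
    return ec == std::errc{} ? std::string(buffer, end) : std::string{};
}
```

For example, `to_hex(255)` gives `ff` and `to_fixed(3.14159, 2)` gives `3.14`.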

std::from_chars is, similarly, a lightweight parser, that does not need to do any dynamic allocation, and does not need to sacrifice any electrons to deal with locale issues, or overhead of stream operations.
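A minimal parsing sketch along the same lines follows. The helper `parse_int` and its "whole input must be consumed" policy are assumptions of this example, not something `std::from_chars` imposes. Errors are reported through the returned `std::errc`, not through exceptions.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses a base-10 int from `text`.
// Returns std::nullopt on malformed input, out-of-range values,
// or trailing characters after the number.
std::optional<int> parse_int(std::string_view text) {
    const char* first = text.data();
    const char* last = text.data() + text.size();
    int value = 0;
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{}) {
        return std::nullopt;  // invalid_argument or result_out_of_range
    }
    if (ptr != last) {
        return std::nullopt;  // this sketch rejects trailing characters
    }
    return value;
}
```

Note that `std::from_chars` does not skip leading whitespace and does not accept a leading `+`, so `parse_int(" 5")` and `parse_int("+5")` both yield `std::nullopt` here.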

Sam Varshavchik
  • 4
    `std::to_string` also observes the current locale. So I'd say it's not less, but actually more flexible as far as formatting is concerned… – Michael Kenzel Apr 26 '19 at 23:28
25

to/from_chars are designed to be elementary string conversion functions. They have two basic advantages over the alternatives.

  1. They are much lighter weight. They never allocate memory (you allocate memory for them). They never throw exceptions. They also never look at the locale, which also improves performance.

    Basically, they are designed such that it is impossible to have faster conversion functions at an API level.

    These functions could even be constexpr (in C++17 they aren't, though I'm not sure why; the integer overloads did become constexpr in C++23), while the more heavyweight allocating and/or throwing versions can't.

  2. They have explicit round-trip guarantees. If you convert a float/double to a string (without a specified precision), the implementation is required to make it so that taking that exact sequence of characters and converting it back into a float/double will produce a binary-identical value. You won't get that guarantee from snprintf, stringstream or to_string/stof.

    However, this guarantee only holds if the to_chars and from_chars calls use the same implementation. So you can't expect to send the string across the Internet to a program on another computer, possibly built with a different standard library implementation, and get the same float back. But it does give you on-computer serialization guarantees.
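The round-trip property from point 2 can be sketched as a small check. The function name `round_trips` is hypothetical, and the example assumes a standard library that implements the floating-point overloads of both functions.

```cpp
#include <charconv>
#include <cstring>
#include <system_error>

// Hypothetical check: writes `value` with no precision argument, parses the
// exact characters back, and compares the two doubles bit for bit.
bool round_trips(double value) {
    char buffer[64];
    auto [end, write_ec] = std::to_chars(buffer, buffer + sizeof buffer, value);
    if (write_ec != std::errc{}) {
        return false;  // buffer too small (not expected for a double here)
    }

    double parsed = 0.0;
    auto [ptr, read_ec] = std::from_chars(buffer, end, parsed);
    if (read_ec != std::errc{} || ptr != end) {
        return false;  // parse failed or did not consume everything
    }

    // memcmp compares the object representations, i.e. "binary-identical".
    return std::memcmp(&value, &parsed, sizeof value) == 0;
}
```

Within a single implementation, calls such as `round_trips(0.1)` or `round_trips(1.0 / 3.0)` are expected to return `true`.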

Nicol Bolas
  • Why wouldn't `to/from_chars` have round-trip guarantees across implementations? If we assume ieee754, do we then get the round-trip guarantee across implementations? – Justin Aug 20 '19 at 00:51
  • @Justin: Even if you assume IEEE (which the standard *does not*), you would still have to specify exactly how rounding would work. Which could get in the way of a higher-performance implementation for a particular piece of hardware. – Nicol Bolas Aug 20 '19 at 01:48
  • 1
    This round_trip conversion property is what we needed for years. Only having guarantees to deal with own implementation results is a bit of a downer when you work with XML or JSON interchange across the lines though. I profiled 'from_chars' and it was about 10 times faster than using Boost's lexical_cast with classic locale set each time (since some libraries change it without restoring it and thereby destroying any locale independent storage). – gast128 Feb 10 '20 at 22:56
  • @gast128: The thing is, you can always extract a specific implementation into your application (or use a publicly available one) which would give you all of the performance benefits while maintaining the same API. – Nicol Bolas Feb 11 '20 at 04:44
4

All these pre-existing methods were bound to work based on a so-called locale. A locale is basically a set of formatting options that specify, e.g., what characters count as digits, what symbol to use for the decimal point, what thousands separator to use, and so on. Very often, however, you don't really need that. If you're just, e.g., reading a JSON file, you know the data is formatted in a particular way, so there is no reason to look up whether a '.' should be a decimal point every time you see one. The new functions introduced in <charconv> are basically hardcoded to read and write numbers based on the formatting laid out for the default C locale. There is no way to change the formatting, but since the formatting doesn't have to be flexible, they can be very fast…
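To illustrate the locale point, the sketch below imbues a stream with a custom `std::numpunct` facet that uses a comma as the decimal point and compares the output with `std::to_chars`. The names `comma_decimal`, `via_stream`, and `via_to_chars` are invented for this example. A hand-written facet is used so the example does not depend on which named locales happen to be installed. The floating-point `to_chars` overload is assumed to be available.

```cpp
#include <charconv>
#include <locale>
#include <sstream>
#include <string>
#include <system_error>

// Hypothetical facet: behaves like the classic locale except that the
// decimal point is a comma.
struct comma_decimal : std::numpunct<char> {
    char do_decimal_point() const override { return ','; }
};

// Stream formatting consults the imbued locale on every conversion.
std::string via_stream(double value) {
    std::ostringstream out;
    // The locale takes ownership of the facet.
    out.imbue(std::locale(std::locale::classic(), new comma_decimal));
    out << value;
    return out.str();
}

// to_chars ignores locales entirely and always writes '.'.
std::string via_to_chars(double value) {
    char buffer[64];
    auto [end, ec] = std::to_chars(buffer, buffer + sizeof buffer, value);
    return ec == std::errc{} ? std::string(buffer, end) : std::string{};
}
```

Here `via_stream(1.5)` produces `1,5`, while `via_to_chars(1.5)` produces `1.5` regardless of any locale settings.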

Michael Kenzel