In the end, the Boost documentation does a good job of answering my question, but you have to do some reading, and it helps to understand std::locale better than I did at the time of posting.
Plays nicely with the std
A std::locale is a collection of facets. The standard defines a set of facets that each locale must provide, but beyond that it seems most is left to the implementation, including locale behavior and the names of the locales.
What boost::locale does is provide a bunch of facets, collected into locales, that behave the same way regardless of platform (at least if you are using the default ICU backend).
So boost::locale provides a standardized set of std::locale objects that behave consistently across platforms, offer full Unicode support for a wide range of cultural norms, and use consistent naming. Switching between a non-Boost std::locale (i.e. an implementation-provided locale) and a Boost.Locale one is trivial since they are the same type: both are collections of facets (classes derived from std::locale::facet), although the implementations differ. Chances are the Boost.Locale locales do a better job of doing what you want.
Complete Unicode support, for all encodings, on all platforms
Further, boost::locale provides a way of accessing complete Unicode support through ICU, which gives you the benefits of ICU without its poor (not C++ish) interface.
This is advantageous, since any standard support for Unicode is very likely to come through the locale framework, and any Unicode-aware program is likely going to need to be locale-aware as well (for collation, for example).
Saner behavior regarding numbers
Finally, boost::locale addresses what could legitimately be called a significant flaw in the usual implementations of std::locale: any stream-formatted number will be affected by the locale, regardless of whether this is desirable. See the Boost documentation for a detailed discussion.
So if you are using an fstream to read or write a file, and you have set the global locale to your platform's German locale, your floats will use a comma as the decimal separator. If you're reading or writing a CSV file, that might be a problem. If you use a Boost.Locale locale as your global locale, this will only happen if you explicitly tell it to use locale conventions for your numeric input/output. Note that many libraries use locale info in the background, including boost::lexical_cast. So does std::to_string, for that matter. So consider the following example:
#include <iostream>
#include <locale>
#include <string>
#include <boost/locale.hpp>

int main()
{
    auto demo = [](const std::string& label)
    {
        std::cout.imbue(std::locale()); // imbue cout with the current global locale
        float f = 1234.567890f;
        std::cout << label << "\n";
        std::cout << "\t streamed: " << f << "\n";
        std::cout << "\t to_string: " << std::to_string(f) << "\n";
    };

    std::locale::global(std::locale("C")); // the default
    demo("c locale");

    // Platform German locale; throws std::runtime_error if it is not installed.
    std::locale::global(std::locale("de_DE"));
    demo("std de locale");

    boost::locale::generator gen;
    std::locale::global(gen("de_DE.UTF-8")); // Boost.Locale German locale
    demo("boost de locale");
}
Gives the following output:
c locale
    streamed: 1234.57
    to_string: 1234.567871
std de locale
    streamed: 1.234,57
    to_string: 1234,567871
boost de locale
    streamed: 1234.57
    to_string: 1234,567871
In code that handles both human communication (output to a GUI or terminal) and inter-machine communication (CSV files, XML, etc.), this is likely undesirable behavior. When using a Boost locale, you explicitly specify when you want locale formatting, like so:
std::cout << boost::locale::as::currency << 123.45 << "\n";
std::cout << boost::locale::as::number << 12345.666 << "\n";
Conclusion
It would seem that Boost.Locale locales should be preferred over the system-provided locales.