The task at hand
I'm parsing a filename from a UTF-8 encoded XML file on Windows. I need to pass that filename on to a function that I can't change. Internally it uses _fsopen()
which does not support Unicode strings.
Current approach
My current approach is to convert the filename to the user's charset hoping that the filename is representable in that encoding. I'm then using boost::locale::conv::from_utf()
to convert from UTF-8 and I'm using boost::locale::util::get_system_locale()
to get the name of the current locale.
Life is good?
I'm on a German system using code page Windows-1252, so get_system_locale()
correctly yields de_DE.windows-1252. If I test the approach with a filename containing an umlaut everything works as expected.
The Problem
Just to make sure, I switched my system locale to Ukrainian, which uses code page Windows-1251. With a Cyrillic letter in the filename, my approach fails. The reason is that get_system_locale()
still yields de_DE.windows-1252 which is now incorrect.
On the other hand, GetACP()
correctly yields 1252 for the German locale and 1251 for the Ukrainian locale. I also know that Boost.Locale can convert to a given locale as this small test program works as I expect:
#include <boost/locale.hpp>
#include <iostream>
#include <string>
#include <windows.h>

int main()
{
    std::cout << "Codepage: " << GetACP() << std::endl;
    std::cout << "Boost.Locale: " << boost::locale::util::get_system_locale() << std::endl;

    namespace blc = boost::locale::conv;

    // Cyrillic small letter zhe -> \xe6 (ж on 1251, æ on 1252)
    std::string const test1251 = blc::from_utf(std::string("\xd0\xb6"), "windows-1251");
    // Cast through unsigned char so the byte prints as 0..255 instead of a negative value
    std::cout << "1251: " << static_cast<int>(static_cast<unsigned char>(test1251.front())) << std::endl;

    // Latin small letter sharp s -> \xdf (Я on 1251, ß on 1252)
    auto const test1252 = blc::from_utf(std::string("\xc3\x9f"), "windows-1252");
    std::cout << "1252: " << static_cast<int>(static_cast<unsigned char>(test1252.front())) << std::endl;
}
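Putting the two observations together, one workaround I could imagine is building the charset name from the code page number myself. This is only a minimal sketch: charset_from_codepage and ansi_charset_name are hypothetical helper names, and I'm assuming that the "windows-<N>" naming is accepted for every ANSI code page. I have only seen it work for 1251 and 1252.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: turns a code page number into a charset name
// such as "windows-1251". Assumption: this naming scheme is accepted
// by the conversion backend for the code page in question.
std::string charset_from_codepage(unsigned int codepage)
{
    return "windows-" + std::to_string(codepage);
}

#ifdef _WIN32
// Hypothetical helper: charset name for the current ANSI code page,
// as reported by GetACP().
std::string ansi_charset_name()
{
    return charset_from_codepage(GetACP());
}
#endif
```

The result would then be passed as the second argument, e.g. blc::from_utf(filename, ansi_charset_name()), instead of anything derived from get_system_locale().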
Questions
- How can I query the name of the user locale in a format Boost.Locale supports? Using std::locale("").name() yields German_Germany.1252, and using that results in a boost::locale::conv::invalid_charset_error exception.
- Is it possible that the system locale remains de_DE.windows-1252 although I'm supposedly changing it as local admin? Similarly, the system language is German although my account's language is English. (The login screen is German until I log in.)
- Should I stick with using short filenames? They do not seem to work reliably, though.
Fine-print
- Compiler is MSVC18
- Boost is version 1.56.0, backend supposedly winapi
- System is Win7, system language is German, user language English