I'm trying to make a simple program to enumerate the files on my disks, but I'm stuck at the great UTF frontier.
I'm using boost::recursive_directory_iterator to enumerate the files. That works great, but Windows is set to "french canada" and many files and directories have French characters (like é, è, ç). These filenames are not displayed correctly on the screen, even though I'm using wcout: I see a '▒' instead of the accented characters. Even boost::filesystem::ifstream is unable to open these files.
I tried adding "std::locale::global(std::locale(""))", but at first that only threw an exception. I have found that when LANG is set to "" while the program runs, that call no longer throws, but it only sets the "C" locale instead of the one used by the OS (which I expect to be "fr_CA.UTF-8" or "fr_CA.ISO8859-1"). Any other value for LANG brings the exception back...
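To make the failure mode concrete, here is a minimal sketch of how that call could be guarded so the program keeps running when the environment's locale is rejected. The helper name installUserLocale is hypothetical (it is not part of the standard library or Boost), and falling back to the classic "C" locale is only an assumption about what a reasonable fallback would be; it does not solve the display problem by itself.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: try to install the locale named by the environment
// (LANG / LC_ALL) as the global C++ locale. If the runtime rejects that
// name, fall back to the classic "C" locale instead of terminating.
// Returns the name of the global locale that ended up installed.
std::string installUserLocale()
{
    try {
        // std::locale("") asks for the user's preferred locale.
        std::locale::global(std::locale(""));
    } catch (const std::runtime_error&) {
        // Thrown when the locale name is not accepted by the runtime.
        std::locale::global(std::locale::classic());
    }
    // A default-constructed std::locale is a copy of the current global one.
    return std::locale().name();
}
```

Calling installUserLocale() at the start of main() and printing the result shows which locale the runtime actually accepted.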
What must be done to have a cygwin gcc program usable in an i18n world?
I wrote this to test various locale IDs:
#include <iomanip>
#include <iostream>
#include <locale>
#include <string>
using namespace std;

// Try to construct the named locale and report whether it succeeded.
void tryLocale(const string& ID)
{
    try {
        cout << "Trying " << std::setw(18) << std::left << "\"" + ID + "\" ";
        std::locale Loc(ID.c_str());
        cout << "OK (" << Loc.name() << ")" << endl;
    } catch (...) {
        cout << "FAIL" << endl;
    }
}

// Null-terminated list of locale names to test.
const char *Locales[] = { "", "fr", "fr_CA", "fr_CA.UTF-8", "fr_CA.ISO8859-1", "C", 0};

int main()
{
    cout << "Classic = " << std::locale::classic().name() << endl << endl;
    int i = 0;
    do {
        tryLocale(Locales[i]);
    } while (Locales[++i]);
    return 0;
}
That gives me this output (with neither LANG nor LC_ALL set):
Classic = C
Trying "" FAIL
Trying "fr" FAIL
Trying "fr_CA" FAIL
Trying "fr_CA.UTF-8" FAIL
Trying "fr_CA.ISO8859-1" FAIL
With LANG set to "", the first "Trying" line becomes:
Trying "" OK (C)
The exception, when thrown, prints this:
terminate called after throwing an instance of 'std::runtime_error'
what(): locale::facet::_S_create_c_locale name not valid
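Since the message above only appears when the exception goes uncaught, a small sketch of how the same text could be captured per locale name may help when comparing IDs. The function name localeError is hypothetical, and catching std::runtime_error is based on the exception type shown in the message above.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns an empty string if the named locale can be
// constructed, otherwise the what() text of the exception that was thrown.
std::string localeError(const std::string& id)
{
    try {
        std::locale loc(id.c_str());
        return std::string();
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

Printing localeError("fr_CA.UTF-8") next to each "FAIL" line would show whether every ID fails for the same reason.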