I was using std::wfstream and found that it was failing to write. It turned out the culprit was an em dash (U+2014).

wchar_t mdash[] = { 0x2014, 0x0000 };

std::wfstream os("filename.txt", std::ios_base::out | std::ios_base::trunc);
os << mdash;
assert(!os.bad()); // fails

I don't control what gets dumped to the file, so I needed a way to write it out without the stream failing. So I wrote this function, based on this answer.

void set_locale_on_stream(std::wfstream &os)
{
    char* locale = setlocale(LC_ALL, "English"); // Set the CRT locale to "English" and get its full name.
    std::locale lollocale(locale); // Build a C++ locale with the same name.
    setlocale(LC_ALL, locale); // Re-apply that name to the CRT (this does not restore the previous locale).
    os.imbue(lollocale); // Imbue the stream with the locale built above.
}

This worked, except that my numbers now get grouping separators inserted, and since they are printed in hex, the output is useless.

Is there a way to stop that from happening?
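
One approach that is often suggested for this kind of problem, sketched here under assumptions rather than taken from this thread, is to build a locale that keeps everything from the locale you want except the numeric category, which is copied from the classic "C" locale. The character conversion facet lives in a different category, so it is left alone. In the sketch below, grouping_punct is a hypothetical facet that stands in for a named locale such as "English" (so the example does not depend on which locales are installed), and without_numeric_formatting and format_with are illustrative helper names.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet that mimics a locale with thousands grouping,
// standing in for a named locale such as "English".
struct grouping_punct : std::numpunct<wchar_t> {
protected:
    wchar_t do_thousands_sep() const override { return L','; }
    std::string do_grouping() const override { return "\3"; }
};

// Returns a copy of loc whose numeric category (num_put, num_get, numpunct)
// comes from the classic "C" locale, so no grouping separators are inserted.
std::locale without_numeric_formatting(const std::locale& loc)
{
    return std::locale(loc, std::locale::classic(), std::locale::numeric);
}

// Formats value through a wide stream imbued with loc.
std::wstring format_with(const std::locale& loc, int value)
{
    std::wostringstream out;
    out.imbue(loc);
    out << value;
    return out.str();
}
```

With a wide file stream, the same idea would be os.imbue(without_numeric_formatting(lollocale)) in place of os.imbue(lollocale).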

Adrian
  • You can `setlocale(LC_ALL, nullptr)` to just query the current locale without changing it. – aschepler Nov 16 '18 at 04:40
  • @aschepler, what will that give me? Am I to constantly switch between locales each time I output a number? – Adrian Nov 16 '18 at 05:12
  • This doesn't work. You start with `fstream` but `set_locale_on_stream` uses `wfstream`. Your code seems to write a memory address, if I duplicated this correctly. What is your expected output and what do you see? – Barmak Shemirani Nov 16 '18 at 05:44
  • @BarmakShemirani, yeah, that was a typo. Fixed – Adrian Nov 16 '18 at 06:07
  • @Adrian I just mean the function you have could be simpler. It's not a solution. – aschepler Nov 16 '18 at 13:44
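
As a rough illustration of the simplification aschepler describes, passing a null pointer to setlocale only queries the CRT's current locale name and changes nothing. The helper names below are hypothetical, and as noted in the comments above, this only simplifies the function; it does not address the grouping separators.

```cpp
#include <clocale>
#include <locale>
#include <string>

// Hypothetical helper: returns the CRT's current locale name without changing it.
std::string current_crt_locale_name()
{
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}

// Hypothetical helper: builds a C++ locale with the same name as the CRT locale.
// Assumes the name is one that std::locale accepts; a mixed, per-category name
// may be rejected with std::runtime_error.
std::locale locale_from_crt()
{
    return std::locale(current_crt_locale_name().c_str());
}
```

A program that never calls setlocale starts in the "C" locale, so the query returns "C" until something changes it.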

1 Answer

fs.imbue(std::locale(fs.getloc(), new std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>)) is needed to set the locale. Unfortunately, codecvt_utf16 is deprecated (since C++17) and has no replacement as of yet.
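
For context, here is a minimal sketch of how that imbue call might be used, assuming a toolchain that still ships the deprecated <codecvt> header (it may produce deprecation warnings). The function name write_utf16le is hypothetical.

```cpp
#include <codecvt>  // deprecated since C++17; assumed to still be available
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: writes text to path as UTF-16LE, without a BOM.
bool write_utf16le(const std::string& path, const std::wstring& text)
{
    std::wofstream fout;
    // Imbue before opening so the conversion facet is in place for the first write.
    // The locale takes ownership of the facet allocated with new.
    fout.imbue(std::locale(fout.getloc(),
        new std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>));
    // Binary mode keeps newline translation away from the UTF-16 bytes.
    fout.open(path, std::ios::binary | std::ios::trunc);
    if (!fout)
        return false;
    fout << text;
    fout.close();  // flushes; sets failbit if the final write failed
    return !fout.fail();
}
```

Only the codecvt facet is replaced here, so number formatting stays as it was in the stream's original locale.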

Instead, you can open the file in binary mode and use pubsetbuf. The effect of pubsetbuf on a file stream is implementation-defined, so this relies on Visual C++ behavior. Note that if the file does not start with a 2-byte BOM, a text editor may not recognize it as UTF-16LE.

#include <fstream>
#include <string>
#include <windows.h> // MessageBoxW

int foo()
{
    wchar_t mdash[] = L"—  Test";
    const wchar_t *filename = L"filename.txt";

    wchar_t wbuf[128];
    std::wofstream fout(filename, std::ios::binary);
    if(fout)
    {
        fout.rdbuf()->pubsetbuf(wbuf, 128);

        //optional BOM
        wchar_t bom[1] = { 0xFEFF };
        fout.write(bom, 1);

        fout << mdash;
        fout.close();
    }

    std::wifstream fin(filename, std::ios::binary);
    if(fin)
    {
        fin.rdbuf()->pubsetbuf(wbuf, 128);

        //optional, skip BOM
        if(fin.peek() == 0xFEFF)
            fin.get();

        std::wstring wstr;
        if(fin >> wstr) // reads up to the first whitespace, i.e. just the dash
            MessageBoxW(0, wstr.c_str(), 0, 0);
        fin.close();
    }
    return 0;
}
Barmak Shemirani
  • I'm fine with UTF-8 which is what I'm getting now. – Adrian Nov 16 '18 at 06:17
  • Using extended characters in C++ source (at least with VC++) can cause issues, IIRC. – Adrian Nov 16 '18 at 06:19
  • For UTF-8, just convert `mdash` to UTF-8 and use `std::fstream` as if it were writing ASCII. It won't need any additional changes. Visual Studio's implementation of `fstream` can also accept a UTF-16 filename, so nothing gets broken. – Barmak Shemirani Nov 16 '18 at 06:20
  • Oh yeah, forgot about the [ATL and MFC String Conversion Macros](https://msdn.microsoft.com/en-us/library/87zae4a3.aspx). I don't really like that you are writing a text file as binary though. – Adrian Nov 16 '18 at 06:30
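
The UTF-8 route from the comments above could look roughly like the sketch below. It uses a small hand-written encoder instead of the ATL/MFC conversion macros so that it stays self-contained; to_utf8 and write_dash_and_hex are hypothetical names, and the encoder does not reject surrogates or values above 0x10FFFF.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
// Surrogates and out-of-range values are not rejected in this sketch.
std::string to_utf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: writes an em dash, a space, and value in hex
// through a narrow stream that keeps its default locale.
bool write_dash_and_hex(const std::string& path, unsigned value)
{
    std::ofstream fout(path, std::ios::out | std::ios::trunc);
    if (!fout)
        return false;
    fout << to_utf8(0x2014) << ' ' << std::hex << value;
    fout.close();
    return !fout.fail();
}
```

Because no locale is imbued on the narrow stream, the UTF-8 bytes pass through unchanged and the number is formatted by whatever locale the stream already had, which is the classic "C" locale unless the global C++ locale was changed.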