I need your help.

I'm using Windows 10 and the Visual Studio Community compiler.

I managed to get Lithuanian letters to show in a C++ console application using wstring and wcout.

#include <iostream>
#include <string>
#include <io.h>
#include <fcntl.h>

using namespace std;
int main()
{
   _setmode(_fileno(stdout), _O_U16TEXT);
   wstring a = L"ąėėąčėį";
   wcout << a;

   return 0;
}

The result is exactly what I wanted.

Now I want my program to read Lithuanian letters from the Info.txt file.

This is how far I managed to get.

#include <iostream>
#include <fstream>
#include <io.h>
#include <fcntl.h>
#include <string>

using namespace std;
int main()
{
   _setmode(_fileno(stdout), _O_U16TEXT);
   wstring text;
   wifstream fin("Info.txt");
   getline(fin, text);
   wcout << text;

   return 0;
}

But the string printed in the console application shows different symbols.

A possible solution, I believe

I think I need to add L before the text, as in the previous example with wcout.

wstring a = L"ąėėąčėį";

But I'm still just learning C++ and I don't know how to do that in the example with Info.txt.

I need your help!

Liutauras

1 Answer

UTF8 needs std::ifstream, not wifstream. The latter is used on Windows for UTF16 file storage (not recommended on any system).

You can use SetConsoleOutputCP(CP_UTF8) to enable UTF8 printing, but that can run into problems, especially in C++20.

Instead, call _setmode and convert UTF8 to UTF16.

Make sure Notepad saves the file as UTF8 (the encoding option is available in the Save window).

#include <iostream>
#include <fstream>
#include <string>
#include <io.h>
#include <fcntl.h>
#include <Windows.h>

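std::wstring u16(const std::string& u8)
{
    if (u8.empty()) return std::wstring();
    // Pass the explicit byte length (not -1) so the result has no trailing L'\0'.
    const int len = static_cast<int>(u8.size());
    int size = MultiByteToWideChar(CP_UTF8, 0, u8.data(), len, nullptr, 0);
    if (size <= 0) return std::wstring(); // conversion failed
    std::wstring u16(size, 0);
    MultiByteToWideChar(CP_UTF8, 0, u8.data(), len, &u16[0], size);
    return u16;
}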

int main()
{
    (void)_setmode(_fileno(stdout), _O_U16TEXT);
    std::string text;
    std::ifstream fin("Info.txt");
    if (fin)
        while (getline(fin, text))
            std::wcout << u16(text) << "\n";
    return 0;
}
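
One thing to watch for when saving from Notepad: depending on the Notepad version and the encoding option chosen, the file may start with a UTF-8 byte order mark (the bytes EF BB BF), which would otherwise be converted and printed as an extra character at the start of the first line. A minimal sketch of stripping it, assuming a hypothetical helper named strip_utf8_bom that is not part of the code above:

```cpp
#include <string>

// Hypothetical helper (an assumption, not part of the answer's code):
// returns the line without a leading UTF-8 byte order mark, if one is present.
std::string strip_utf8_bom(std::string line)
{
    const std::string bom = "\xEF\xBB\xBF";
    // compare() only matches when the first three bytes are exactly the BOM;
    // shorter or empty lines are returned unchanged.
    if (line.compare(0, bom.size(), bom) == 0)
        line.erase(0, bom.size());
    return line;
}
```

It would only need to be applied to the first line read from the file, before that line is passed to u16().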
Barmak Shemirani
  • "*UTF8 needs `std::ifstream`*" - only if you are going to read the UTF-8 bytes yourself, such as into a `std::string` or `std::u8string`. Otherwise, you can use `std::wifstream` to read data as `std::wstring` if you `imbue()` a UTF-8 locale into the stream so it can decode the UTF-8 data into `wchar_t` data. – Remy Lebeau Oct 21 '21 at 19:43
  • @RemyLebeau Can you post an answer, or link, which shows how to do that? Or show an example where `std::u8string` is useful for anything at all in C++ 20 with Windows/MSVC. – Barmak Shemirani Oct 21 '21 at 20:36
  • The linked duplicate answer already shows how to `imbue()` a UTF-8 locale. – Remy Lebeau Oct 21 '21 at 20:39
  • @RemyLebeau Are you referring to deprecated `codecvt_utf8`? Visual Studio will issue a warning and suggests using `MultiByteToWideChar` – Barmak Shemirani Oct 21 '21 at 20:44
  • Did you see the answer that uses `std::locale("zh_CN.UTF-8")` instead? Otherwise, even though `std::codecvt_utf8` is deprecated, it still works. Or, you can always just implement your own `locale` class for handling UTF-8. – Remy Lebeau Oct 21 '21 at 20:54
  • @RemyLebeau, I see, that works better. – Barmak Shemirani Oct 21 '21 at 21:39
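
For reference, the `imbue()` approach discussed in the comments can be sketched roughly as follows. This is a minimal sketch, not the code from the linked answer: the function name `read_first_line_utf8` is hypothetical, and it uses `std::codecvt_utf8`, which is deprecated since C++17 (so Visual Studio may warn about it) and converts to UCS-2 rather than full UTF-16 where `wchar_t` is 16 bits. That should be sufficient for Lithuanian letters.

```cpp
#include <codecvt>  // std::codecvt_utf8 (deprecated since C++17)
#include <fstream>
#include <locale>
#include <string>

// Hypothetical helper: reads the first line of a UTF-8 file as a wide string.
std::wstring read_first_line_utf8(const char* path)
{
    std::wifstream fin;
    // Install the UTF-8 conversion facet before opening the file.
    // The std::locale object takes ownership of the facet pointer.
    fin.imbue(std::locale(fin.getloc(), new std::codecvt_utf8<wchar_t>));
    fin.open(path);

    std::wstring line;
    std::getline(fin, line); // bytes are decoded to wchar_t while reading
    return line;             // empty if the file could not be opened
}
```

With `_setmode(_fileno(stdout), _O_U16TEXT)` already called, the returned string could then be written to `std::wcout` directly, without a separate conversion step.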