I am reading a file through an input stream into a vector of wstrings. The file may contain Russian characters. When it does, printing the wstring to the console gives me an empty line. If the characters are from the English alphabet or are punctuation marks, then everything is all right. How can I fix that? (I use Linux.)

#include <iostream>
#include <fstream>
#include <vector>
#include <string>

void read_text(std::vector<std::wstring> &words)
{
    std::wifstream input("input.txt");

    if( !(input.is_open()) )
    {
        std::cout << "File was not open!" << std::endl;
        throw -1;
    }

    std::wstring input_string;
    input >> input_string;

    while(input)
    {
        words.push_back(input_string);
        input >> input_string;
    }

    input.close();
}

int main()
{
    setlocale(LC_ALL, "ru_RU.UTF-8");

    std::vector<std::wstring> words;
    try             {   read_text(words);   }
    catch(int i)    {   return i;   }

    for (auto i : words)
    {
        std::wcout << i << std::endl;
    }

    return 0;
}
  • Is there a reason you're using wstring? Using string, you can use UTF-8 encoding to represent any of the characters in the Unicode set, which includes Russian, along with almost every other language we know of (including emojis). – Qix - MONICA WAS MISTREATED Jun 23 '20 at 07:13
  • [Here's the difference between `std::string` and `std::wstring`](https://stackoverflow.com/questions/402283/stdwstring-vs-stdstring). – Rohan Bari Jun 23 '20 at 07:24
  • @Qix-MONICAWASMISTREATED I do that because I run into "multi-character character constant", for example with '—' –  Jun 23 '20 at 07:37
  • @Qix-MONICAWASMISTREATED do you know how to delete '—' from std::string? For some reason I cannot do that with std::string::erase –  Jun 23 '20 at 07:50
  • Because of that I am trying to use std::wstring –  Jun 23 '20 at 07:51
  • Stick with UTF-8 and treat every Unicode character as a string: `str.erase(iterator, iterator + strlen("—"));`. Even when using `wchar_t`, you must be able to handle "multibyte" glyphs. There is nothing gained by using `wchar_t` – HAL9000 Jun 23 '20 at 09:33
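
A minimal sketch of the suggestion in the last comment, assuming the text is UTF-8 and held in a std::string. In UTF-8 the '—' character (U+2014) is three bytes, so it does not fit in a single char, but it can be found and erased as a substring. The helper name `remove_all` is hypothetical and not part of the question's code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `text` with every occurrence of
// `needle` removed. It works on bytes, so a multi-byte UTF-8 sequence such
// as "\xE2\x80\x94" (U+2014, the '—' character) is treated as a substring.
std::string remove_all(std::string text, const std::string &needle)
{
    // An empty needle would match at every position and never advance.
    if (needle.empty())
    {
        return text;
    }

    std::size_t pos = 0;
    while ((pos = text.find(needle, pos)) != std::string::npos)
    {
        // Erase exactly as many bytes as the needle occupies and search
        // again from the same position, which also handles adjacent matches.
        text.erase(pos, needle.size());
    }
    return text;
}
```

Called as `remove_all(line, "\xE2\x80\x94")`, it returns `line` without any '—'. Writing the literal directly as "—" should work as well, assuming the source file is saved as UTF-8 and the compiler keeps those bytes unchanged.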

0 Answers