I have a C++ program that should read a line of text from a file named "Pontaje.txt" and store it in a string. The problem is that the file contains a special character, and the program doesn't handle it properly.

Pontaje.txt

[604] Dumy | 17501 — Today at 12:01 AM

Note that "—" is a special character (an em dash), not the "-" typed from the keyboard.

main.cpp

#include <iostream>
#include <string>
#include <fstream>

int main()
{
    std::ifstream test("Pontaje.txt");
    std::string test2;
    std::getline(test, test2);
    std::cout << test2;
}

Output:

[604] Dumy | 17501 ΓÇö Today at 12:01 AM

How can I read that line into test2 properly?

Note that I've tried using std::wifstream, std::wstring and std::wcout, but I get the same result.

Robisca
  • See https://stackoverflow.com/questions/4775437/read-unicode-utf-8-file-into-wstring –  Nov 11 '22 at 14:21
  • What happens if you just run: `std::cout << "[604] Dumy | 17501 — Today at 12:01 AM"`? Does it print correctly then? If not, then it is likely to be a problem with your console (what happens if you paste the line there?) or it might be that you are using an old compiler. – Frodyne Nov 11 '22 at 14:31
  • Almost: `[604] Dumy | 17501 ù Today at 12:01 AM`. I don't think the compiler is the problem, because I use Visual Studio with the latest updates. – Robisca Nov 11 '22 at 14:40
  • Have you tried executing it from `Tools > Command Line > Developer Command Prompt`? – rturrado Nov 11 '22 at 14:42

1 Answer

ΓÇö is a tell-tale sign: it's three characters instead of one. That happens when the input is encoded using UTF-8 (a multi-byte character set) but the output is done with a single-byte character set. I can't directly eyeball what character set contains all of ΓÇö, though.

IOW the problem is not the input or the output, but the assumption that they're using the same character set. And std::string makes no assumptions at all about character sets.

MSalters