I am trying to add the appropriate articles to the beginning of French country names. That part wasn't an issue, but printing the result hasn't worked so well. The file contains accented characters, and while the program seems to read them just fine, it doesn't print them correctly. I am using Visual Studio, and here are examples of words from the text file, named "FrenchCountriesUnfinished.txt", that I am trying to print:

l'Algérie, la Norvège, la Zélande, la Suède, le Zaïre

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

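int main()
{
    // Open the input file; stop early if it cannot be read.
    ifstream file("FrenchCountriesUnfinished.txt");
    if (!file) {
        cerr << "Could not open FrenchCountriesUnfinished.txt" << endl;
        return 1;
    }

    string str;

    // Read the file line by line and echo each line's raw bytes to the console.
    while (getline(file, str)) {
        cout << str << endl;
    }
    return 0;
}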
  • What's the character encoding of that file? – πάντα ῥεῖ Oct 06 '20 at 23:59
  • Please provide a [mcve] showing the actual problem in action. Note that handling Unicode characters (not "special" characters) in file I/O is a bit easier (especially if you let C++ streams+facets do the hard work for you) than handling Unicode in the Windows console (which can be tricky - there are *numerous* questions on StackOverflow on that very topic). – Remy Lebeau Oct 07 '20 at 00:04
  • After reading some articles to try to understand what you mean, I think it may be Unix (LF) or ANSI. If that doesn't make sense that means idk what I'm searching for or what that means lol – Cashton Holbert Oct 07 '20 at 00:15
  • @CashtonHolbert: If the file is a (terrible) ASCII superset like latin-1 or cp1252, and your terminal is (the only good ASCII superset) UTF-8, the bytes will be incompatible. Welcome to the "wonderful" world of text encodings! – ShadowRanger Oct 07 '20 at 00:28
  • Well, I can make my own file for this, and since that is the case, what you're saying is that I can change the file's text encoding or something to align with the terminal text encoding for Visual Studio? @ShadowRanger – Cashton Holbert Oct 07 '20 at 00:33
  • The problem is that the Windows console doesn't use UTF-8 so special characters don't print right - if you redirect the output to a file it will work fine. – Jerry Jeremiah Oct 07 '20 at 00:36
  • Does this help? https://stackoverflow.com/questions/13391252/how-to-print-latin-characters-to-the-c-console-properly-on-windows one answer suggests `_setmode(_fileno(stdout), _O_U16TEXT);` would work. – Jerry Jeremiah Oct 07 '20 at 00:38
  • @CashtonHolbert: Yes, though that depends heavily on the terminal. `cmd.exe` defaults to [cp437](https://en.wikipedia.org/wiki/Code_page_437) last I checked, and that's an encoding basically no file is encoded in (it's great for line drawing and icons; not so much for expanding the text options). There are C++ libraries (as Remy Lebeau mentioned) and Windows APIs that can convert between encodings, which is likely the only "good" solution. – ShadowRanger Oct 07 '20 at 00:40
  • Thanks, I'll still see what I can do to print it out in console but that sounds wayyyy more sensible @JerryJeremiah – Cashton Holbert Oct 07 '20 at 00:42
  • I definitely think this looks like it would work for you: https://stackoverflow.com/questions/13391252/how-to-print-latin-characters-to-the-c-console-properly-on-windows/13393518#13393518 – Jerry Jeremiah Oct 07 '20 at 00:44
  • UTF-8 support in the console depends on the Windows version and must be enabled explicitly. See for example: https://stackoverflow.com/questions/64235341/is-there-a-simple-way-to-print-out-special-characters-from-a-file-to-console-in. – Phil1970 Oct 07 '20 at 02:13
  • Have a look at this: https://stackoverflow.com/questions/2492077/output-unicode-strings-in-windows-console-app – Jerry Jeremiah Oct 07 '20 at 03:44
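
To make the suggestions in the comments concrete, here is a minimal sketch of one possible approach. It rests on assumptions nobody in the thread confirmed: that the file is encoded as Latin-1 (ISO-8859-1), which matches cp1252 for é, è and ï but not for every character, and that the console can display UTF-8 once its output code page is switched. The helper names `latin1_to_utf8`, `enable_utf8_console` and `print_file_as_utf8` are hypothetical and made up for this illustration; `SetConsoleOutputCP` and `CP_UTF8` are the actual Windows API names, and that call is compiled only on Windows.

```cpp
#include <fstream>
#include <iostream>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: convert a Latin-1 (ISO-8859-1) byte string to UTF-8.
// Assumption: the input really is Latin-1. cp1252 differs in the 0x80-0x9F
// range, so characters such as the "oe" ligature would need a lookup table.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII bytes are identical in both encodings.
            out.push_back(static_cast<char>(c));
        } else {
            // Code points U+0080..U+00FF become two UTF-8 bytes.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// Hypothetical helper: ask the Windows console to interpret output as UTF-8.
// On other platforms this does nothing.
void enable_utf8_console() {
#ifdef _WIN32
    SetConsoleOutputCP(CP_UTF8);
#endif
}

// Hypothetical helper: read a Latin-1 file line by line and write it to the
// given stream as UTF-8.
void print_file_as_utf8(const std::string& path, std::ostream& os) {
    std::ifstream file(path);
    std::string line;
    while (std::getline(file, line)) {
        os << latin1_to_utf8(line) << '\n';
    }
}
```

In `main`, this would be used by calling `enable_utf8_console()` once and then `print_file_as_utf8("FrenchCountriesUnfinished.txt", std::cout)`. If the file is already UTF-8, the conversion step should be skipped, since converting twice garbles the accented letters. Whether the glyphs then render also depends on the console font and the Windows version, as noted in the comments.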
