13

I spent a whole day trying to figure this out with no luck. I looked everywhere but found no working code.

OS: Win XP Sp2 IDE & FRAMEWORK: C++, Qt Creator 2.0.

I am trying to output some Unicode (UTF-8) text to the Windows console, but all I see is gibberish in place of the Unicode characters. I know the Windows console supports Unicode (since Windows 2000), at least according to Wikipedia and many others on the net, but I don't see how to make it work with Qt. Most "solutions" I've seen (and I haven't seen many) use C++ and the WinAPI, which I can't use because that is not the Qt way. I am using QStrings and Qt!

The code is below. I took out all the different things I tried, to keep the code simple for the post. I hope someone can get it to work.

#include <QtCore/QCoreApplication>
#include <QString>
#include <QTextStream>          
#include <QDate>
#include <QFile>
using namespace std;

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    QTextStream qin(stdin);         
    QTextStream qout(stdout);       

    //The last 2 chars in QString each need a double slash for an accent.
    QString szqLine = QString::fromUtf8("abc áéüóöú őű");

    //I want this text console output to be in red text color.
    qout << "Below are some unicode characters: " << endl; 

    //The Win XP console does not display the unicode chars correctly!!
    //The console does not display unicode chars even though it is capable
    //according to Wikipedia.  I just don't know how with Qt.
    //I want this text output in white(or default font color, not red.)
    qout << szqLine << endl;

    //Would be nice to get some unicode input from console too.
    qout << "Write some unicode chars like above: " << endl;
    QString szqInput;
    szqInput = QString::fromUtf8(qin.readLine());
    qout << "You wrote: " << endl;
    qout << szqInput << endl;



    return app.exec();
}
ismail
user440297
  • 2
    Unicode only works if you set a suitable font; generally it doesn't work, because the console defaults to an ANSI code page. See the related questions for similar attempts: http://stackoverflow.com/questions/2849010/output-unicode-to-console-using-c – Steve-o Jan 22 '11 at 05:20
  • ... Apparently you can set things like encoding for console in QT... tried what I could in that regard but all attempts failed. Hope someone knows how to use QT/unicode/console. – user440297 Jan 22 '11 at 05:33
  • 2
    I think you'll need to create a custom QIODevice (or QTextStream) subclass using non-Qt solutions you've found. (Note it's Qt, not QT which is Apple QuickTime.) – Sergei Tachenov Jan 22 '11 at 06:15
  • 1
    This looks wrong: szqInput = QString::fromUtf8(qin.readLine()); QTextStream::readLine already returns a QString, so you implicitly convert it to a QByteArray (most probably via toLatin1()) and then read it in again as UTF-8. Omit the fromUtf8. – Frank Osterfeld Jan 22 '11 at 09:06
  • 1
    @Frank Osterfeld, post it as an answer :) – ismail Jan 22 '11 at 09:41
  • @Frank, even if that is true it does not solve anything. It should read it as unicode UTF16 as Qt claims it supports unicode and default is UTF16 (no?) but I find if you omit fromUTF8 then it causes issues when using the QString down the road. I believe you always have to specify what unicode format to store your string as in QString. Else the QString formatting methods give unpredictable results at best / at worst give you garbage. If you know how to get it to work, post working code :) It would be first working code on the net. – user440297 Jan 22 '11 at 15:34
  • @Sergey, I hope that is not the case. I'd have to use the Win32API... making my code non-portable. XP SP2 is the most widely used OS by far... there should be a way already in Qt to set the codepage and whatnot to what it needs to be. I've played with that already with no luck. Hopefully someone knows enough to post working code. – user440297 Jan 22 '11 at 15:42
  • @user440297: That's absolutely true when converting from byte arrays to strings, but not when operating on strings. If you call fromUtf8(someQString), it converts your string to a byte array and then converts it back to a QString. That's bound to break things. I suggest: define QT_NO_CAST_FROM_ASCII and QT_NO_CAST_TO_ASCII for your project to avoid all implicit conversions, and for the QTextStreams, disable autoDetectUnicode and explicitly set the codec you want to write (output) or expect (input) using setCodec. – Frank Osterfeld Jan 22 '11 at 18:22
  • 1
    It looks like it's not an easy task and you need Win API functions to do this. You might want to take a look at SO question [Output unicode strings in Windows console app](http://stackoverflow.com/questions/2492077/output-unicode-strings-in-windows-console-app) – Piotr Dobrogost Jan 22 '11 at 18:35
  • @user440297, you'll need some #ifdefs anyway to make your code portable, because 1) Qt doesn't have an API to output Unicode to the console, and 2) even if you use the local encoding instead of Unicode, on Windows the console encoding differs from the default system encoding (e.g. for Russian it is IBM866 for the console and WINDOWS-1251 for everything else). So if you absolutely need Unicode, you'll need the Win32 API; if you just want to output localized text, you'll need to set the output encoding properly. That is, unless you somehow make the Windows console accept Unicode input directly. – Sergei Tachenov Jan 23 '11 at 07:10

4 Answers

9

Okay, I did some testing with this code. No special setup for the console is required.

#include <QTextStream>

#ifdef Q_OS_WIN32
#include <windows.h>
#include <iostream>
#else
#include <locale.h>
#endif

class ConsoleTextStream: public QTextStream {
  public:
    ConsoleTextStream();
    ConsoleTextStream& operator<<(const QString &string);
};

ConsoleTextStream::ConsoleTextStream():
  QTextStream(stdout, QIODevice::WriteOnly)
{
}

ConsoleTextStream& ConsoleTextStream::operator<<(const QString &string)
{
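#ifdef Q_OS_WIN32
  // Flush anything QTextStream has buffered so output stays in order,
  // then write the UTF-16 data straight to the console.
  flush();
  WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE),
      string.utf16(), string.size(), NULL, NULL);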
#else
  QTextStream::operator<<(string);
#endif
  return *this;
}

class ConsoleInput: public QTextStream {
public:
  ConsoleInput();
  QString readLine();
};

ConsoleInput::ConsoleInput():
  QTextStream(stdin, QIODevice::ReadOnly)
{
}

QString ConsoleInput::readLine()
{
#ifdef Q_OS_WIN32
  const int bufsize = 512;
  wchar_t buf[bufsize];
  DWORD read = 0;
  QString res;
  do {
    // Stop on failure (e.g. stdin is redirected and is not a console).
    if (!ReadConsoleW(GetStdHandle(STD_INPUT_HANDLE),
                      buf, bufsize, &read, NULL))
      break;
    res += QString::fromWCharArray(buf, read);
  } while (read > 0 && res[res.length() - 1] != '\n');
  // could just do res.truncate(res.length() - 2), but better be safe
  while (res.length() > 0 
         && (res[res.length() - 1] == '\r' || res[res.length() - 1] == '\n'))
    res.truncate(res.length() - 1);
  return res;
#else
  return QTextStream::readLine();
#endif
}

int main()
{
#ifndef Q_OS_WIN32
  setlocale(LC_ALL, "");
#endif
  ConsoleTextStream qout;
  qout << QString::fromUtf8("Текст на иврите: לחם גרוזיני מסורתי הנאפה בתנור לבנים\n");
  qout << QString::fromUtf8("Текст на японском: ※当サイト内コンテンツ・画像・写真データの、転載・転用・加工・無断複製は禁止いたします。\n");
  qout << QString::fromUtf8("Текст на европейском: áéüóöú őű\n");
  qout << flush; // needed on Linux
  ConsoleInput qin;
  QString s = qin.readLine();
  qout << s << endl;
  s = qin.readLine(); // one more time, to ensure we read everything ok
  qout << s << endl;
  return 0;
}

On Windows it prints square boxes for all text except Russian and European. It looks like Lucida Console doesn't have support for Hebrew and Japanese. The funny thing is, when I copy the text from the console to the clipboard and then paste somewhere with Unicode support (e. g. in a browser), it does show up correctly. This proves that Windows actually outputs Unicode, just doesn't display it. Some console font with full Unicode support is needed.

Note that in the example above I have overridden only one operator<<(), but I would need to override them all if I wanted to use them, because they return QTextStream& but aren't virtual, so it is necessary to make them all return ConsoleTextStream&, otherwise something like qout << 1 << someUnicodeString won't work correctly.
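One way to avoid re-declaring every overload is composition with a catch-all template. What follows is only a minimal sketch in standard C++ rather than Qt, so that it compiles anywhere: `ChainedConsole`, `demoChain` and the `[wide:N]` marker are hypothetical names, the `std::ostringstream` member stands in for the underlying QTextStream, and the UTF-16 overload merely records that the special path was taken instead of calling WriteConsoleW.

```cpp
#include <sstream>
#include <string>

// Hypothetical wrapper: every operator<< returns ChainedConsole&, so the
// special overload is still reachable in the middle of a chain.
class ChainedConsole {
public:
    // Catch-all: forward anything the underlying stream already understands.
    template <typename T>
    ChainedConsole& operator<<(const T& value) {
        buffer_ << value;
        return *this;
    }

    // Special case: UTF-16 text. A real implementation would call
    // WriteConsoleW here; this sketch only records that the path was taken.
    ChainedConsole& operator<<(const std::u16string& text) {
        buffer_ << "[wide:" << text.size() << "]";
        return *this;
    }

    std::string str() const { return buffer_.str(); }

private:
    std::ostringstream buffer_;  // stand-in for the wrapped text stream
};

// Demonstrates that an int followed by UTF-16 text still hits the special path.
inline std::string demoChain() {
    ChainedConsole out;
    out << 1 << ' ' << std::u16string(u"ab") << " done";
    return out.str();
}
```

Because the non-template overload is preferred for an exact `std::u16string` argument, the chain never falls back to a base-class reference the way the inherited QTextStream operators do.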

I also tested this example on Linux with UTF-8 locale, works perfectly.

Console input with ReadConsoleW() works because the console is in so-called line input mode by default: it waits until the user hits Enter, but it does not wait until enough characters are available to fill the buffer. That is exactly what we want: it reads one line, provided the buffer is large enough.
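The read loop and the trailing "\r\n" cleanup can be exercised without a console. The sketch below is an illustration under assumptions, not the code from this answer: `readLineFromChunks` is a hypothetical helper, and its `readChunk` callback stands in for a single ReadConsoleW call that returns the characters it read, or an empty string on failure or end of input.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: accumulate chunks until one ends with '\n' (or the
// source reports nothing more), then drop the trailing "\r\n".
inline std::wstring readLineFromChunks(
        const std::function<std::wstring()>& readChunk) {
    std::wstring line;
    for (;;) {
        const std::wstring chunk = readChunk();
        if (chunk.empty())
            break;                    // failure or end of input
        line += chunk;
        if (line.back() == L'\n')
            break;                    // a complete line has arrived
    }
    // Strip any trailing carriage returns and newlines.
    while (!line.empty() && (line.back() == L'\r' || line.back() == L'\n'))
        line.pop_back();
    return line;
}
```

Feeding it chunks such as "hel" followed by "lo\r\n" yields "hello", which mirrors what happens when a typed line is longer than the buffer.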

Sergei Tachenov
4

You're making mistakes in both phases - input and output.

Input

You can't write
QString szqLine = QString::fromUtf8("abc áéüóöú őű");
and hope to have a valid Unicode string as the result because this is not guaranteed by the C++ Standard (see SO question C++ source in unicode for details).

You can check you don't have a valid Unicode string using code like this

foreach(QChar ch, szqLine) {
  qout << ch.unicode();
}

If szqLine were a valid Unicode string you would get a list of Unicode code points of characters in the string. In case of your string you get no output.

The proper way to do it is like this

QChar const chars[] = { 'a', 'b', 'c', ' ', 225, 233, 252, 243, 246, 250, ' ', 337, 369};
QString s(&chars[0], sizeof(chars)/sizeof(QChar));

See QString::QString ( const QChar * unicode, int size ), QChar::QChar ( int code ) Qt functions and Full UTF-8 Character Map for Unicode code points of your characters.
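To double-check which code points the string from the question should contain, the UTF-8 bytes can be decoded by hand. This is a minimal sketch in standard C++ with no Qt involved; `decodeUtf8` and `questionStringUtf8` are hypothetical helpers, the decoder assumes well-formed input, and the literal is spelled with explicit byte escapes so the result does not depend on the encoding of the source file.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode well-formed UTF-8 into Unicode code points.
// Malformed input is not diagnosed; a truncated sequence ends decoding.
inline std::vector<std::uint32_t> decodeUtf8(const std::string& bytes) {
    std::vector<std::uint32_t> codePoints;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        if (i + extra >= bytes.size())
            break;  // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        codePoints.push_back(cp);
        i += extra + 1;
    }
    return codePoints;
}

// The question's string "abc áéüóöú őű" as explicit UTF-8 bytes.
inline std::string questionStringUtf8() {
    return "abc \xC3\xA1\xC3\xA9\xC3\xBC\xC3\xB3\xC3\xB6\xC3\xBA \xC5\x91\xC5\xB1";
}
```

Decoding that string gives 97, 98, 99, 32, 225, 233, 252, 243, 246, 250, 32, 337, 369, which is the list a hard-coded QChar array needs to match.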

Output

The Windows console uses one specific code page for input and another one for output (see Console Code Pages) when you use the standard input/output mechanisms. This constrains the set of characters you can enter and display to those present in the current code page. However, you can use the WriteConsole Win API function to output any Unicode string encoded in UTF-16. You can't avoid a Win API function here, because no Qt API covers this. Below is a complete example showing how to display the characters from your question on the Windows console.

#include <QtCore/QCoreApplication>
#include <QString>
#include <QTextCodec>

#include <Windows.h>

using namespace std;

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    QChar const chars[] = { 'a', 'b', 'c', ' ', 225, 233, 252, 243, 246, 250, ' ', 337, 369};
    QString s(&chars[0], sizeof(chars)/sizeof(QChar));

    // QString::utf16() already returns a pointer to the UTF-16 data.
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), s.utf16(), s.size(), NULL, NULL);

    return app.exec();
}
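WriteConsoleW only works when standard output really is a console; it fails once the output is redirected to a file or a pipe. The sketch below shows one possible way to handle both cases, using only standard C++ and Win32 calls so that it also compiles on other platforms. `writeUtf8ToStdout` is a hypothetical helper, and taking UTF-8 input (rather than a QString) is an assumption made to keep the example free of Qt.

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <vector>
#endif

// Hypothetical helper: write UTF-8 text to standard output.
// On Windows, when stdout is a real console, convert to UTF-16 and use
// WriteConsoleW; otherwise (redirected output, or another OS) write the
// bytes unchanged.
inline bool writeUtf8ToStdout(const std::string& utf8) {
    if (utf8.empty())
        return true;
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    if (GetConsoleMode(out, &mode)) {  // succeeds only for console handles
        std::fflush(stdout);           // keep ordering with buffered output
        const int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                            static_cast<int>(utf8.size()),
                                            NULL, 0);
        if (len <= 0)
            return false;
        std::vector<wchar_t> wide(len);
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                            static_cast<int>(utf8.size()), wide.data(), len);
        DWORD written = 0;
        return WriteConsoleW(out, wide.data(), static_cast<DWORD>(len),
                             &written, NULL) != 0;
    }
#endif
    const bool ok =
        std::fwrite(utf8.data(), 1, utf8.size(), stdout) == utf8.size();
    std::fflush(stdout);
    return ok;
}
```

With a QString at hand, something like `writeUtf8ToStdout(s.toUtf8().constData())` would be one way to call it, although the console branch could equally take the UTF-16 data directly.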
Piotr Dobrogost
  • Thanks Piotr, I'm not sure that you're correct about the invalidity of QString szqLine = QString::fromUtf8("abc áéüóöú őű"); In a previous experiment I did a similar test to what you propose and [CODE SAMPLE] foreach (QChar cqCharacter, szqLine22){ out << cqCharacter << endl; } [/CODE SAMPLE] worked fine. It wrote each QChar perfectly to a file on its own line. Also, all QString methods like .length() etc. give the correct result. – user440297 Jan 22 '11 at 16:52
2

I think it's because the output needs to go through WriteConsoleW instead of WriteFile internally, and the runtime library might not use that function. If it doesn't use WriteConsoleW, then you can't write Unicode to the console.

user585359
0

I got a little farther today with the code below. Now it displays Unicode correctly on the console, but it is still not working quite right: the console freezes/crashes after the first Unicode text is written, and nothing after that is shown. It is as if Unicode characters confuse the console buffer after the first text output.

#include <QtCore/QCoreApplication>
#include <QString>
#include <QTextStream>
using namespace std;

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

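    QTextStream qin(stdin);
    qin.setCodec("UTF-8");
    // Note: autoDetectUnicode() only queries the setting; it changes nothing.
    qin.autoDetectUnicode();

    QTextStream qout(stdout);
    qout.setCodec("UTF-8");
    qout.autoDetectUnicode();  // likewise only a getter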

    //The last 2 chars in QString each need a double slash for an accent.
    QString szqLine = QString::fromUtf8("abc áéüóöú őű");

    qout << "Below are some unicode characters: " << endl;

    //Now this is displayed correctly on the console, but after displaying
    //the text it is no longer capable of displaying anything subsequently.
    qout << szqLine << endl;

    //Would be nice to get some unicode input from console too.
    qout << "Write some unicode chars like above: " << endl;
    QString szqInput;
    szqInput = qin.readLine();
    qout << "You wrote: " << endl;
    qout << szqInput << endl;

    return app.exec();
}
user440297
  • @user440297 Because I do not use UTF-8 encoding for my source files, I changed your code to use my method of creating the QString. When I run this code I get the right characters printed, but in addition I get some garbage after them and the prompt `Write some unicode chars like above:` at the bottom. The program does not crash and the console does not freeze; I can input characters, and after hitting Enter they are displayed. – Piotr Dobrogost Jan 22 '11 at 19:44
  • @Piotr, I tried your hard-coded QString initialization method using numeric Unicode code points. Without WriteConsoleW there is no benefit: exactly the same effect. The console displays the first Unicode output correctly, then freezes. I think it does not freeze for you because you are using the WinAPI to actually send the string to the console. I have a feeling it has to do with the BOM not being correct when using QTextStream instead of WriteConsoleW (WinAPI) to output to stdout. Not sure how to fix that. I don't want to use the WinAPI because that excludes Linux and Mac. – user440297 Jan 22 '11 at 19:59
  • @user440297 As to excluding Linux and Mac; it's not like this. You have to use what works for each system using #IFDEFs. – Piotr Dobrogost Jan 22 '11 at 20:14
  • @Piotr, I tried what you're suggesting and it made no difference. If you don't mind post the code you are using. There must be some misunderstanding between us. Your point about the conditional implementation possibility is well taken. – user440297 Jan 22 '11 at 20:20
  • @Piotr, thanks for the code. It makes no difference. It freezes the exact same way. I tried with the src file encoded as UTF-8 and also as SystemDefault(Which is prob what you have your src set at in Qt Creator). No difference at all. – user440297 Jan 22 '11 at 20:39
  • @user440297 I'm running Vista x64. As you can see your code behaves differently on two different platforms and it does not work on either of them. That's expected because this code is not correct. – Piotr Dobrogost Jan 22 '11 at 20:50
  • @Piotr, that may account for the difference we are experiencing. I agree that my code is not correct, but the question is what is wrong. I believe that the issue is with how QTextStream behaves and not with how the QString is initialized, since your initialization method did not fix the issue. It is logical to assume that the problem is with how the QString is dumped into the Windows console, and that the issue is worse on the 10-year-old XP OS, though not much better in newer Windows versions. – user440297 Jan 22 '11 at 20:56
  • @user440297 "it is logical to assume that the problem is with how the QString is dumped into the Windows Console" Yes, that's the problem. Creation of QString with right characters is already handled. – Piotr Dobrogost Jan 22 '11 at 22:03
  • @Piotr, your WriteConsoleW method works for outputting text to the user. Could you provide an example of an equivalent WinAPI solution for reading a line of Unicode input from a user at the console into a QString? That would help a lot. Perhaps ReadConsoleW or something like that. I'm not sure what the reliable way to do this would be. I am new to the WinAPI. Thanks. – user440297 Jan 22 '11 at 23:12
  • @user440297 Take a look at [ReadConsole Function](http://msdn.microsoft.com/en-us/library/ms684958%28v=vs.85%29.aspx) – Piotr Dobrogost Jan 23 '11 at 00:23
  • Not easy... is it? :) Need to specify num chars to read. So it does not read until ENTER is hit. Too bad. – user440297 Jan 23 '11 at 01:11
  • @user440297, in fact it does, and [the docs say that, only in different place](http://msdn.microsoft.com/en-us/library/ms683457%28v=vs.85%29.aspx). See the edit to my answer, it works. – Sergei Tachenov Jan 24 '11 at 06:49