
I would like my debug handler (installed with qInstallMsgHandler) to handle UTF-8. However, it seems it can only be defined as void myMessageOutput(QtMsgType type, const char *msg), and const char* doesn't handle UTF-8 (once displayed, it's just random characters).

Is there some way to define this function as void myMessageOutput(QtMsgType type, QString msg), or maybe some other way to make it work?

This is my current code:

void myMessageOutput(QtMsgType type, const char *msg) {
    QString message = "";

    QString test = QString::fromUtf8(msg);

    // If I break into the debugger here, both "test" and "msg" contain a question mark.

    switch (type) {

        case QtDebugMsg:
        message = QString("[Debug] %1").arg(msg);
        break;

        case QtWarningMsg:
        message = QString("[Warning] %1").arg(msg);
        break;

        case QtCriticalMsg:
        message = QString("[Critical] %1").arg(msg);
        break;

        case QtFatalMsg:
        message = QString("[Fatal] %1").arg(msg);
        abort();

    }

    Application::instance()->debugDialog()->displayMessage(message);
}



Application::Application(int argc, char *argv[]) : QApplication(argc, argv) {
    debugDialog_ = new DebugDialog();
    debugDialog_->show();

    qInstallMsgHandler(myMessageOutput);

    qDebug() << QString::fromUtf8("我");
}
laurent

4 Answers


If you step through the code in the debugger you will find out that QDebug and qt_message first construct a QString from the const char* and then use toLocal8Bit on this string.

The only way I can think of to circumvent this: use your own encoding (something like "[E68891]") or some other encoding such as uuencode or base64 that uses only ASCII characters, and decode the string in your message handler.
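As a rough illustration of the ASCII-only workaround, here is a minimal sketch in standard C++ (no Qt) that wraps the UTF-8 bytes of a string as "[E68891]" and unwraps them again. The names encodeHex, decodeHex and hexValue are hypothetical helpers made up for this example, not part of Qt, and the bracket format is just the one suggested above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: wrap the raw bytes of `utf8` as "[E68891]".
std::string encodeHex(const std::string &utf8)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out = "[";
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(utf8[i]);
        out += digits[byte >> 4];    // high nibble
        out += digits[byte & 0x0F];  // low nibble
    }
    out += ']';
    return out;
}

// Value of one hex digit, or -1 if `c` is not a hex digit.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical helper: reverse encodeHex(). Anything that is not
// "[<even number of hex digits>]" is returned unchanged.
std::string decodeHex(const std::string &text)
{
    const std::size_t n = text.size();
    if (n < 2 || n % 2 != 0 || text[0] != '[' || text[n - 1] != ']')
        return text;

    std::string out;
    for (std::size_t i = 1; i + 1 < n - 1; i += 2) {
        const int hi = hexValue(text[i]);
        const int lo = hexValue(text[i + 1]);
        if (hi < 0 || lo < 0)
            return text;  // not our format, leave it alone
        out += static_cast<char>((hi << 4) | lo);
    }
    return out;
}
```

In a handler, the decoded bytes could then be passed to QString::fromUtf8. Note that a message produced by qDebug() << QString(...) also carries quotes and a trailing space, so the bracketed token would have to be extracted first; this sketch only decodes a token on its own.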

You should also consider using the form qDebug("%s", "string") to avoid quotes and additional whitespace (see this question).

Edit: the toLocal8Bit call happens in the destructor of QDebug, which is called at the end of a qDebug statement (qdebug.h line 85). At least on the Windows platform this calls toLatin1, thus misinterpreting the string. You can prevent this by calling the following lines at the start of your program:

QTextCodec *codec = QTextCodec::codecForName("UTF-8");
QTextCodec::setCodecForLocale(codec);

On some platforms UTF-8 seems to be the default text codec.

hmuelner
  • Absolutely brilliant, it finally works! It seems that the default on Windows is indeed Local8Bit instead of UTF-8. Adding these two lines fixed the issue. Thanks so much for your help! – laurent Aug 01 '11 at 11:41

Try passing the data as UTF-8 and decoding it in your handler with something like

QString::fromUtf8

which takes a const char* as input.

Raiv
  • Thanks, I just tried that but it didn't help. When I check the message string in the debugger, it only shows a question mark. When I convert it with QString::fromUTF8, it's still a question mark. I've appended my code. Any idea what I could be doing wrong? – laurent Jul 25 '11 at 15:55
  • try something like `"\0xE6\0x88\0x91"` instead of `"我"`, if it will be OK then trouble is with encoding. – Raiv Jul 26 '11 at 09:24
  • I just tried that but it didn't work. In that case "msg" only contains a space. I'm not sure what else to try. I'm so surprised they didn't make msg a QString to begin with, to go around all these issues. – laurent Jul 26 '11 at 10:32
  • hmmm. then try something like `QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8"));` in your main function just after QApplication declaration... – Raiv Jul 26 '11 at 10:42
  • Thanks for all your suggestions but somehow it still doesn't work. I just tried setting the codec to UTF-8 but the message still comes out as a "?" – laurent Jul 27 '11 at 12:35
  • if it is "?" maybe your font has no symbol to display? try to change font to something that can draw hieroglyph, for example font used in your IDE – Raiv Jul 27 '11 at 13:01

The problem is that the operator<<(const char *) method expects a Latin1-encoded string, so you should pass a proper UTF-8 QString to QDebug like this:

qDebug() << QString::fromUtf8("我");

... and from inside the message handler expect a UTF-8 string:

QString message = QString::fromUtf8(msg);

And that should work like a charm. ;)
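To make the Latin1 assumption concrete, the following sketch (plain standard C++, no Qt) shows what happens when UTF-8 bytes are read as if each byte were one Latin-1 character and are then written out as UTF-8 again. The function name latin1ToUtf8 is a hypothetical helper for this illustration only; it is not how Qt implements the conversion, just a model of the effect.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: treat every byte of `bytes` as one Latin-1
// character (U+0000..U+00FF) and encode that character as UTF-8.
std::string latin1ToUtf8(const std::string &bytes)
{
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        if (c < 0x80) {
            // ASCII is identical in Latin-1 and UTF-8.
            out += static_cast<char>(c);
        } else {
            // U+0080..U+00FF need two bytes in UTF-8.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Feeding it the three UTF-8 bytes of "我" (E6 88 91) yields six bytes (C3 A6 C2 88 C2 91), that is, three unrelated characters instead of one. Wrapping the literal in QString::fromUtf8 before streaming it avoids that misreading.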

For more information please read the QDebug reference manual.

You could also do the wrong thing: keep passing UTF-8 encoded strings via << and convert the strings with the horrible QString::fromUtf8(QString::fromUtf8(msg).toAscii().constData()) call.

Edit: This is the final example that works:

#include <QString>
#include <QDebug>
#include <QMessageBox>
#include <QApplication>

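// Qt 4 message handler: msg arrives as raw bytes, so decode it as UTF-8.
void
myMessageOutput(QtMsgType type, const char *msg)
{
    Q_UNUSED(type);  // every message type is shown the same way here
    QMessageBox::information(NULL, QString(), QString::fromUtf8(msg), QMessageBox::Ok);
}

int
main(int argc, char *argv[])
{
    QApplication app(argc, argv);

    qInstallMsgHandler(myMessageOutput);
    qDebug() << QString::fromUtf8("我");

    return 0;
}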

Please note that QDebug doesn't do any charset conversion if you don't instantiate QApplication. In that case you wouldn't need to do anything special to msg inside the message handler, but I STRONGLY recommend that you instantiate it.

One thing you must be sure of is that your source code file is encoded in UTF-8. To check, you can use a suitable tool (file on Linux, for example) or just call QMessageBox::information(NULL, NULL, QString::fromUtf8("我"), QMessageBox::Ok) and see whether the proper message appears.
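Another way to see what a string really contains is to dump its bytes. The sketch below is plain standard C++; dumpBytes is a hypothetical helper name chosen for this example. It renders each byte of a NUL-terminated string as \xNN so the result can be compared with the expected UTF-8 sequence.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: render every byte of `msg` as "\xNN" (lowercase hex).
std::string dumpBytes(const char *msg)
{
    std::string out;
    if (!msg)
        return out;

    const unsigned char *c = reinterpret_cast<const unsigned char *>(msg);
    for (; *c != 0; ++c) {
        char buf[5];  // '\\', 'x', two hex digits, terminating NUL
        std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned int>(*c));
        out += buf;
    }
    return out;
}
```

If the source file is saved as UTF-8 and the compiler keeps the bytes as they are, the literal "我" should dump as \xe6\x88\x91. The same helper can be called on msg inside the message handler to see which bytes actually arrive there.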

Christophe Weis
  • Thanks, I did try passing the strings using QString::fromUtf8() but the result is the same (still a "?"). Is there any way to see what actual bytes msg contains? I'm going to update my example to include fromUtf8 – laurent Jul 28 '11 at 05:13
  • Are you instantiating the `QApplication` main object? Are you doing anything special with the charset settings of your application? Please try the simple example I added. It always worked. If it doesn't, please tell me what bytes you're getting from inside the message handler by iterating the string from inside the message handler, like `printf("msg = \""); for (unsigned char *c = (unsigned char *)msg; *c != 0; c++) printf("\\x%02x", (int)*c); printf("\";\n");`. You should get `msg = "\x22\xe6\x88\x91\x22\x20"; `. – Fernando Silveira Jul 28 '11 at 08:59
  • PS: If you're trying to `printf(3)` the input of your message handler you must be sure that your terminal supports UTF-8 properly or else just spaces will be shown. – Fernando Silveira Jul 28 '11 at 09:04
  • See my answer for why this works on your platform but not on Windows. – hmuelner Aug 01 '11 at 09:58
#include <QtCore/QCoreApplication>
#include <stdio.h>
#include <QDebug>

void myMessageOutput(QtMsgType type, const char *msg)
{
  fprintf(stderr, "Msg: %s\n", msg);
}

int main(int argc, char *argv[])
{
  qInstallMsgHandler(myMessageOutput);

  QCoreApplication a(argc, argv);

  qDebug() << QString::fromUtf8("我");
}

The code above works perfectly here, but I must stress that my console supports UTF-8; if it did not, it would show a different character at that location.

  • I must really be missing something. I'm using the same code and it keeps outputting "?" in both message boxes and the console. If I check in the debugger, it's also "?" (while properly encoded characters display correctly). All my files are in UTF-8 and the app can display Chinese characters, but the message handler still cannot. I'm on Windows 7, does it make a difference? (Maybe UTF-8 is properly supported on Mac and Linux but not Windows?) – laurent Jul 30 '11 at 12:08