
I am trying to store the contents of a QDomDocument to a file. The document contains German umlauts, which don't get saved to the file correctly.

My QDomDocument "document" is structured like this:

<parent>
    <child attribute="äüö"/>
</parent>

I save it to an XML file like this:

QString string = document.toString();
QFile file("/path/to/my/file.xml");
file.open(QIODevice::WriteOnly | QIODevice::Text);
QTextStream txtStream(&file);
txtStream << string;
file.close();

Printing the string with qDebug() at that point shows that the umlauts are still intact. But after writing it to the file, my XML file looks like this:

<parent>
    <child attribute="הצ"/>
</parent>

I tried various possibilities like converting the QString to a different encoding, or setting the stream codec to a different value, but the best I could get was this:

<parent>
    <child attribute="ֳ₪ֳ¼ֳ¶"/>
</parent>

which is even worse.

Please help.

Toby Speight
Don Joe
  • Try instead a `QByteArray` for encoding .. – Mohammad Kanan Feb 06 '18 at 09:46
  • What if you set the codec to your stream like: `txtStream.setCodec("UTF-8");`? – vahancho Feb 06 '18 at 09:46
  • @vahancho: Setting the stream codec to UTF-8 leads to these symbols: ₪ֳ¼ֳ¶ – Don Joe Feb 06 '18 at 09:56
  • @DonJoe, hm, I cannot reproduce the problem with Qt 5.6.2. How do you set the content of your `QDomDocument`? And how do you open the file? – vahancho Feb 06 '18 at 10:08
  • And how do you confirm the contents of the file? I suggest using `od` or similar. Can you reproduce this behaviour with plain `QString` reading and writing (without `QDomDocument`)? That would help you create a [mcve] that's completely relevant. – Toby Speight Feb 06 '18 at 10:11
  • What if you change to `QByteArray string = document.toByteArray()`? – talamaki Feb 06 '18 at 10:13
  • The problem appears only in output xml file .. so it has to do with how the file is encoded – Mohammad Kanan Feb 06 '18 at 10:16
  • I'm on Qt 5.5.1. The problem appears in the output file, yes. I solved my specific problem by setting the encoding of both the output- and input-stream to UTF-8. The contents of the file are still jumbled, but the weird characters become umlauts again when read in Qt. But someone who really needs the contents in the XML file to be correct might still have this problem. – Don Joe Feb 06 '18 at 13:02
  • @talamaki: That worked! Thanks! – Don Joe Feb 06 '18 at 13:10
  • By output and input stream I mean that after writing the XML file, I read it again at a later point. I realized that this might be unclear because I didn't mention it. – Don Joe Feb 06 '18 at 13:16
  • _someone who really needs the contents in the XML file to be correct might still have this problem_: encode xml as `` when exporting https://community.microfocus.com/microfocus/cobol/net_express__server_express/w/knowledge_base/6598/xml-parse-containing-german-umlauts-or-french-accented-characters – Mohammad Kanan Feb 06 '18 at 13:17
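
The suggestion in the comments to inspect the raw bytes of the file (with `od` or similar) can also be sketched in plain C++, without Qt. The helper names `hexDump` and `readBytes` are hypothetical and chosen only for this illustration. In UTF-8, "äüö" is the six bytes `c3 a4 c3 bc c3 b6`; a viewer that decodes those bytes with a single-byte code page shows two glyphs per umlaut, which matches the pattern of the second garbled output in the question. Whether the asker's viewer did exactly that is an assumption.

```cpp
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Formats raw bytes as space-separated lowercase hex, e.g. "c3 a4".
std::string hexDump(const std::string& bytes)
{
    std::string out;
    char buf[4];
    for (unsigned char c : bytes) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(c));
        if (!out.empty())
            out += ' ';
        out += buf;
    }
    return out;
}

// Reads a whole file in binary mode, so no newline or codec
// translation is applied to the bytes. Returns an empty string
// if the file cannot be opened.
std::string readBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling `hexDump(readBytes("/path/to/my/file.xml"))` and looking for `c3 a4 c3 bc c3 b6` in the result would indicate that the file itself holds valid UTF-8 and that only the viewer's decoding is off.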

1 Answer


Changing the QString to a QByteArray by using document.toByteArray() worked.

Thanks @talamaki!

Don Joe