I have a huge XML file that contains a list of software names and their versions. One of the entries contains the non-English character ó, as in the snippet below. If I open the XML file in a browser, nothing is displayed. But if I remove the ó, the entire XML is displayed.

<Item Software="SDK de comprobación de Visual Studio 2012 - esn" Version= "12.0.30501" />

This clearly means that the non-English character ó is causing the problem.

This is what my XML file looks like:

<?xml version="1.0" encoding="UTF-8"?>
<Softwares>
<Item Software="SDK de comprobación de Visual Studio 2012 - esn" Version= "12.0.30501" />
<Item Software="Notepad++" Version= "72.0.45" />
<Item Software="MyApp" Version= "7.8.45" />
..................................
</Softwares>

Does this have something to do with the encoding? I get the same result with no encoding declared, which I think defaults to UTF-8 anyway. I also tried declaring UTF-16, which doesn't work either. I'm pretty new to XML.

James Z
codeLover
  • Are you sure your file is encoded in UTF-8? It's valid XML otherwise. – Mark Tolonen Oct 21 '17 at 17:37
  • <?xml version="1.0" encoding="UTF-8"?> is the exact line I used at the beginning of the XML file. Is it correct? – codeLover Oct 22 '17 at 10:13
  • That line doesn't control the encoding, it only declares it. You have to save the file in that encoding as well. – Mark Tolonen Oct 22 '17 at 16:14
  • How can I save an XML file as UTF-8 programmatically in C++ using std::fstream? – codeLover Oct 22 '17 at 16:41
  • That is another question, but it has been asked many times on SO. https://stackoverflow.com/questions/4018384/stl-and-utf-8-file-input-output-how-to-do-it is one. – Mark Tolonen Oct 22 '17 at 17:19
  • XML has no concept of English. And ó is occasionally [used in English](https://en.wiktionary.org/wiki/Category:English_terms_spelled_with_%C3%93). It's a symptom of saying the file is UTF-8 when it's not. So you are lucky to have a nice indicator of that. One way of getting it right is to use an XML library that will match the declared encoding to the file encoding upon save. – Tom Blodget Oct 23 '17 at 00:26

1 Answer

There is nothing wrong with the XML you've posted, including the Unicode character, LATIN SMALL LETTER O WITH ACUTE, ó.
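
As the comments point out, the declaration only states the encoding; the bytes on disk have to match it. The following is a minimal C++ sketch of one way to check that, not a definitive implementation. The function names `is_valid_utf8` and `write_utf8_file` are hypothetical, and the idea that the file was saved in a single-byte encoding such as Latin-1 (where ó is the lone byte 0xF3) is an assumption, since the question does not say how the file was produced.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns true if `bytes` is well-formed UTF-8.
bool is_valid_utf8(const std::string& bytes) {
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;                      // continuation bytes expected
        if (c <= 0x7F) extra = 0;                   // ASCII
        else if (c >= 0xC2 && c <= 0xDF) extra = 1; // 2-byte sequence
        else if (c >= 0xE0 && c <= 0xEF) extra = 2; // 3-byte sequence
        else if (c >= 0xF0 && c <= 0xF4) extra = 3; // 4-byte sequence
        else return false;                          // stray or invalid lead byte
        if (i + extra >= n) return false;           // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80) return false;   // not a continuation byte
        }
        if (extra > 0) {
            // Reject overlong forms, surrogates, and values above U+10FFFF.
            const unsigned char b1 = static_cast<unsigned char>(bytes[i + 1]);
            if (c == 0xE0 && b1 < 0xA0) return false;
            if (c == 0xED && b1 > 0x9F) return false;
            if (c == 0xF0 && b1 < 0x90) return false;
            if (c == 0xF4 && b1 > 0x8F) return false;
        }
        i += extra + 1;
    }
    return true;
}

// Hypothetical helper: writes bytes that are already UTF-8 to `path` unchanged.
// Binary mode keeps the stream from translating any bytes on the way out.
bool write_utf8_file(const std::string& path, const std::string& utf8_text) {
    if (!is_valid_utf8(utf8_text)) return false;
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out << utf8_text;
    return static_cast<bool>(out);
}
```

With these helpers, `is_valid_utf8` accepts `"comprobaci\xC3\xB3" "n"` (ó as the two UTF-8 bytes 0xC3 0xB3) and rejects `"comprobaci\xF3" "n"` (ó as the single Latin-1 byte 0xF3). A file containing the latter while declaring encoding="UTF-8" is not well-formed, which would explain a browser refusing to display it.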

kjhughes