5

I have looked for code that does this in the tutorials and by googling, but had no luck.

If someone has used PugiXML, could you please help me out?

My main trouble is Unicode, otherwise the library is very easy to use.

Thanks in advance.

menjaraz
Wartin
  • I see that PugiXML currently assumes that all input is UTF-8. Are you having trouble parsing a UTF-8 file or are you trying to use PugiXML with wchar_t/wstring or ...? – ZoogieZork Dec 19 '09 at 02:48
  • Actually I am trying to use a wchar_t[] array to store data. – Wartin Dec 19 '09 at 03:02
  • Looks like the key is to use `pugi::as_utf8()` to pass wchar_t data to PugiXML and `pugi::as_utf16()` to get wchar_t data out. I assume that all char* strings used by PugiXML are UTF-8, but it's not clear from the documentation. – ZoogieZork Dec 19 '09 at 03:24
  • 5
    Just a side note: are you sure that a title starting with "Give me ..." is the best way to introduce a question? I find it rather irritating, but it could just be me – Remo.D Dec 19 '09 at 18:32
  • Might want to try asking questions rather than giving commands. We don't work for you. – nont Jan 04 '10 at 20:43
  • @Wartin: I do not know PugiXML, but I'm now facing a problem with XML parsing in C++. I chose RapidXML first and then Xerces. For a small app I would recommend RapidXML. – Andry Jan 14 '11 at 08:26
  • @Andry: RapidXML doesn't (necessarily) do what he wants either. To get it to parse to UTF-16, you have to provide the file data in UTF-16 format. So he'd have to check the file to see if it's UTF-16. – Nicol Bolas Apr 04 '12 at 23:39
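
To make the conversion discussed in the comments concrete, here is a minimal sketch in standard C++ of turning UTF-8 bytes (what a `char`-based parser hands back) into a `std::wstring`. The function name `utf8_to_wide` is hypothetical and this is not PugiXML's implementation; it only illustrates the kind of work a helper such as `pugi::as_utf16()` would do. It assumes well-formed input and does not reject overlong sequences or encoded surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of PugiXML: decode UTF-8 into a wide string.
// Where wchar_t is 16 bits (e.g. Windows), code points above U+FFFF become
// UTF-16 surrogate pairs; where it is 32 bits, each code point is one unit.
std::wstring utf8_to_wide(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)              { cp = lead;        extra = 0; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1F; extra = 1; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0F; extra = 2; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // Split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}
```

For example, the three UTF-8 bytes `"\xE2\x82\xAC"` (the euro sign, U+20AC) decode to a single wide character. The result can then be copied into a `wchar_t[]` array like the one the asker mentions.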

1 Answer

0

Open pugiconfig.hpp and uncomment PUGIXML_WCHAR_MODE.

Now you can use wchar_t and std::wstring instead of char and std::string respectively.

Quick Start is here: http://pugixml.googlecode.com/svn/tags/latest/docs/quickstart.html

junglecat
  • What does this have to do with Unicode? UTF-8 is a perfectly valid Unicode encoding. – Nicol Bolas Apr 04 '12 at 23:37
  • @Nicol Bolas It depends on the platform. On Windows you cannot fit a Unicode character into type char. It must be wchar_t. http://stackoverflow.com/questions/402283/stdwstring-vs-stdstring – junglecat Apr 04 '12 at 23:39
  • First, Unicode does not have characters; it has code points, code units, and graphemes. Second, UTF-8 **is a perfectly valid Unicode encoding**; what platform you're working on is *irrelevant* to that fact. UTF-8 doesn't stop working just because you're on Windows. Yes, to open a file whose name contains non-ASCII characters, you need to convert the name to UTF-16 on Windows. But that's a matter of the API interface, not of the nature of "Unicode". Microsoft does not dictate what "Unicode" means. – Nicol Bolas Apr 04 '12 at 23:41
  • @Nicol Bolas You cannot parse Unicode using PugiXML on Windows without PUGIXML_WCHAR_MODE. If you try, the result will be garbage. – junglecat Apr 04 '12 at 23:43
  • 2
    No, you cannot parse into UTF-16 without that. You will normally get UTF-8, which is *not garbage*. – Nicol Bolas Apr 04 '12 at 23:45