2

I have a very annoying problem that I have been trying to solve for hours. I'm using RapidXML with C++ to parse an XML file:

xml_document<> xmlin;
stringstream input; //initialized somewhere else
xmlin.clear();
xmlin.parse<0>(&(input.str()[0]));

cout << "input:" << input.str() << endl << endl;

xml_node<char> *firstnode = xmlin.first_node();
string s_type = firstnode->first_attribute("type")->value();
cout << "type: " << s_type << endl;

However, I get this on stdout:

input:<?xml version="1.0" encoding="utf-8"?><testxml command="testfunction" type="exclusive" />

type: exclusive" /> 

What could cause the s_type variable to print like this? It's very annoying, since I can't process the XML properly.

Daniel

3 Answers

1

Actually I found the solution.

A stringstream doesn't like having its contents modified (RapidXML does fast in-situ parsing, which means it modifies the contents of the array it is given).

However, I read in the docs that the string class does not like it either.

From the string::c_str documentation page:

the values in this array should not be modified in the program

But when I create a string from the stream, it works as expected:

xml_document<> xmlin;
stringstream input; //initialized somewhere else
string buffer = input.str(); // named copy that stays alive while xmlin is in use

xmlin.clear();
xmlin.parse<0>(&(buffer[0]));
Daniel
  • If RapidXML is modifying the buffer you pass, you should probably use vector rather than string. – markh44 Aug 02 '12 at 09:15
  • It modifies the buffer because it needs to add \0 chars at the end of each attribute and node text, so that these values can be retrieved correctly with the ->value() function. However, the string class allows its contents to be modified, if I'm not mistaken. – Daniel Aug 02 '12 at 09:45
  • See this question: http://stackoverflow.com/questions/1042940/writing-directly-to-stdstring-internal-buffers – markh44 Aug 02 '12 at 11:54
  • The real problem in your original code is that stringstream::str() returns a *copy* of the string, which you effectively take the address of and then discard. So RapidXML isn't modifying the stringstream buffer (as that's not possible), but instead some random memory that subsequently got overwritten... Your new code looks correct, though. – Roddy Aug 02 '12 at 11:55
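The point made in the comments above can be checked without RapidXML. The sketch below assumes a hypothetical stand-in for an in-situ parser, `terminate_first_value`, which only overwrites the closing quote of the first quoted value with '\0'; it is not RapidXML's implementation. The second function, `first_value_from_stream` (also a made-up name), keeps a named copy of `input.str()` alive while the buffer is being written to, which is what the corrected code in this answer does.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical stand-in for an in-situ parser (NOT RapidXML's code):
// overwrites the closing quote of the first quoted value with '\0'
// and returns a pointer to the start of that value, or nullptr.
char* terminate_first_value(char* text) {
    assert(text != nullptr);
    char* open = std::strchr(text, '"');
    if (!open) return nullptr;
    char* close = std::strchr(open + 1, '"');
    if (!close) return nullptr;
    *close = '\0';          // the in-place write that needs a mutable buffer
    return open + 1;
}

// Hypothetical helper: "parses" a named copy of the stream contents.
// The copy lives until the function returns, so the pointer into it
// stays valid for as long as it is used.
std::string first_value_from_stream(const std::stringstream& input) {
    std::string buffer = input.str();   // str() returns a copy
    if (buffer.empty()) return std::string();
    char* value = terminate_first_value(&buffer[0]);
    return value ? std::string(value) : std::string();
}
```

Because `str()` hands back a copy, the writes land in `buffer` and the stream contents stay as they were. In the original code the copy was an unnamed temporary, so the pointer passed to the parser was left dangling as soon as that statement finished.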
0

I think the problem is in the code you haven't shown... Start by trying this, using a hard-coded string. This works just fine for me...

xml_document<> xmlin;
// A writable array rather than a pointer to a string literal: the parser writes into this buffer
char input[] = "<?xml version=\"1.0\" encoding=\"utf-8\"?><testxml command=\"testfunction\" type=\"exclusive\" />";
xmlin.parse<0>(input);

xml_node<char> *firstnode = xmlin.first_node();
std::string s_type = firstnode->first_attribute("type")->value();
Roddy
0

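I would personally recommend this approach:

 xml_document<> doc;
 string string_to_parse;                                // XML text, filled in elsewhere
 char* buffer = new char[string_to_parse.size() + 1];   // +1 for the terminating '\0'
 strcpy(buffer, string_to_parse.c_str());               // needs <cstring>

 doc.parse<0>(buffer);
 // ... use doc here; its nodes point into buffer ...

 delete [] buffer;                                      // only once doc is no longer needed

This makes a non-const char array out of the string you want to parse. I have always found this way safer and more reliable.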

I used to do such crazy things as

 string string_to_parse;  
 doc.parse<0>(const_cast<char*>(string_to_parse.c_str()));

and it "worked" for a long time (until the day it didn't, when I needed to reuse the original string). Since RapidXML can modify the char array it is parsing, and since modifying a std::string through c_str() is not recommended, I always copy my string to a non-const char array and pass that to the parser. It may not be optimal and it uses additional memory, but it is reliable and I have never had any errors or problems with it to date. Your data will be parsed and the original string can be reused without fear of it having been modified.
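A variation on the same idea, following the suggestion in the comments on the other answer to use a vector, is to let std::vector<char> own the writable copy so no manual delete [] is needed. This is only a sketch; `make_parse_buffer` is a hypothetical helper name, not part of RapidXML.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: copies the text into a writable buffer and
// appends the terminating '\0' that a C-string parser expects.
std::vector<char> make_parse_buffer(const std::string& text) {
    std::vector<char> buffer(text.begin(), text.end());
    buffer.push_back('\0');
    assert(buffer.back() == '\0');
    return buffer;
}
```

The result could then be passed as doc.parse<0>(&buffer[0]). As with the new/delete version, the vector has to stay alive for as long as the document is used, because the parsed nodes point into it.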

mathematician1975