I'm creating a C++ program that uses RapidXML to read data from XML files. Because RapidXML's nodes are accessed through pointers into the XML content, the content of the file must stay alive for as long as the parsed document is used in later steps.

I had several files parsing successfully, but every parse began with the same ifstream and allocation boilerplate. So I came up with the idea of a function that takes the file path, pointers to the xml_document<> instance and so on, and returns the root node of the XML file. I then realised that I might hit scope problems with the dynamically allocated char array that holds the XML content to be parsed and later pointed to.

I then tried the following technique, where I allocated a char*, xml_document and xml_node and invoked the function to retrieve the root node.

The function:

bool readXML(const char * path, char * buffer, xml_document<> * doc, const char * rootNodeName, xml_node<> * root_node){
// Open file
    ifstream file(path, ios::in|ios::binary);
    if (!file.is_open()){   err(ERR_FILE_NOTFOUND,path);    return 0;   }
// Get length
    file.seekg(0,file.end);
    int length = file.tellg();
    file.seekg(0,file.beg);
// Allocate buffer
    buffer = new char [length+1];
    file.read(buffer,length);
    buffer[length] = '\0';
    file.close();
// Parse
    doc->parse<0>(buffer);

// Get root node
    root_node = doc->first_node(rootNodeName);
    if ( !root_node ){  err(ERR_FILE_INVALID,path); return 0; }
    return 1;
}

The code where I use the function (reading "Hersteller.xml" / initializing class):

bool loadHersteller(){ // v4
    // Declare
    char * bfr;
    xml_document<> doc;
    xml_node<> * rt_node;

    // Get root node
    if (!readXML(concatURL(PFAD_DATEN,"Hersteller.xml"), bfr, &doc, "Hersteller", rt_node)) return 0;

    // Extract
    if (!initHRST(rt_node)) return 0; // Works fine on its own (initializes a class)
    toConsoleHrst(); // Works fine on its own (prints data back to console)

    // Clean up
    delete[] bfr;
    doc.clear();
    return 1;
} // END loadHersteller()

What I get from that is a blank console followed by a crash that returns an integer. I am fairly certain that the problem is the scope or lifetime of the char array. The goal of this function is to do all the work of retrieving XML files and allocating the buffer for me, and to hand me the root node (which I'll pass right on to another function). As said, the char array needs to stay alive in the scope the function was invoked from, so that it can be accessed via the pointer structure built by the parser.

Sam
  • Try passing `buffer` as `char*&`. – dlf Jun 10 '14 at 19:18
  • Same thing for `root_node`. – dlf Jun 10 '14 at 19:20
  • Now it works! I changed the '*' in the prototype to '*&'. Is that all I needed to do to solve it properly? I'd much appreciate a precise description of how this was the problem in this case. Thanks dlf! – Sam Jun 10 '14 at 19:24
  • 1) Yes, you need to pass a reference to your buffer pointer, but why do you need a `bfr` parameter to begin with? Everything is local within that readXML function. 2) Your code has an issue if that readXML function returns and didn't allocate a buffer. You call `delete[]` on an uninitialized pointer. – PaulMcKenzie Jun 10 '14 at 19:25
  • @Sam - When you pass a parameter by value to a function, any changes to that parameter are only valid within the function. It doesn't matter if that parameter is an int, double, float, *a pointer*, a Widget, etc. To change the value and have it reflect back to the caller, you either pass a reference to that value, or a pointer to that value. So in this case, you pass a reference to the pointer, or a pointer to the pointer. – PaulMcKenzie Jun 10 '14 at 19:28
  • The thing is that RapidXML lets you access the xml nodes via pointers, so the base content of the file needs to stay alive for that. So I thought to myself if I declare the buffer at the place where the function goes to work, I'll be able to access the buffer from there through other functions too. – Sam Jun 10 '14 at 19:29

2 Answers

To fix this, pass both of your out parameters by reference: buffer as char*& (reference to a char pointer) and root_node as xml_node<>*&, rather than as plain pointers. Otherwise, readXML() receives copies of the two pointers, and whatever values you assign to those copies are lost when the function returns and the copies are destroyed.
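The difference can be seen in isolation, without RapidXML. The following is only a minimal sketch; `fillByValue` and `fillByReference` are hypothetical names standing in for readXML(), and the buffer is filled with dummy characters instead of file content.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// By value: `buffer` is a copy of the caller's pointer. Assigning to it
// changes only the copy, so the caller's pointer is left untouched.
// (Hypothetical helper, not part of the original code.)
bool fillByValue(char* buffer, std::size_t length) {
    buffer = new char[length + 1];
    std::memset(buffer, 'x', length);
    buffer[length] = '\0';
    delete[] buffer; // freed here because the caller can never reach it
    return true;
}

// By reference: `buffer` is an alias for the caller's pointer, so the
// assignment is visible after the function returns.
// (Hypothetical helper, not part of the original code.)
bool fillByReference(char*& buffer, std::size_t length) {
    buffer = new char[length + 1];
    std::memset(buffer, 'x', length);
    buffer[length] = '\0';
    return true; // the caller now owns the allocation and must delete[] it
}
```

After `fillByValue(p, 3)` the caller's `p` still holds whatever it held before the call, which in loadHersteller() is an uninitialized value. After `fillByReference(p, 3)` it points at the new array.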

Note: There is a potential memory leak in the code, because the delete[] statement won't be reached if either readXML() or initHRST() fails and the function returns early.
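One possible way to avoid that leak is to let the caller own the buffer as a std::vector<char>, which releases its memory on every return path. This is a sketch under assumptions, not the asker's code: `readFileToBuffer` is a hypothetical helper, and it covers only the file-reading half of readXML().

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file into `buffer` and appends the
// terminating '\0' that the parser expects. The caller owns `buffer`, so
// the memory is released automatically when the caller's vector goes out
// of scope, including on early returns.
bool readFileToBuffer(const std::string& path, std::vector<char>& buffer) {
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        return false; // nothing was allocated, nothing to clean up
    }
    buffer.assign(std::istreambuf_iterator<char>(file),
                  std::istreambuf_iterator<char>());
    buffer.push_back('\0');
    return true;
}
```

The caller would then pass `buffer.data()` to `doc.parse<0>()`. The vector must outlive every use of the document and must not be resized afterwards, since the nodes point into its storage.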

dlf

Instead of char *, declare the buffer parameter as char ** (and, likewise, root_node as xml_node<> **). Inside the function, dereference the parameter with * whenever you assign to it. For example, *buffer = new char[length + 1].

When passing a variable to the function, prefix it with &:

if (!readXML(..., &bfr, ...))
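As a minimal, self-contained sketch of this pointer-to-pointer variant (`fillViaPointer` is a hypothetical stand-in for readXML(), and the buffer is filled with dummy characters rather than file content):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// `buffer` holds the address of the caller's pointer. Writing through
// *buffer changes the caller's pointer itself.
// (Hypothetical helper, not part of the original code.)
bool fillViaPointer(char** buffer, std::size_t length) {
    if (buffer == nullptr) {
        return false; // unlike a reference, a pointer argument can be null
    }
    *buffer = new char[length + 1];
    std::memset(*buffer, 'y', length);
    (*buffer)[length] = '\0'; // parentheses needed: [] binds tighter than *
    return true;              // the caller must delete[] the array
}
```

It is called as `fillViaPointer(&p, 2)` with `char* p = nullptr;`. The explicit `&` at the call site makes it visible that `p` may be modified, at the cost of a null check and extra dereferences inside the function.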

gcvt
  • 2 downvotes? `**` may require a bit more mental gymnastics than `*&`, but it's still correct... – dlf Jun 10 '14 at 19:55