I am reading a large XML file (about 1 GB), extracting information from its nodes, and assigning it to the fields of an object of my class.
Assume the XML file contains a large amount of information about workers, such as ID, location, and gender. Not every worker has every attribute: one worker might have only an ID and a location, while another has only an ID and a gender, like the following:
<workers>
<row Id="1" Location="Bos" Gender="M" />
<row Id="2" Gender="F" />
<row Id="3" Location="Cal" />
....
My naive approach is to open the file with ifstream, read it line by line with getline(), extract the information into the string fields of an object, and then save the object in a container. This works, but it uses about 1 GB of memory.
I also tried using Boost to read the XML file, but when I used worker.Gender = child.second.get<string>("<xmlattr>.Gender"); it did not work for every node. Some workers have no Gender attribute, and this call fails with an error whenever the worker's row has no Gender attribute.
So my question is: how can I extract the information from this XML file with low memory usage? Is it possible to get it down to about 100 MB, and if so, how? Also, why is the memory not released after getline() moves on to the next line of text?