
I know this question may already have been asked somewhere on Stack Overflow, but having searched, I found no matching solution.

I need to create and maintain/populate an extremely large XML tree (approx. 2 GB on disk). According to the requirements, I have to:

  1. Apply a lot of transformation logic to the XML tree's nodes.
  2. Create a new tree, add the transformed nodes to it, then save this new tree to a file.

The first thing that came to mind when I started working on this: I don't have enough memory to hold the file, and even if I did, performance could be a serious problem with the whole tree loaded into memory.

With that in mind, I used the technique of streaming XML fragments to read data from the original file without loading all of it into memory. But I am stuck on creating the new tree. MSDN does not seem to have any documentation that deals with this problem. Any ideas?
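For illustration, here is a minimal sketch of what a stream-in, stream-out pipeline could look like, using `XmlReader` with `XNode.ReadFrom` on the read side and `XmlWriter` on the write side. It is only a sketch under assumptions: the element names `root`, `item` and `transformed`, and the helpers `StreamElements`, `Transform` and `Run`, are hypothetical placeholders, and the one-line `Transform` stands in for the real (much more complex) logic.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

static class StreamingTransform
{
    // Lazily yields each element with the given name (hypothetical helper).
    // Only one fragment is materialized as an XElement at a time.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, string elementName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
            {
                // XNode.ReadFrom consumes the whole element and leaves the reader
                // on the node that follows it, so no extra Read() is needed here.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }

    // Placeholder transformation (assumption): renames the element and
    // copies its attributes and child nodes unchanged.
    static XElement Transform(XElement source)
    {
        return new XElement("transformed", source.Attributes(), source.Nodes());
    }

    // Reads <item> fragments from inputPath and writes the transformed
    // fragments under a new <root> element in outputPath.
    public static void Run(string inputPath, string outputPath)
    {
        var settings = new XmlWriterSettings { Indent = true };
        using (var reader = XmlReader.Create(inputPath))
        using (var writer = XmlWriter.Create(outputPath, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("root");
            foreach (var item in StreamElements(reader, "item"))
            {
                // The new "tree" is never held in memory; each fragment is
                // written out as soon as it has been transformed.
                Transform(item).WriteTo(writer);
            }
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

With this shape, memory use is bounded by the largest single fragment rather than by the file size. It only fits transformations that can work on one fragment at a time; a stage that needs results from the whole document would still need an intermediate file or some other store between passes.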

Thanks in advance.

RyanB
  • possible duplicate of [XSLT transformation on Large XML files with C#](http://stackoverflow.com/questions/3101048/xslt-transformation-on-large-xml-files-with-c-sharp) – Larry Nov 27 '14 at 10:07
  • @Larry: The OP isn't necessarily doing an XSLT transform here. If it's coded, it may be workable with streams and fragments. – Jon Egerton Nov 27 '14 at 10:13
  • @JonEgerton You are right, I missed this. I retract my close vote. – Larry Nov 27 '14 at 10:16
  • I have found this MSDN link that shows streams and yields for processing large XML files: http://msdn.microsoft.com/en-us/library/bb387013.aspx. Hope it gives some pointers. – Larry Nov 27 '14 at 10:26
  • Even my laptop has 16G of RAM, and working in memory is insanely fast. Unless you do a lot of searches in an un-streamable way, in that case XML is the wrong format to begin with, and you should use custom structures like trees. Anyway, if you insist, you can use raw XmlReader and XmlWriter, basically you never create a tree, just read and write nodes one by one. – fejesjoco Nov 27 '14 at 10:28
  • @Larry: I had already read the link you mentioned. Unfortunately, the transformation logic is quite complex and has several stages; an earlier stage may produce results that the next stage works on. In that case, a lot of I/O would be needed for the transformation to work this way. – RyanB Nov 27 '14 at 11:04
  • @fejesjoco: I thought about a custom structure too, but that would also be a memory-consuming approach. As for XmlWriter, my transformation logic is not simple enough for it to be useful. I'm thinking about MemoryMappedFile, but I don't know if it helps in my case. – RyanB Nov 27 '14 at 11:11

0 Answers