Dim docu As New XmlDocument()
docu.Load("C:\bigfile.xml") ' parses the whole file into an in-memory DOM

Dim tempNode As XmlNode
tempNode = docu.SelectSingleNode("/header/type")

someArray(someIndex) = tempNode.InnerText

' ...do something more...

I am using XmlDocument to load a huge XML document (100~300 MB).

When I open the document and read it as a string, my application uses about 900 MB of RAM. Why does this happen, and how can I prevent it?

Note that XmlDocument does not even have a Dispose() method to release what it allocated. Although I need the whole string of the huge XML file in a later part of the app, the InnerText of /header/type is only a single word.


More of the source:

Private Sub setInfo(ByVal notePath As String)

    ' Loads the entire file into memory as an in-memory tree.
    Dim _NOTE As XDocument = XDocument.Load(notePath)

    ' Value of the first <title> descendant (Nothing if there is none).
    Dim title As String = _NOTE...<title>.Value

    If String.IsNullOrEmpty(title) Then
        lvlist.Items.Add("No Title")
    Else
        lvlist.Items.Add(title)
    End If

    ' Number of <group> descendants, added as a sub-item of the row just created.
    lvlist.Items(lvlist.Items.Count - 1).SubItems.Add(_NOTE...<group>.Count().ToString())

End Sub

It reads the XML document, counts tags, and retrieves a string value. That's all. Once I have those values, _NOTE (the XDocument) is no longer needed.

klados
  • Did you try using XDocument instead? – Victor Zakharov Feb 20 '14 at 18:46
  • @Neolisk Can XDocument free up the memory when its work is done? – klados Feb 20 '14 at 19:48
  • Any .NET object is supposed to do this by default, just make sure you control the variable scope - it directly impacts the lifetime of your object, and as a result - your memory consumption. My hope was that XDocument may be consuming less memory than XmlDocument, because XDocument is relatively new to .NET (hoping that newer is better in this case). – Victor Zakharov Feb 20 '14 at 19:50
  • I tried the "(From node In xdocu... Select node).Value" approach and it returns the right data, but memory consumption did not improve... I think this is because both XDocument and XmlDocument read the whole file and load it into memory at once. – klados Feb 20 '14 at 20:05
  • Yes, they both load the whole thing into memory. Can you show more of your code? So we could get the scope of your `document` variable (whether it's XmlDocument or XDocument), to estimate its lifetime. – Victor Zakharov Feb 20 '14 at 20:06
  • @Neolisk I wrote more code. Thanks! – klados Feb 20 '14 at 20:27
  • Interesting... so you are saying that even though `_NOTE` is declared inside the `Sub`, .NET does not free memory upon finishing execution of this `Sub` (give several seconds to allow garbage collection to kick in)? – Victor Zakharov Feb 20 '14 at 21:44
  • As suggested [here](http://social.msdn.microsoft.com/Forums/vstudio/en-US/2026a448-0dae-489c-9b33-954fad578c92/dispose-xdocument?forum=csharpgeneral), `When the reference goes out of scope and is not hold anymore, the GC will take over and clean/compact managed memory as needed.` How much memory does your PC have? Maybe you have enough, and that's why GC is not kicking? – Victor Zakharov Feb 20 '14 at 21:49
  • Also [try putting XmlDocument inside a Using statement](http://bytes.com/topic/net/answers/176972-xmldocument-load-method-not-releasing-memory) - see if it helps. This one below is unlikely to be relevant to your question, but I'll leave it here for future visitors. [.NET Memory not freeing up even after exiting the function (C#)](http://stackoverflow.com/questions/5847146/net-memory-not-freeing-up-even-after-exiting-the-function). – Victor Zakharov Feb 20 '14 at 21:50
  • @Neolisk Yes, even though _NOTE is only used in the Sub, no apparent cleanup takes place. My PC has 8 GB of RAM, which might be the reason, but I still have to deal with the memory problem because loading a 300 MB XML file takes about 2~4 times as much memory. – klados Feb 21 '14 at 05:16
  • Also, a Using ... End Using statement won't work because neither XmlDocument nor XDocument implements IDisposable; it gives a compile error. Thanks for your concern. – klados Feb 21 '14 at 05:20

2 Answers


XmlReader will probably solve your need. From MSDN:

Represents a reader that provides fast, noncached, forward-only access to XML data.
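As a rough illustration of the forward-only approach, here is a minimal sketch. It assumes the first `<type>` element in document order is the `/header/type` element from the question and that it contains only text; `ReadFirstType` is a hypothetical helper name, not part of any library.

```vbnet
Imports System.Xml

Module XmlReaderSketch

    ' Hypothetical helper: returns the text of the first <type> element,
    ' or Nothing if the document contains no such element.
    ' Assumption: that first <type> is the /header/type element and holds only text.
    Function ReadFirstType(ByVal path As String) As String
        Using reader As XmlReader = XmlReader.Create(path)
            ' Streams forward through the file without building a tree.
            If reader.ReadToFollowing("type") Then
                Return reader.ReadElementContentAsString()
            End If
        End Using
        Return Nothing
    End Function

End Module
```

A call such as `ReadFirstType("C:\bigfile.xml")` stops reading as soon as the element is found, so the rest of the file is never parsed.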

Victor Zakharov
  • Good answer, however, it's worth mentioning that, though much less efficient, if you really want to use `XmlDocument`, you could just set your `XmlDocument` variable to `Nothing` once you're done reading the desired element value, then you can force a garbage collection by calling `GC.Collect`. That will cause it to immediately free up that memory that was used up by the `XmlDocument` object. Using `XmlReader` would be preferable, though, if the file really is that big and you really only need one element value. – Steven Doggart Feb 20 '14 at 19:00
  • @StevenDoggart: Thank you, Steven. Your comment deserves to be promoted to an answer. :) – Victor Zakharov Feb 20 '14 at 19:17
  • @StevenDoggart Actually, I've tried that. I ended the code by setting docu = Nothing and calling GC.Collect(). However, it doesn't free up the memory... Is XmlReader very different from XmlDocument? I mean, can XmlReader retrieve nodes, select a single node, and so on? Thanks! – klados Feb 20 '14 at 19:37
  • @klados: See [How to: Parse XML with XmlReader](http://msdn.microsoft.com/en-us/library/cc189056(v=vs.95).aspx). Short answer - there is no 1-to-1 conversion approach. The benefits - you will have greater control with XmlReader, thus avoiding unnecessary reads or memory consumption. Convenience is usually the opposite of control. Pick one. – Victor Zakharov Feb 20 '14 at 19:48
  • @Neolisk So if I want to retrieve one tag from among more than 1000 specific tags, should I re-open the XML file and read it from the very beginning each time I want to get one? – klados Feb 20 '14 at 20:31
  • @klados: Pretty much, yes. Read from top to bottom, pick anything of interest. Save that for further processing. – Victor Zakharov Feb 20 '14 at 21:42
  • I am converting from XmlDocument to XmlReader. It's quite tedious and not very intuitive, but apparently effective for memory management. – klados Feb 21 '14 at 09:17
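The "read from top to bottom, pick anything of interest" advice in the comments above could look roughly like the sketch below, which gathers the two values used by the question's `setInfo` routine in one pass. `ScanNote` is a hypothetical name, and the sketch assumes `<title>` holds plain text with no nested elements.

```vbnet
Imports System.Xml

Module SinglePassSketch

    ' Hypothetical helper: one forward-only pass that records the text of the
    ' first <title> element and counts every <group> element.
    ' Assumption: <title> contains only text (no child elements).
    Sub ScanNote(ByVal path As String, ByRef title As String, ByRef groupCount As Integer)
        title = Nothing
        groupCount = 0
        Dim titleSeen As Boolean = False
        Dim inTitle As Boolean = False

        Using reader As XmlReader = XmlReader.Create(path)
            While reader.Read()
                Select Case reader.NodeType
                    Case XmlNodeType.Element
                        If reader.Name = "group" Then groupCount += 1
                        If reader.Name = "title" AndAlso Not titleSeen Then
                            titleSeen = True
                            title = ""
                            ' An empty element (<title/>) has no text to collect.
                            inTitle = Not reader.IsEmptyElement
                        Else
                            inTitle = False
                        End If
                    Case XmlNodeType.Text, XmlNodeType.CDATA
                        If inTitle Then title &= reader.Value
                    Case XmlNodeType.EndElement
                        inTitle = False
                End Select
            End While
        End Using
    End Sub

End Module
```

Collecting everything needed in a single pass avoids re-opening the file for each value; only the saved results stay in memory.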

Right, XmlDocument does not have a Dispose method. It is loaded into memory and follows the normal object life cycle. So if you create the object inside a Function and return only the string you want, it becomes eligible for garbage collection once it goes out of scope when the Function returns.
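A minimal sketch of that idea, using the `/header/type` path from the question; `GetHeaderType` is a hypothetical helper name.

```vbnet
Imports System.Xml

Module ScopedLoadSketch

    ' Hypothetical helper: the XmlDocument is local to this function, so nothing
    ' references it after the function returns and the garbage collector may
    ' reclaim it whenever it next runs.
    Function GetHeaderType(ByVal path As String) As String
        Dim doc As New XmlDocument()
        doc.Load(path) ' the whole file is still parsed into memory here

        Dim node As XmlNode = doc.SelectSingleNode("/header/type")
        If node Is Nothing Then
            Return Nothing
        End If

        ' Only the returned string outlives the call.
        Return node.InnerText
    End Function

End Module
```

This limits how long the document stays reachable, but it does not lower peak memory during the call, and the process's reported memory may not drop immediately after the function returns.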

OneFineDay