The most likely reason you are getting a OutOfMemoryException
in ReadOuterXml()
is that you are trying to read in a substantial portion of the 1 GB XML document into a string, and are hitting the Maximum string length in .Net.
So, don't do that. Instead load directly from the XmlReader
using XDocument.Load()
with XmlReader.ReadSubtree()
:
using (var xr = XmlReader.Create(fileName))
{
while (xr.Read() && success)
{
if (xr.NodeType != XmlNodeType.Element)
continue;
switch (xr.Name)
{
case "A":
{
// ReadSubtree() positions the reader at the EndElement of the element read, so the
// next call to Read() moves to the next node.
using (var subReader = xr.ReadSubtree())
{
var doc = XDocument.Load(subReader);
GetDetails(doc);
}
}
break;
}
}
}
And then in GetDetails()
do:
private int GetDetails(XDocument rootDocument)
{
var xnodes = rootDocument.XPathSelectElements("//A/B").ToList();
//Then working on list of nodes
return xnodes.Count;
}
Not only will this use less memory, it will also be more performant. ReadOuterXml()
uses a temporary XmlWriter
to copy the XML in the input stream to an output StringWriter
(which you then parse a second time). This version of the algorithm completely skips this extra work. It also avoids creating strings large enough to go on the large object heap which can cause additional performance issues.
If this is still using too much memory you will need to implement SAX-like parsing for your XML where you only load one element <B>
at a time. First, introduce the following extension method:
public static partial class XmlReaderExtensions
{
public static IEnumerable<XElement> WalkXmlElements(this XmlReader xmlReader, Predicate<Stack<XName>> filter)
{
Stack<XName> names = new Stack<XName>();
while (xmlReader.Read())
{
if (xmlReader.NodeType == XmlNodeType.Element)
{
names.Push(XName.Get(xmlReader.LocalName, xmlReader.NamespaceURI));
if (filter(names))
{
using (var subReader = xmlReader.ReadSubtree())
{
yield return XElement.Load(subReader);
}
}
}
if ((xmlReader.NodeType == XmlNodeType.Element && xmlReader.IsEmptyElement)
|| xmlReader.NodeType == XmlNodeType.EndElement)
{
names.Pop();
}
}
}
}
Then, use it as follows:
using (var xr = XmlReader.Create(fileName))
{
Predicate<Stack<XName>> filter =
(stack) => stack.Peek().LocalName == "B" && stack.Count > 1 && stack.ElementAt(1).LocalName == "A";
foreach (var element in xr.WalkXmlElements(filter))
{
//Then working on the specific node.
}
}