
I am reading very large XML files (400+ MB, about 15 MB zipped) that are downloaded and unzipped into a MemoryStream. I run into a System.OutOfMemoryException every time. I tried StreamReader.ReadToEnd() to read the content into a string, but that fails as well.

After searching around, I switched to XmlReader and load the result into an XElement, as suggested by other posts here. However, I am still running into the OutOfMemoryException.

string downloadUrl = requestStatus.ReportDownloadUrl;

//create a network stream to the report Url
using (Stream reportZipStream = new WebClient().OpenRead(downloadUrl)) //download the file
using (Stream reportZipMemoryStream = new MemoryStream()) //initialize zip memorystream
using (Stream reportXmlStream = new MemoryStream()) //load xml file to memorystream for manipulation
{
    //copy zip file to memorystream
    reportZipStream.CopyTo(reportZipMemoryStream);
    reportZipMemoryStream.Seek(0, SeekOrigin.Begin);

    //unzip to Xml memory stream
    using (ZipFile reportZip = ZipFile.Read(reportZipMemoryStream))
    {
        reportZip[0].Extract(reportXmlStream);
    }

    reportXmlStream.Seek(0, SeekOrigin.Begin);

    Dictionary<string, object> parsedXml = default(Dictionary<string, object>);

    //read and parse
    if (reportXmlStream.CanRead && reportXmlStream.Length > 0)
    {
        XmlDataParser parser = new XmlDataParser();
        using (XmlReader reader = XmlReader.Create(reportXmlStream))
        {
            XElement elem = XElement.Load(reader); //out of memory error here
            parsedXml = parser.doParse(elem);
        }
    }
}
Kyle

1 Answer


You should not use a DOM parser (such as XElement) for 400 MB XML files, because it builds the whole document tree in memory at once. Use a SAX-style streaming parser instead.
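.NET has no built-in SAX parser; the closest equivalent is the forward-only XmlReader used in pull mode. Below is a minimal sketch of that approach, not a drop-in fix. The class name StreamingXml, the method StreamElements, and the idea that the report consists of many repeated child elements with a known name are assumptions for illustration; the real report layout is not shown in the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the question's code.
public static class StreamingXml
{
    // Lazily yields each element named elementName, one at a time,
    // so only a single element subtree is materialized in memory.
    public static IEnumerable<XElement> StreamElements(Stream input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    // XNode.ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

With something like this, each yielded XElement could be handed to the parser individually (for example, a loop over StreamingXml.StreamElements(reportXmlStream, "row"), where "row" is a placeholder name), provided XmlDataParser can work on fragments rather than on the whole document. Memory use drops further if the reader is given the zip entry's decompression stream directly instead of a 400 MB MemoryStream copy of the extracted file.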

Perfect28