2

I'm trying to serialize an object into a string. The first problem I encountered was that the XmlSerializer.Serialize method threw an OutOfMemoryException. I tried all kinds of solutions and none worked, so I serialized it to a file instead. The file is about 300 MB (32-bit process, 8 GB RAM), and trying to read it with StreamReader.ReadToEnd also results in an OutOfMemoryException.

Using the XML format and loading it into a string are not optional; both are a must. The question is:

  • Is there any reason a 300 MB file would throw that kind of exception? 300 MB is not really a large file.

Serialization code that fails on .Serialize

using (MemoryStream ms = new MemoryStream())
{
    var type = obj.GetType();
    if (!serializers.ContainsKey(type))
        serializers.Add(type,new XmlSerializer(type));

   // new XmlSerializer(obj.GetType()).Serialize(ms, obj);
    serializers[type].Serialize(ms, obj);
    ms.Position = 0;

    using (StreamReader sr = new StreamReader(ms))
    {
        return sr.ReadToEnd();
    }                  
}

Serialization and read from file that fails on ReadToEnd

var type = obj.GetType();
if (!serializers.ContainsKey(type))
    serializers.Add(type, new XmlSerializer(type));

FileStream fs = new FileStream(@"c:/temp.xml", FileMode.Create);
TextWriter writer = new StreamWriter(fs, new UTF8Encoding());
serializers[type].Serialize(writer, obj);
writer.Close();
fs.Close();
using (StreamReader sr = new StreamReader(@"c:/temp.xml"))
{
   return sr.ReadToEnd();
}

The object is large because it's an elaborate configuration object for the entire system...

UPDATE: Reading the file in chunks (8*1024 chars) loads the file into a StringBuilder, but the builder fails on ToString(). I'm starting to think there is no way to do this, which is really strange.

Amorphis
  • 1
    Share some code, how do you serialize, or how do you read the file. Stack trace might also be helpful. – Michael Dec 16 '14 at 12:57
  • 1
    I am curious to know how that object size become 300 MB. Also, do you expect concurrent users? – Amit Dec 16 '14 at 12:57
  • If the XML files are that big and .NET has problems loading them, I guess their structure is relatively flat, so that the other software can handle them and write to those files line by line or in chunks, etc. Can that be true? If so, you could parse the files manually line by line... would that be an option? – t3chb0t Dec 16 '14 at 13:22
  • The question is why I get an out-of-memory exception when trying to load a 300 MB file into a string. It doesn't matter what the contents of that string are. I know .NET has problems with big XML serialization/deserialization, and that is what I am trying to avoid. – Amorphis Dec 16 '14 at 13:29

2 Answers

4

Yeah, if you're using 32-bit, trying to load 300MB in one chunk is going to be awkward, especially when using approaches that don't know the final size (number of characters, not bytes) in advance, thus have to keep doubling an internal buffer. And that is just when processing the string! It then needs to rip that into a DOM, which can often take several times as much space as the underlying data. And finally, you need to deserialize it into the actual objects, usually taking about the same again.

So - indeed, trying to do this in 32-bit will be tough.
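To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the file is mostly single-byte UTF-8 text (so roughly one char per byte) and that the builder copies its contents into a fresh string on ToString(), as the chunked StringBuilder in .NET 4 and later does. The class and method names are made up for illustration.

```csharp
using System;

static class StringSizeEstimate
{
    // .NET strings are UTF-16: every char occupies 2 bytes.
    public static long Utf16Bytes(long charCount)
    {
        return charCount * sizeof(char);
    }

    static void Main()
    {
        long fileBytes = 300L * 1024 * 1024;  // the 300 MB file from the question
        long chars = fileBytes;               // assumption: ~1 char per byte (mostly ASCII XML)
        long stringBytes = Utf16Bytes(chars);

        // Final string alone, as one contiguous block.
        Console.WriteLine("String payload: {0} MB", stringBytes / (1024 * 1024));

        // Assumed peak during ToString(): builder contents plus the new string.
        Console.WriteLine("Approximate peak: {0} MB", 2 * stringBytes / (1024 * 1024));
    }
}
```

Under those assumptions the final string needs about 600 MB in a single contiguous block, and the peak while it is being produced is roughly twice that. A 32-bit process normally has 2 GB of user address space, already fragmented by loaded DLLs and other allocations, so a free block of that size is often unavailable even when total free memory looks sufficient.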

The first thing to try is: don't use ReadToEnd - just use XmlReader.Create with either the file path or the FileStream, and let XmlReader worry about how to load the data. Don't load the contents for it.
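A minimal sketch of that suggestion, under the assumption that the consumer can accept an XmlReader or XmlWriter instead of a string. The `StreamingXml` class and its `Save` and `Load` helpers are hypothetical names; only `XmlReader.Create`, `XmlWriter.Create`, and the `XmlSerializer` overloads taking a reader or writer are real framework calls.

```csharp
using System;
using System.Xml;
using System.Xml.Serialization;

static class StreamingXml
{
    // Hypothetical helper: writes the object straight to the file,
    // with no intermediate MemoryStream or string.
    public static void Save(string path, object obj)
    {
        var serializer = new XmlSerializer(obj.GetType());
        using (XmlWriter writer = XmlWriter.Create(path))
        {
            serializer.Serialize(writer, obj);
        }
    }

    // Hypothetical helper: lets XmlReader pull the data from the file
    // incrementally instead of loading it with ReadToEnd first.
    public static object Load(string path, Type type)
    {
        var serializer = new XmlSerializer(type);
        using (XmlReader reader = XmlReader.Create(path))
        {
            return serializer.Deserialize(reader);
        }
    }
}
```

In real code the serializer instances would come from a cache such as the `serializers` dictionary in the question rather than being created on every call.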

After that... the next thing to do is: don't limit it to 32-bit.

Well, you could try enabling the 3GB switch, but... moving to 64-bit would be preferable.

Aside: xml is not a good choice for large volumes of data.

Marc Gravell
  • I'll try your recommendation with XmlReader. About the rest: 1. 32-bit is a must because the system uses 32-bit libraries that I have no control over. 2. The data is in XML format and I have no control over it; converting it to JSON doesn't make sense. – Amorphis Dec 16 '14 at 13:04
  • 2
    Actually the main point here is Marc's remark on XML not being a good choice for big data files: too wordy. JSON or even CSV (if flat enough) would be better. – Askolein Dec 16 '14 at 13:06
  • You seem to be moving off track. XML format and text are a must, not an option. – Amorphis Dec 16 '14 at 13:09
  • 1
    @Amorphis k, but here's the thing; it simply isn't a good choice. If a bad choice is a "must", then whoever designed the architecture needs a dry slap... – Marc Gravell Dec 16 '14 at 17:34
  • Most of us don't work in an environment where we make all the decisions and all the designs; most of us have to work with limitations inherited from previous developers and bad choices made in different business units. :) This is life. We need to learn how to work with it. – Amorphis Dec 17 '14 at 07:57
  • @Amorphis k; what is the data here? if it is some kind of list, there are ways of using xml sub-readers to read the elements individually - divide and conquer, etc – Marc Gravell Dec 17 '14 at 07:58
  • It's worse... It's a global object that holds complex child objects and so on. We'll probably create a list of simpler objects and work with that when there is a lot of data. – Amorphis Dec 17 '14 at 08:02
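
For the list-shaped case mentioned in the comments above, the sub-reader idea could look roughly like this. It is only a sketch: the repeated element name and the item type are placeholders, not anything taken from the question, and `SubtreeReading.ReadItems` is a hypothetical helper built on `XmlReader.ReadSubtree`.

```csharp
using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;

static class SubtreeReading
{
    // Yields one deserialized element at a time. elementName is an assumed
    // repeated child element; T is whatever type that element maps to.
    public static IEnumerable<T> ReadItems<T>(string path, string elementName)
    {
        var serializer = new XmlSerializer(typeof(T), new XmlRootAttribute(elementName));
        using (XmlReader reader = XmlReader.Create(path))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // The sub-reader only sees this one element and its children.
                    using (XmlReader sub = reader.ReadSubtree())
                    {
                        yield return (T)serializer.Deserialize(sub);
                    }
                }
                // Advance past the element just handled, or past any other node.
                reader.Read();
            }
        }
    }
}
```

Only one item is held in memory at a time, so the peak no longer depends on the size of the whole document. It does not help when the data is a single deeply nested object, as the last comment describes.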
1

Exploring the source code for StreamReader.ReadToEnd reveals that it internally makes use of the StringBuilder.Append method:

public override String ReadToEnd()
{
    if (stream == null)
        __Error.ReaderClosed();

#if FEATURE_ASYNC_IO
    CheckAsyncTaskInProgress();
#endif

    // Call ReadBuffer, then pull data out of charBuffer.
    StringBuilder sb = new StringBuilder(charLen - charPos);
    do {
        sb.Append(charBuffer, charPos, charLen - charPos);
        charPos = charLen;  // Note we consumed these characters
        ReadBuffer();
    } while (charLen > 0);
    return sb.ToString();
}

which is most probably what throws this exception, and that leads to this question/answer: interesting OutOfMemoryException with StringBuilder al

t3chb0t
  • 1
    It's interesting, but it looks like there is no way to load a medium-sized file into a string in one chunk. The stuff about bad design is BS; I can recreate it with a single-button form that does only that. – Amorphis Dec 16 '14 at 13:51