
I use the following methods to save application data to a file (serialized, then encrypted) and to load it back from that file (decrypted, then deserialized).

private void SaveClassToFile(string fileAddress, string password, object classToSave)
{
    const int ivSaltLength = 16; // length of both the salt and the IV, in bytes
    byte[] salt = CreateSalt();
    byte[] iv = CreateIV();
    byte[] codedClass;

    using (var memoryStream = new MemoryStream())
    {
        BinaryFormatter binaryFormatter = new BinaryFormatter();
        binaryFormatter.Serialize(memoryStream, classToSave);
        codedClass = new byte[Convert.ToInt32(memoryStream.Length)];
        memoryStream.Seek(0, SeekOrigin.Begin);

        if (memoryStream.Read(codedClass, 0, codedClass.Length) != codedClass.Length)
        {throw new Exception("failed to read from memory stream"); }
    }

    using (SpecialCoderDecoder specialCoder = new SpecialCoderDecoder(SpecialCoderDecoder.Type.Coder, password, salt, iv))
    { specialCoder.Code(codedClass); }

    using (FileStream streamWriter = new FileStream(fileAddress, FileMode.CreateNew))
    using (BinaryWriter binaryWriter = new BinaryWriter(streamWriter))
    {
        binaryWriter.Write(salt);
        binaryWriter.Write(iv);
        binaryWriter.Write(codedClass);
    }
}

private object LoadClassFromFile(string fileAddress, string password)
{
    const int ivSaltLength = 16;
    byte[] salt;
    byte[] iv;
    byte[] codedClass;
    int codedClassLengthToRead;
    object result;
    FileInfo fileInfo = new FileInfo(fileAddress);

    using (FileStream fileStream = new FileStream(fileAddress, FileMode.Open))
    using (BinaryReader binaryReader = new BinaryReader(fileStream))
    {
        salt = binaryReader.ReadBytes(ivSaltLength);
        iv = binaryReader.ReadBytes(ivSaltLength);
        codedClassLengthToRead = Convert.ToInt32(fileInfo.Length) - (2 * ivSaltLength);
        codedClass = binaryReader.ReadBytes(codedClassLengthToRead);
    }

    using (SpecialCoderDecoder specialDecoder = new SpecialCoderDecoder(SpecialCoderDecoder.Type.Decoder, password, salt, iv))
    { specialDecoder.Decode(codedClass); }

    using (MemoryStream memoryStream = new MemoryStream())
    {
        BinaryFormatter binaryFormatter = new BinaryFormatter();
        memoryStream.Write(codedClass, 0, codedClass.Length);

        memoryStream.Seek(0, SeekOrigin.Begin);
        result = binaryFormatter.Deserialize(memoryStream);
    }

    return result;
}

If the application starts with no data and I add about 100 MB of data to it (based on Task Manager) and then save it, Task Manager shows the application using about 200-400 MB after loading that data back!

To encapsulate the application's classes into one class for use with these methods, I use a class like:

public class BigClass
{
    public  ClassA classA;
    public ClassB classB;

    public BigClass(ClassA a, ClassB b)
    {
        classA = a;
        classB = b;
    }
}

Each of ClassA and ClassB (the classes that should be saved/loaded) looks like:

public class ClassA
{
    List<ClassASub> list = new List<ClassASub>();

    //some variables...

    //some methodes

    private class ClassASub
    {
        int intValue;
        long longValue;
        string stringValue;
        Image image;

        //some simple methodes....
    }
}

I am not talking about the RAM used during the serialization/deserialization process; I mean the RAM used after that, when only the application data should exist.

  • I would suggest this blog post by Raymond Chen: [Everybody thinks about garbage collection the wrong way](https://blogs.msdn.microsoft.com/oldnewthing/20100809-00/?p=13203) – asawyer Jun 09 '17 at 13:44
  • I see at least one Image property in your ClassA that should be disposed once you are done with your object :-) – Laurent Lequenne Jun 09 '17 at 14:42

2 Answers


You're loading the data into memory as an array (codedClass). This array, by your indication, is presumably around 100MB, which is more than enough to ensure that it gets allocated on the Large Object Heap.

Now: GC is designed to optimize your overall system performance; it is not designed to aggressively reclaim memory constantly, for multiple reasons:

  • it is unnecessary overhead if you have lots of memory free in your system (you're not under memory pressure) and there's no specific need to collect
  • some data is more expensive to collect than others; the most expensive of these is the Large Object Heap, so it goes to the back of the queue; other memory is released first
  • even if data was freed, it isn't necessarily advantageous to release those pages back to the OS; the process could validly decide to hold onto them to avoid constantly asking the OS for memory and handing it back

In your case, you could try using the methods on System.GC to forcibly run a collection, but I think the real goal would be to not allocate those big arrays. If you can do anything to move to a Stream-based model rather than an array-based model, that would be good. This would presumably mean changing SpecialCoderDecoder significantly.
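For illustration, here is a minimal sketch of forcibly running a collection, including compaction of the Large Object Heap (the LargeObjectHeapCompactionMode property requires .NET Framework 4.5.1 or later); treat it as a diagnostic aid, not a fix:

using System;
using System.Runtime;

// Diagnostic sketch only: force a full, blocking collection and ask the
// runtime to compact the Large Object Heap on that collection.
static void ForceFullCollection()
{
    GCSettings.LargeObjectHeapCompactionMode =
        GCLargeObjectHeapCompactionMode.CompactOnce;
    GC.Collect();                  // full blocking collection
    GC.WaitForPendingFinalizers(); // let any finalizers run
    GC.Collect();                  // reclaim objects the finalizers released
}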

Key point: the upper size of an array has a hard cap; you cannot scale your current implementation beyond 2GB (even if <gcAllowVeryLargeObjects> is enabled).

Additionally, I suspect that BinaryFormatter is exacerbating things - it almost always does. Alternative, more efficient, well-worn serializers exist. Reducing the serialized size would be an alternative option to consider, either instead of - or in combination with - moving to a Stream-based model.

Additionally additionally: you could try using compression techniques (GZipStream, DeflateStream, etc) inside the encrypted payload. You shouldn't attempt to compress encrypted data - you need to ensure that the order is:

Serialize -> Compress -> Encrypt -> (storage) -> Decrypt -> Decompress -> Deserialize

The serialization and compression stages are already fully Stream-compatible. If you can make the encryption layer Stream-compatible, you're onto a winner.
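To make that order concrete, here is a hedged sketch of the save side built only from framework stream classes; AES via CryptoStream stands in for SpecialCoderDecoder, and the key-derivation details (Rfc2898DeriveBytes, 10000 iterations, 32-byte key) are illustrative assumptions rather than anything from the question:

using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization.Formatters.Binary;
using System.Security.Cryptography;

// Sketch: Serialize -> Compress -> Encrypt, streaming straight to disk so no
// intermediate 100 MB byte[] is ever allocated.
private static void SaveClassToFileStreamed(string fileAddress, string password, object classToSave)
{
    byte[] salt = new byte[16];
    byte[] iv = new byte[16];
    using (var rng = RandomNumberGenerator.Create())
    {
        rng.GetBytes(salt);
        rng.GetBytes(iv);
    }

    using (var keyDerivation = new Rfc2898DeriveBytes(password, salt, 10000))
    using (var aes = Aes.Create())
    using (var file = new FileStream(fileAddress, FileMode.CreateNew))
    {
        file.Write(salt, 0, salt.Length); // unencrypted header, as in the original layout
        file.Write(iv, 0, iv.Length);

        using (var encryptor = aes.CreateEncryptor(keyDerivation.GetBytes(32), iv))
        using (var crypto = new CryptoStream(file, encryptor, CryptoStreamMode.Write))
        using (var gzip = new GZipStream(crypto, CompressionMode.Compress))
        {
            new BinaryFormatter().Serialize(gzip, classToSave);
        }
    }
}

Loading would mirror this with CreateDecryptor, CryptoStreamMode.Read and CompressionMode.Decompress, deserializing directly from the decompression stream.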

Marc Gravell
  • Thanks for your answer. But when I delete some data, it causes a decrease in used memory (based on Task Manager); for example, after loading, if I remove half of the data, the used RAM drops to about half. Doesn't that mean the problem is somewhere else, and that each class's size increased? What you say is true, but after the loading finishes I can see the allocated space being freed. Or am I wrong about what that means? – Student Jun 09 '17 at 14:19
  • @Student I think it would require much more dedicated investigation than I can provide here to give a conclusive answer to that. – Marc Gravell Jun 09 '17 at 15:21
  • Thanks for your answer and your time. I will test and search more; if I find a good answer, I will post it here. – Student Jun 09 '17 at 16:53

The classes you have created hold a huge amount of data (ClassA, ClassB, and BigClass, for example). Whenever you create and use classes like these that hold a lot of data (especially value types), you have to tell the runtime to destroy (or dispose) them when you don't need them anymore. This is called the Dispose pattern, and you can find out more about it here:

https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/dispose-pattern

Some .NET classes have a built-in Dispose() method so that the .NET garbage collector (GC) knows when to wipe them out of memory, but not all of them. For those that have Dispose() and implement the IDisposable interface, you can use a using statement to automatically dispose them when their task has finished (you have used some using statements in your code, but not in all the required places).

The simple answer is: your data remains in memory after serialization has finished. Make your classes disposable and dispose of them when you don't need them.
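As a hedged illustration (the field names come from the question; everything else is assumed), ClassASub could implement IDisposable so that the Image it holds is released deterministically:

using System;
using System.Drawing;

// Sketch of the dispose pattern applied to the question's ClassASub.
private class ClassASub : IDisposable
{
    int intValue;
    long longValue;
    string stringValue;
    Image image;

    public void Dispose()
    {
        // Image wraps unmanaged GDI+ resources, so release it explicitly.
        if (image != null)
        {
            image.Dispose();
            image = null;
        }
    }
}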

This question may also be helpful: When should I dispose my objects in .NET?

Mohsen.Kh