I was doing a code review and came across the code below:

    private static string ZipStream(string sxml)
    {
        byte[] binaryData;
        var bytes = Encoding.UTF8.GetBytes(sxml);
        int x = sxml.Length;
        using (var msi = new MemoryStream(bytes))
        {
            using (var mso = new MemoryStream())
            {
                using (var gs = new GZipStream(mso, CompressionMode.Compress))
                {
                    byte[] bytes2 = new byte[600000];
                    int count;
                    while ((count = msi.Read(bytes2, 0, bytes.Length + 1000)) != 0)
                    {
                        gs.Write(bytes2, 0, count);
                    }
                }
                binaryData = mso.ToArray();
                return Convert.ToBase64String(binaryData);
            }
        }

    }

I found the code above hard to follow on a first reading, and I also noticed that it throws once the input approaches 600,000 bytes, because the read count `bytes.Length + 1000` then exceeds the size of the 600,000-byte buffer. So I rewrote it as the code below. When I ran both versions in a console app, in a loop of 10,000 iterations on a 20-character string, the first version used 2.4 MB of memory and the second (my updated code) used 6.5 MB. For the life of me, I can't understand why this is happening.

    private static string ZipStreamUpdated(string sxml)
    {
        var bytes = Encoding.UTF8.GetBytes(sxml);
        using (var mso = new MemoryStream(bytes.Length))
        {
            using (var gs = new GZipStream(mso, CompressionMode.Compress))
            {
                gs.Write(bytes, 0, bytes.Length);
            }
            return Convert.ToBase64String(mso.ToArray());
        }
    }
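
As a sanity check that the rewrite is a drop-in replacement, here is a minimal sketch of a round-trip test, including an input larger than the 600,000-byte buffer used by the first version. The `Unzip` helper and the `ZipRoundTrip` class are hypothetical names introduced only for this check; they are not part of the reviewed code.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipRoundTrip
{
    // Same body as ZipStreamUpdated above, made public so it can be called here.
    public static string ZipStreamUpdated(string sxml)
    {
        var bytes = Encoding.UTF8.GetBytes(sxml);
        using (var mso = new MemoryStream(bytes.Length))
        {
            using (var gs = new GZipStream(mso, CompressionMode.Compress))
            {
                gs.Write(bytes, 0, bytes.Length);
            }
            // ToArray still works after the GZipStream has closed the MemoryStream.
            return Convert.ToBase64String(mso.ToArray());
        }
    }

    // Hypothetical helper: Base64-decodes, gunzips, and decodes the UTF-8 text.
    public static string Unzip(string base64)
    {
        var compressed = Convert.FromBase64String(base64);
        using (var msi = new MemoryStream(compressed))
        using (var gs = new GZipStream(msi, CompressionMode.Decompress))
        using (var mso = new MemoryStream())
        {
            gs.CopyTo(mso);
            return Encoding.UTF8.GetString(mso.ToArray());
        }
    }

    public static void Main()
    {
        var small = new string('a', 20);       // the size used in the loop test
        var large = new string('a', 700000);   // larger than the old fixed buffer

        Console.WriteLine(Unzip(ZipStreamUpdated(small)) == small);
        Console.WriteLine(Unzip(ZipStreamUpdated(large)) == large);
    }
}
```

If both comparisons succeed, the updated method at least preserves the data for small and large inputs; it says nothing about memory use.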
  • Because you are running in a loop for a string only 20 chars in length, I think this could be a difference in when/why the GC decides to free up memory from previous iterations. I don't know the absolute correct way to verify, but I think you can test by adding `GC.Collect(); GC.WaitForPendingFinalizers(); GC.Collect();` (`Collect()` called twice) at the end of the loop so that all GC'able memory is freed before the next loop starts. – Quantic Oct 20 '16 at 20:28
  • `When i ran both pieces of code in a console ap in a loop for 10000 times on a string of length 20 chars` I would try it with `10000 times with a string of length 200000000 ` – L.B Oct 20 '16 at 20:31
  • @Quantic - that did the trick. Initially I had `GC.Collect()` outside the loop; once I moved it inside, the memory seems to get freed, but this means a collection happens 10k times. Any idea why calling `GC.Collect()` once outside the loop does not free the memory? – Sri Harsha Velicheti Oct 20 '16 at 20:54
  • GC is complicated and I don't fully understand it. The gist is that it does what it wants when it wants, and your job is to trust it and never touch it. A few random links: `GC.Collect()` does not free [immediately](http://stackoverflow.com/a/888314/5095502), GC not happening even [when needed](http://stackoverflow.com/questions/10016541/garbage-collection-not-happening-even-when-needed), GC unpredictable, strongly advised [not to touch it](http://stackoverflow.com/a/28359914/5095502), and hundreds more if you search around here. – Quantic Oct 20 '16 at 21:04
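
The collect-then-measure idea from the comments can be sketched roughly as follows. This assumes memory is read with `GC.GetTotalMemory`, which reports an estimate of the managed heap only; the question does not say how the 2.4 MB and 6.5 MB figures were obtained, so the numbers here may not be comparable to them. `GcMeasurement` and `Measure` are hypothetical names used for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GcMeasurement
{
    // Same body as ZipStreamUpdated in the question.
    public static string ZipStreamUpdated(string sxml)
    {
        var bytes = Encoding.UTF8.GetBytes(sxml);
        using (var mso = new MemoryStream(bytes.Length))
        {
            using (var gs = new GZipStream(mso, CompressionMode.Compress))
            {
                gs.Write(bytes, 0, bytes.Length);
            }
            return Convert.ToBase64String(mso.ToArray());
        }
    }

    // Hypothetical helper: runs the loop, then returns the managed heap estimate
    // before and after a forced full collection.
    public static long[] Measure(int iterations)
    {
        var input = new string('x', 20);
        for (int i = 0; i < iterations; i++)
        {
            ZipStreamUpdated(input);
        }

        // Whatever garbage the GC has not yet reclaimed is still counted here.
        long beforeCollect = GC.GetTotalMemory(false);

        // The sequence suggested in the comments: collect, run finalizers, collect again.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        long afterCollect = GC.GetTotalMemory(false);
        return new[] { beforeCollect, afterCollect };
    }

    public static void Main()
    {
        long[] result = Measure(10000);
        Console.WriteLine("Before forced collection: " + result[0] + " bytes");
        Console.WriteLine("After forced collection:  " + result[1] + " bytes");
    }
}
```

Comparing the two methods on the "after" figure rather than the "before" figure removes most of the dependence on when the GC happened to run during the loop. Forcing collections like this is meant only as a diagnostic, in line with the advice in the comments not to call `GC.Collect()` in normal code.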