In my application I have a rather large object created from some XML files. The XML is about 30 MB in size, and the binary-serialized object built from it is about 8 to 9 MB. The funny thing is that if I compress this binary file with, e.g., WinRAR, it shrinks to just 1 to 2 MB.

Is there a way to increase the compression level of the serialized object itself? Or should I add another layer of compression by manually writing code to zip the file after saving and unzip it before loading it back into the program?

In case it helps, this is the code I use to save my object to a file:

    public static bool SaveProject(Project proj, string pathAndName)
    {
        bool success = true;
        proj.FileVersion = CurrentFileVersion;

        try
        {
            IFormatter formatter = new BinaryFormatter();
            Stream stream = new FileStream(pathAndName, FileMode.Create, FileAccess.Write, FileShare.None);
            formatter.Serialize(stream, proj);
            stream.Close();
        }
        catch (Exception e)
        {
            MessageBox.Show("Can not save project!" + Environment.NewLine + "Reason: " + e.Message, "Error",
                            MessageBoxButtons.OK, MessageBoxIcon.Exclamation);

            success = false;
        }

        return success;
    }

UPDATE: I tried changing my code to add a `GZipStream`, but it does not seem to do anything! Or maybe my implementation is wrong?

    public static bool SaveProject(Project proj, string pathAndName)
    {
        bool success = true;
        proj.FileVersion = CurrentFileVersion;

        try
        {
            IFormatter formatter = new BinaryFormatter();
            var stream = new FileStream(pathAndName, FileMode.Create, FileAccess.Write, FileShare.None);
            var gZipStream = new GZipStream(stream, CompressionMode.Compress);
            formatter.Serialize(stream, proj);
            stream.Close();
            gZipStream.Close();
        }
        catch (Exception e)
        {
            MessageBox.Show("Can not save project!" + Environment.NewLine + "Reason: " + e.Message, "Error",
                            MessageBoxButtons.OK, MessageBoxIcon.Exclamation);

            success = false;
        }

        return success;
    }

    public static Project LoadProject(string path)
    {
        IFormatter formatter = new BinaryFormatter();
        Stream stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
        var gZipStream = new GZipStream(stream, CompressionMode.Decompress);
        var obj = (Project)formatter.Deserialize(gZipStream);
        stream.Close();
        gZipStream.Close();

        if (obj.FileVersion != CurrentFileVersion)
        {
            throw new InvalidFileVersionException("File version belongs to an older version of the program.");
        }

        return obj;
    }

Dumbo
  • Why don't you try `GZipStream zip = new GZipStream(stream, System.IO.Compression.CompressionMode.Compress); formatter.Serialize(zip, proj);` and see? – L.B Aug 24 '12 at 09:20
  • @L.B Thanks, I tried, but there is no change in file size! Please check the update! – Dumbo Aug 24 '12 at 09:44
  • You are still serializing to `stream` (`formatter.Serialize(stream, proj);`). Use `gZipStream` instead. – L.B Aug 24 '12 at 09:44

1 Answer

Wrap your FileStream in a DeflateStream with CompressionMode.Compress - pass that to the serializer. Then to deserialize, wrap a FileStream in a DeflateStream with CompressionMode.Decompress.

Note that instead of calling Close explicitly, you should use a using statement, e.g.

    using (FileStream fileStream = ...)
    using (DeflateStream deflateStream = new DeflateStream(fileStream,
                                                          CompressionMode.Compress))
    {
        formatter.Serialize(deflateStream, proj);
    }

You can use GZipStream in the same way - try both to see which tends to give you better compression (or better performance, if you care about that).
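One way to try both is to push the same bytes through each wrapper and compare the sizes. The following is only a minimal sketch: the `CompressionComparison` class, the `CompressedSize` helper, and the repetitive test data are made up for illustration, and the third variant assumes .NET 4.5 or later, where the constructors accept a `CompressionLevel`.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressionComparison
{
    // Compresses `data` through whichever wrapper stream `wrap` builds
    // and returns the number of compressed bytes produced.
    public static long CompressedSize(byte[] data, Func<Stream, Stream> wrap)
    {
        var target = new MemoryStream();

        // The wrapper must be disposed before measuring, so that it
        // flushes its final block into `target`.
        using (Stream compressor = wrap(target))
        {
            compressor.Write(data, 0, data.Length);
        }

        // Disposing the wrapper also closed `target`; ToArray still works
        // on a closed MemoryStream, whereas Length would throw.
        return target.ToArray().Length;
    }

    public static void Main()
    {
        // Hypothetical stand-in for the serialized project: repetitive bytes.
        byte[] data = new byte[1000000];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = (byte)(i % 16);
        }

        long deflate = CompressedSize(data, s => new DeflateStream(s, CompressionMode.Compress));
        long gzip = CompressedSize(data, s => new GZipStream(s, CompressionMode.Compress));
        // Assumes .NET 4.5+: ask explicitly for the best compression ratio.
        long gzipOptimal = CompressedSize(data, s => new GZipStream(s, CompressionLevel.Optimal));

        Console.WriteLine("Original:       " + data.Length + " bytes");
        Console.WriteLine("Deflate:        " + deflate + " bytes");
        Console.WriteLine("GZip:           " + gzip + " bytes");
        Console.WriteLine("GZip (Optimal): " + gzipOptimal + " bytes");
    }
}
```

To measure real data instead, the serialized project bytes could be passed in place of the test array. GZip stores Deflate-compressed data inside a small header and checksum trailer, so large differences between the two are not to be expected at the same compression level.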

Note how this approach separates the serialization aspect from the compression aspect, composing the two while keeping good separation of concerns. The serialization code just writes to a stream without caring what happens to the data, and the compression code just compresses what it's given without caring what the data means.
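That separation can be made explicit with a pair of helpers that own the file and the compression, while the caller supplies only the code that reads or writes the stream. This is a sketch under assumptions, not part of the original code: `CompressedFile`, `Save`, and `Load` are hypothetical names, and plain bytes stand in for the serializer so the example runs on its own.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CompressedFile
{
    // Opens `path` for writing, wraps it in a GZipStream, and hands the
    // compressing stream to `write`. The caller never touches the file.
    public static void Save(string path, Action<Stream> write)
    {
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None))
        using (var zip = new GZipStream(file, CompressionMode.Compress))
        {
            write(zip);
        }
    }

    // Opens `path` for reading, wraps it in a decompressing GZipStream,
    // and returns whatever `read` produces from the decompressed data.
    public static T Load<T>(string path, Func<Stream, T> read)
    {
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var zip = new GZipStream(file, CompressionMode.Decompress))
        {
            return read(zip);
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            // Stand-in for the serializer: write raw bytes to the stream.
            byte[] payload = Encoding.UTF8.GetBytes("hello project");
            Save(path, s => s.Write(payload, 0, payload.Length));

            // Stand-in for the deserializer: read everything back.
            byte[] restored = Load(path, s =>
            {
                var buffer = new MemoryStream();
                s.CopyTo(buffer);
                return buffer.ToArray();
            });

            Console.WriteLine(Encoding.UTF8.GetString(restored));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

With the code from the question, the calls would presumably look like `CompressedFile.Save(pathAndName, s => formatter.Serialize(s, proj));` and `CompressedFile.Load(path, s => (Project)formatter.Deserialize(s));`. Because the `using` blocks dispose the streams in the right order, no explicit `Close` calls are needed.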

Roman Starkov
Jon Skeet
  • Thanks Jon, I'll have to look into how to use the `using` statement. For now I tried `GZipStream`, but it doesn't do anything. Can you check the update in my question, please? – Dumbo Aug 24 '12 at 09:43
  • @Sean87: It doesn't do anything because you're still serializing to the *FileStream*. You should have `formatter.Serialize(gZipStream, proj);` – Jon Skeet Aug 24 '12 at 09:44
  • Now it works, but GZip only reduced the file size from 7.7 MB to 6.8 MB, hardly 1 MB! Any ideas? – Dumbo Aug 24 '12 at 10:14
  • @Sean87: Not really. What about deflate? Note that there are other options in the constructors to specify how hard to try to compress. – Jon Skeet Aug 24 '12 at 10:56
  • Do _not_ use Microsoft's .NET GZipStream or DeflateStream classes as recommended by the venerable Jon Skeet. Use DotNetZip's (http://dotnetzip.codeplex.com/) instead. The .NET classes are horribly buggy in compression and integrity checking, as noted in an answer here: http://stackoverflow.com/questions/11435200/why-does-my-c-sharp-gzip-produce-a-larger-file-than-fiddler-or-php . – Mark Adler Aug 24 '12 at 14:46
  • @MarkAdler Yeah, thanks! I used that library and it is awesome: using the highest compression mode, it reduced my file from 7.7 MB to 1.7 MB! But what I actually asked here was whether there is even a way to do compression (I didn't know it was possible until Jon Skeet opened a whole new world to my eyes :D :D) – Dumbo Aug 24 '12 at 21:57