In Java, this works as expected:
// requires: import java.io.*; and import java.util.zip.*;
public static void testwrite(String filename) throws IOException {
    FileOutputStream fs = new FileOutputStream(new File(filename), false);
    // level-3 Deflater, default (zlib-wrapped) output
    DeflaterOutputStream fs2 = new DeflaterOutputStream(fs, new Deflater(3));
    for (int i = 0; i < 50; i++)
        for (int j = 0; j < 40; j++)
            fs2.write((byte) (i + 0x30));
    fs2.close();
}
public static void testread(String filename) throws IOException {
    FileInputStream fs = new FileInputStream(new File(filename));
    InflaterInputStream fs2 = new InflaterInputStream(fs);
    int c, n = 0;
    while ((c = fs2.read()) >= 0) {
        System.out.print((char) c);
        if (n++ % 40 == 0) System.out.println();
    }
    fs2.close();
}
The first method compresses the 2000 characters into a 106-byte file, and the second reads it back correctly.
The equivalent in C# would seem to be:
private static void testwritecs(String filename) {
    FileStream fs = new FileStream(filename, FileMode.OpenOrCreate);
    DeflateStream fs2 = new DeflateStream(fs, CompressionMode.Compress, false);
    for (int i = 0; i < 50; i++) {
        for (int j = 0; j < 40; j++)
            fs2.WriteByte((byte)(i + 0x30));
    }
    fs2.Flush();
    fs2.Close();
}
But it generates a 2636-byte file (larger than the raw data, even though the data has very low entropy), and the file is not readable with the Java testread() method above. Any ideas?
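One suspicion (I have not verified this against the spec): DeflateStream seems to write raw DEFLATE data, with no zlib header or Adler-32 trailer, whereas java.util.zip.InflaterInputStream expects the zlib wrapper by default. If that is the cause, a "nowrap" Inflater on the Java side should at least make the C# output readable; a sketch, untested:

// Sketch: read raw DEFLATE (no zlib header/trailer), which is what DeflateStream appears to emit
public static void testreadRaw(String filename) throws IOException {
    FileInputStream fs = new FileInputStream(new File(filename));
    // 'true' = nowrap: expect raw DEFLATE rather than zlib-wrapped data
    InflaterInputStream fs2 = new InflaterInputStream(fs, new Inflater(true));
    int c, n = 0;
    while ((c = fs2.read()) >= 0) {
        System.out.print((char) c);
        if (n++ % 40 == 0) System.out.println();
    }
    fs2.close();
}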
Edited: The implementation is indeed not standard/portable (the bit of the docs that calls it "an industry standard algorithm" seems a joke), and it is very crippled. Among other things, its behaviour changes radically depending on whether one writes the bytes one at a time or in blocks (which goes against the concept of a "stream"); if I change the above
for (int j = 0; j < 40; j++)
    fs2.WriteByte((byte)(i + 0x30));
to
byte[] buf = new byte[40];
for (int j = 0; j < 40; j++)
    buf[j] = (byte)(i + 0x30);
fs2.Write(buf, 0, buf.Length);
the compression becomes (slightly) more reasonable. Shame.
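And for the other direction: if a Java program needs to produce something that DeflateStream (in CompressionMode.Decompress) can read, my assumption is that a Deflater constructed with nowrap = true emits the same headerless format. A sketch of testwrite() adjusted accordingly, again untested:

// Sketch: write raw DEFLATE (nowrap) so that, assuming the format guess above is right,
// .NET's DeflateStream in Decompress mode can consume the file
public static void testwriteRaw(String filename) throws IOException {
    FileOutputStream fs = new FileOutputStream(new File(filename), false);
    // level 3, 'true' = nowrap: omit the zlib header and the Adler-32 trailer
    DeflaterOutputStream fs2 = new DeflaterOutputStream(fs, new Deflater(3, true));
    for (int i = 0; i < 50; i++)
        for (int j = 0; j < 40; j++)
            fs2.write((byte) (i + 0x30));
    fs2.close();
}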