How to match ZLib stream between VBA 6/VBA 7 and Java 8?

Question

We are able to do the following.

In VBA 6/VBA 7:

Reference a 32-bit zlibwapi.dll (VBA 6) or a 64-bit zlibwapi.dll (VBA 7).
Invoke the compress() or compress2() methods to generate compressed streams.
Invoke the uncompress() and uncompress2() methods to decompress compressed streams.

In Java 8 (JDK 1.8 on Tomcat 8):

A simple Java program compresses data using a new Deflater() instance.
A simple Java program decompresses data using an Inflater() instance.

Things fail when VBA sends a compressed stream for the Java servlet to decompress, or when the Java servlet sends compressed response data for VBA to decompress.

We are aware of the following facts.

There are 3 formats provided by zlib (raw, zlib and gzip).
The methods in zlibwapi.dll, namely compress() and compress2(), generate compressed bytes in zlib format. This has been mentioned in a similar thread, "Java decompressing array of bytes".
An Inflater() instance on the Java side can decompress zlib-format data, as per a code sample posted at "Compression / Decompression of Strings using the deflater".
Java 8 has zlib version 1.2.5 integrated as part of the java.util.zip package.
We have ensured that we are using zlibwapi.dll version 1.2.5 on the VBA side as well.

We have used hex editors to compare the byte streams of compressed data generated independently by VBA and Java. We notice some differences in the generated compressed data, and we think these differences are what cause the two environments to misunderstand each other.

Additionally, we think that when communication occurs, there has to be some common charset that governs the encoding/decoding scheme between both endpoints. We have also compared the hex codes of the byte stream generated by VBA with what arrives at the Java servlet.

Some additional 0 bytes seem to get inserted between the actual compressed bytes during communication. This happens on the VBA side, maybe because of some Unicode interpretation.
Whatever bytes get communicated across to Java appear entirely different in their representation.
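
The doubling described above is what one would expect if the compressed bytes pass through a UTF-16 string at some point (VBA strings are stored as UTF-16 internally). The following Java sketch reproduces the effect. The class and method names are hypothetical, and whether this is the actual mechanism on the VBA side is an assumption, not something confirmed in this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how binary data grows when it is
// treated as text and then encoded as UTF-16LE.
class Utf16WideningDemo {

    // Treats each input byte as one character, then encodes the
    // resulting string as UTF-16LE (two bytes per character).
    static byte[] widen(byte[] raw) {
        // ISO-8859-1 maps byte values 0..255 one-to-one onto U+0000..U+00FF.
        String asText = new String(raw, StandardCharsets.ISO_8859_1);
        // Each of those characters becomes <low byte, 0x00> in UTF-16LE.
        return asText.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] raw = {0x78, (byte) 0x9C, 0x01};
        byte[] widened = widen(raw);
        StringBuilder hex = new StringBuilder();
        for (byte b : widened) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(raw.length + " bytes in, " + widened.length + " bytes out: " + hex);
    }
}
```

If the stream arriving at the servlet shows this pattern (every second byte is 0), the data was most likely handled as a string rather than as a byte array before being sent.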

We need to fix our independently working code so that the two sides can communicate and compress and decompress each other's data. We think there are 2 things to address: getting the format to match, and using a charset that sends bytes as is. We are looking for any assistance from experts that can point us toward a possible solution. We need answers to the following questions.

Does compress2() or compress() really generate zlib format?
Which charset will allow us to send bytes as is (if there are 10 bytes, we want to send 10 bytes, not 20)? If it's Unicode, 0 bytes get inserted in between (10 bytes become 20 bytes because of this).

score 1 · Accepted Answer · answered Sep 04 '17 at 17:54


Yes.
Don't send characters. Send bytes.
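
As an illustration of the "send bytes" advice on the Java side, here is a minimal sketch that keeps the data as `byte[]` from start to finish and never converts the compressed stream to a `String`. The class and method names are hypothetical. It assumes the default `Deflater` and `Inflater` constructors, which use the zlib wrapper (not raw deflate or gzip).

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: zlib-format compression on byte arrays only.
class ZlibBytes {

    // Compresses input into a zlib-wrapped stream.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(); // default constructor: zlib header and Adler-32 trailer
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    // Decompresses a zlib-wrapped stream; rejects malformed or truncated input.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater(); // default constructor: expects the zlib wrapper
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and not finished: the stream is truncated or needs a dictionary.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or dictionary-dependent stream");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
    }
}
```

In a servlet, the same idea would presumably mean reading the request body from `request.getInputStream()` and writing the response through `response.getOutputStream()`, since `getReader()` and `getWriter()` apply a character encoding to the data.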


Mark Adler


Thanks - that helped in communicating VBA request to Java Servlet. Now other way round is causing issues (Java response back to VBA). We noticed your answer at https://stackoverflow.com/questions/19120676/how-to-detect-type-of-compression-used-on-the-file-if-no-file-extension-is-spe. We want to write our own isValidZLibFormat(bytes[]) but need a little more light on 0aaa1000 bbbccccc byte structure explained by you out there. What is a, b, c? Example please? – sidnc86 Sep 07 '17 at 18:03
Will this work: if(((firstByte * 256) + secondByte) % 31 == 0) return true; else return false;? – sidnc86 Sep 07 '17 at 18:10
Each letter is a bit. You can replace the a's and b's with any bits you like (there are 64 possibilities), but then you need to set the c's so that the integer formed by the two bytes in network order is a multiple of 31. Your if statement is part of what you can check. You should also check the fixed bits, so `&& (firstByte & 0x8f) == 8`. That doesn't tell you for certain that you have a zlib stream, but it does tell you that the first two bytes are a zlib header. Note that random data will appear to have a valid zlib header about 0.1% of the time. – Mark Adler Sep 07 '17 at 19:40
In your check && (firstByte & 0x8F) == 8, are you suggesting that aaa bits are always 0? – sidnc86 Sep 08 '17 at 06:30
Do you know what `&` does? – Mark Adler Sep 09 '17 at 01:49
Frankly - no I don't know what it does. We are currently getting 120 and not 8. For now, Java generates 120, -100 as the first two bytes; these are interpreted as 120, 156 on the VBA side. – sidnc86 Sep 09 '17 at 05:12
`&` is bit-wise and. `& 0x8f` leaves the first and last four bits as is, and forces the other bits (the `aaa`) to zeros. 120 is `0x78`. That anded with `0x8f` is 8. -100 is the same as 156 when encoded into 8 bits, which is `0x9c`. 120*256 + 156 is divisible by 31. – Mark Adler Sep 09 '17 at 06:39
Thanks for those insights! And sorry for the confusion caused. I know what & does (bitwise). Being blind and using screen reader, I misheard '&' (and) as 'it' – sidnc86 Sep 09 '17 at 11:59
Also sorry for another confusion. I confused 0x8F with 0x7F. I ended up anding with 127 instead of 143 – sidnc86 Sep 09 '17 at 12:08
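
Putting the two checks from the comments together, a header test might be sketched in Java as follows. The class and method names are hypothetical. As noted in the comments, passing this test only means the first two bytes form a valid zlib header, not that the rest of the data is a valid zlib stream.

```java
// Hypothetical helper combining the fixed-bit check and the multiple-of-31 check.
class ZlibHeaderCheck {

    // Returns true if the two bytes form a valid zlib header (0aaa1000 bbbccccc).
    static boolean looksLikeZlibHeader(byte first, byte second) {
        // Java bytes are signed, so mask to get 0..255 (e.g. -100 becomes 156).
        int cmf = first & 0xFF;
        int flg = second & 0xFF;

        // Fixed bits: the top bit must be 0 and the low four bits must be 1000.
        // The mask 0x8F clears the three "aaa" bits before comparing.
        boolean fixedBitsOk = (cmf & 0x8F) == 0x08;

        // The two bytes, read in network (big-endian) order, must be a multiple of 31.
        boolean checkOk = ((cmf * 256) + flg) % 31 == 0;

        return fixedBitsOk && checkOk;
    }

    public static void main(String[] args) {
        // 120, -100 are the bytes reported in the comments (0x78, 0x9C).
        System.out.println(looksLikeZlibHeader((byte) 120, (byte) -100));
    }
}
```

Masking with `0xFF` first means the same two bytes give the same result whether they are seen as 120, -100 (Java) or 120, 156 (VBA).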
