
PHP code:

$txt="John has cat and dog."; //plain text
$txt=base64_encode($txt); //base64 encode
$txt=gzdeflate($txt,9); //raw DEFLATE, maximum compression (level 9)
$txt=base64_encode($txt); //base64 encode
print_r($txt); //print it

The code above returns:

C861zE/KdMqPjPBNjzRyM/B0dyuNcnbKTjJKLgUA

I'm trying to compress the same string in Java:

    // Encode a String into bytes
    String inputString = "John has cat and dog.";
    inputString = Base64.encode(inputString);

    byte[] input = inputString.getBytes("UTF-8");

    // Compress the bytes
    byte[] output = new byte[100];
    Deflater compresser = new Deflater();
    //compresser.setLevel(Deflater.BEST_COMPRESSION);
    compresser.setInput(input);
    compresser.finish();
    int compressedDataLength = compresser.deflate(output);
    String outputString = new String(output, 0, compressedDataLength, "UTF-8");
    outputString = Base64.encode(outputString);
    System.out.println(outputString);

But it prints the wrong string:

eD8LPz9PP3Q/Pz9NPzRyMz90dys/cnY/TjJKLgUAPygJTA==

It should be:

C861zE/KdMqPjPBNjzRyM/B0dyuNcnbKTjJKLgUA

How can I fix it? Thanks.

Wanna Coffee
  • Which library is this base64 class from? – Nivas Dec 20 '12 at 23:15
  • It might help if you print out the value of `$txt` at each step in both PHP and Java so you can compare and see at which step they're different. – Bill the Lizard Dec 20 '12 at 23:16
  • Seconding Bill's suggestion here. Make sure that the strings are still the same after the deflation... I suspect they're not. – Ben D Dec 20 '12 at 23:17
  • The most common mistake made in this sort of thing is to take binary data (such as the output from "Deflater") and treat it as a character string. It's not characters, it's binary data, and you must maintain it as a byte stream/array of some sort until you run it through Base64 encoding to make it into characters. – Hot Licks Dec 21 '12 at 02:26

2 Answers


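Use Deflater like this. Passing true as the second constructor argument (nowrap) makes it emit raw DEFLATE data without the zlib header and checksum, which is what PHP's gzdeflate produces. Keep the result as a byte array and hand it straight to the Base64 encoder: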

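// Needs java.io.ByteArrayOutputStream, java.util.zip.Deflater and java.util.zip.DeflaterOutputStream.
// input is the byte[] to compress (here, the bytes of the Base64-encoded text).
ByteArrayOutputStream stream = new ByteArrayOutputStream();
Deflater compresser = new Deflater(Deflater.BEST_COMPRESSION, true);
DeflaterOutputStream deflaterOutputStream = new DeflaterOutputStream(stream, compresser);
deflaterOutputStream.write(input);
deflaterOutputStream.close(); // finishes the compression and flushes everything into stream
byte[] output = stream.toByteArray();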
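
For reference, here is a minimal sketch of the whole PHP pipeline (Base64, then raw DEFLATE, then Base64 again) in one place. It assumes Java 8 or later for java.util.Base64, which is not the unidentified Base64 class used in the question, and the class and method names are made up for illustration. The result is expected to be readable by PHP's gzinflate; whether it matches PHP's output byte for byte depends on both sides using the same zlib behaviour, so treat that as likely rather than guaranteed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper, not part of any library.
class PhpStyleEncoder {

    // Mirrors base64_encode(gzdeflate(base64_encode($txt), 9)) from the question.
    static String encode(String text) {
        // Step 1: Base64-encode the UTF-8 bytes of the plain text.
        byte[] inner = Base64.getEncoder().encode(text.getBytes(StandardCharsets.UTF_8));

        // Step 2: raw DEFLATE at the highest level (nowrap = true, so no zlib header).
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION, true);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer, deflater)) {
            out.write(inner);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            deflater.end(); // release native zlib resources
        }

        // Step 3: Base64-encode the compressed bytes directly, never via a String.
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    public static void main(String[] args) {
        System.out.println(encode("John has cat and dog."));
    }
}
```

The important design point is that the compressed data stays a byte[] from the moment it leaves the Deflater until it reaches the Base64 encoder.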

To decompress the result, create the Inflater with the same nowrap setting:

    ByteArrayOutputStream stream2 = new ByteArrayOutputStream();
    Inflater decompresser = new Inflater(true);
    InflaterOutputStream inflaterOutputStream = new InflaterOutputStream(stream2, decompresser);
    inflaterOutputStream.write(output);
    inflaterOutputStream.close();
    byte[] output2 = stream2.toByteArray();
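
The reverse direction can be sketched the same way. This is only an illustration under the same assumptions (Java 8+ java.util.Base64, made-up class and method names). Note the Inflater(true): if the Inflater is created without nowrap, it expects a zlib header, and feeding it raw DEFLATE data typically fails with an "incorrect header check" error.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.Inflater;
import java.util.zip.InflaterOutputStream;

// Hypothetical helper, not part of any library.
class PhpStyleDecoder {

    // Reverses base64_encode(gzdeflate(base64_encode($txt), 9)).
    static String decode(String encoded) {
        // Step 1: outer Base64 -> compressed bytes.
        byte[] compressed = Base64.getDecoder().decode(encoded);

        // Step 2: inflate raw DEFLATE data (nowrap = true, matching the compressor).
        Inflater inflater = new Inflater(true);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InflaterOutputStream out = new InflaterOutputStream(buffer, inflater)) {
            out.write(compressed);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            inflater.end(); // release native zlib resources
        }

        // Step 3: inner Base64 -> original UTF-8 text.
        byte[] plain = Base64.getDecoder().decode(buffer.toByteArray());
        return new String(plain, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The string is the PHP output quoted in the question.
        System.out.println(decode("C861zE/KdMqPjPBNjzRyM/B0dyuNcnbKTjJKLgUA"));
    }
}
```

Only the very last step turns bytes into a String, and it does so with an explicit charset.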
Akdeniz
  • I'm trying to decompress the string:
    byte[] B_output =Base64.decode(outputString);
        ByteArrayOutputStream stream2 = new ByteArrayOutputStream(B_output.length);
        Inflater decompresser = new Inflater();
        decompresser.setInput(B_output);
        InflaterOutputStream inflaterOutputStream = new InflaterOutputStream(stream2, decompresser);
        inflaterOutputStream.write(B_output);
        inflaterOutputStream.close();
        byte[] output2 = stream2.toByteArray();
        String o3=output2.toString();
        o3 = Base64.decode(o3).toString();
        System.out.println(o3); 
    But I get an "incorrect header check" error.
    – Jarosław Maciejewski Dec 21 '12 at 03:54
  • Hm... I'm trying to compress and decompress a long text (size 12,000), but I get an error: Exception in thread "main" java.util.zip.ZipException: invalid bit length repeat at java.util.zip.InflaterOutputStream.write(Unknown Source) at java.io.FilterOutputStream.write(Unknown Source) at gzip.decode(gzip.java:26) at main.main(main.java:37). My code: http://pastebin.com/BDf95Hu2 – Jarosław Maciejewski Dec 22 '12 at 15:45
  • As @Hot Licks commented on your question, you should not wrap binary data in a String, because you cannot be sure how its content will be encoded. – Akdeniz Dec 23 '12 at 16:35
  • @Akdeniz can you tell me how to decompress the output in Node.js? – Holasmabre Jul 25 '22 at 09:38
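
 String outputString = new String(output, 0, compressedDataLength,"UTF-8");

This line takes compressed binary data and tries to interpret it as a UTF-8 string. That is unsafe: byte sequences that are not valid UTF-8 get replaced during the conversion, which is why the encoded string contains a bunch of "?"s instead of the intended data. Keep the compressed data as a byte array and pass it to the Base64 encoder directly.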
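
To see the effect in isolation, here is a small sketch that sends a few bytes through a UTF-8 String and through Base64 and compares what comes back. The sample bytes are chosen for illustration (0x78 0xDA is a typical zlib header), the class and method names are hypothetical, and java.util.Base64 assumes Java 8 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class, not part of any library.
class BinaryAsStringDemo {

    // Decodes arbitrary bytes as UTF-8 text and encodes the text back to bytes.
    static byte[] throughUtf8String(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // Encodes arbitrary bytes as Base64 text and decodes the text back to bytes.
    static byte[] throughBase64(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // 0xDA and 0xCC do not form valid UTF-8 sequences in this position.
        byte[] sample = {0x78, (byte) 0xDA, 0x0B, (byte) 0xCC, 0x4F};

        // Invalid sequences are replaced while decoding, so the bytes change.
        System.out.println(Arrays.equals(sample, throughUtf8String(sample))); // false

        // Base64 is designed for arbitrary binary data, so the bytes survive.
        System.out.println(Arrays.equals(sample, throughBase64(sample)));     // true
    }
}
```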