
I want to HTTP POST gzip-compressed data from Python to Java and store it as a BLOB in a database, then gzip-decompress that BLOB in Java. So I want to know how to post a BLOB from Python and how to read a BLOB in Java. My Python and Java code is given below. In it, I gzip-compress a string in Python and store the compressed data in a file. Then I read that file in Java and decompress it using GZIPInputStream, but I get the exception below.

java.io.IOException: Not in GZIP format
    at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:154)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:75)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:85)
    at GZipFile.gunzipIt(GZipFile.java:60)
    at GZipFile.main(GZipFile.java:43)

If I print the byte array of the compressed data in Python, I get:

[31, 139, 8, 0, 254, 213, 186, 87, 2, 255, 203, 72, 205, 201, 201, 231, 229, 42, 207, 47, 202, 73, 1, 0, 66, 102, 86, 48, 12, 0, 0, 0]

If I read that compressed data from the file in Java and print it, I get:

[31, -17, -65, -67, 8, 0, -17, -65, -67, -42, -70, 87, 2, -17, -65, -67, -17, -65, -67, 72, -17, -65, -67, -17, -65, -67, -17, -65, -67, -17, -65, -67, -17, -65, -67, 42, -17, -65, -67, 47, -17, -65, -67, 73, 1, 0, 66, 102, 86, 48, 12, 0, 0, 0]

As you can see, the two differ. If I give the byte array printed by Python as input to the Java code, it works fine. So please help me understand how to post a BLOB (the compressed data) from Python and how to read that compressed data in Java to decompress it.

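This is the compression code in Python:

import StringIO
import array
import gzip

m = 'hello' + '\r\n' + 'world'

# Compress the UTF-8 bytes of m into an in-memory buffer (Python 2).
out = StringIO.StringIO()
with gzip.GzipFile(fileobj=out, mode="wb") as f:
    f.write(m.encode('utf-8'))

# Print the compressed data as a list of unsigned byte values.
print list(array.array('B', out.getvalue()))

# Write the raw compressed bytes to a file in binary mode.
dump = open('comp_dump', 'wb')
dump.write(out.getvalue())
dump.close()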

This is the decompression code in Java:

//$Id$

import java.io.*;  
import java.io.FileInputStream;  
import java.io.FileOutputStream;  
import java.io.IOException;  
import java.util.zip.GZIPInputStream;  
import javax.xml.bind.DatatypeConverter;  
import java.util.Arrays;

public class GZipFile
{


public static String readCompressedData()throws Exception
{
        String compressedStr ="";
        String nextLine;
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("comp_dump")));
        try
        {
                while((nextLine=reader.readLine())!=null)
                {
                        compressedStr += nextLine;
                }
        }
        finally
        {
                reader.close();
        }
        return compressedStr;
}

public static void main( String[] args ) throws Exception
{
        GZipFile gZip = new GZipFile();
        byte[] contentInBytes = readCompressedData().getBytes("UTF-8");

        System.out.println(Arrays.toString(contentInBytes));
        String decomp = gZip.gunzipIt(contentInBytes);
        System.out.println(decomp);
}

/**
 * GunZip it
 */
public static String gunzipIt(final byte[] compressed){

        byte[] buffer = new byte[1024];
        StringBuilder decomp = new StringBuilder() ;

        try{

                GZIPInputStream gzis = new GZIPInputStream(new ByteArrayInputStream(compressed));

                int len;
                while ((len = gzis.read(buffer)) > 0) {

                        decomp.append(new String(buffer, 0, len));

                }

                gzis.close();

        }catch(IOException ex){
                ex.printStackTrace();
        }
        return decomp.toString();
}
}

2 Answers


Have you checked this: "gzip a file in Python"?

My guess is that your string

m='hello'+'\r\n'+'world' 

is possibly causing some issues with the whole process...

Have you considered replacing it with m="hello\r\nworld", using double quotes instead?

Bruno Oliveira

You can't read the compressed data into a String directly. Your readCompressedData method decodes the compressed bytes as text (which yields a wrong string), and main then gets that string's bytes. After this round trip, contentInBytes no longer holds the bytes stored in the file.

When you build a String from bytes that are not valid in the charset, the invalid bytes are replaced, so the bytes you get back from the String differ from the originals.

For example:

        byte bytesBefore[] = {-1,-2,65,76,79,80};
        try {
            String str = new String(bytesBefore);
            byte bytesAfter[] = str.getBytes();
            System.out.println("str is " + str);
            System.out.println("after");
            for(Byte b : bytesAfter){
                System.out.print(" " + b);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

OUTPUT:

str is ��ALOP
after
 -17 -65 -67 -17 -65 -67 65 76 79 80

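Because the bytes -1 and -2 are not valid UTF-8 (assuming UTF-8 is the default charset), they can't be decoded into characters. When you create the String from bytesBefore, each of them is replaced by the replacement character U+FFFD, so the bytes you get back are bytesAfter, in which -1 and -2 have become -17 -65 -67 -17 -65 -67.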
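The same effect can be reproduced on the Python side. The following is only a Python 3 sketch (the question's code is Python 2): it compresses the question's string, decodes the raw bytes as UTF-8 with replacement, and encodes them back, which roughly mimics what readCompressedData followed by getBytes("UTF-8") does in Java. Java prints bytes as signed values, so 239, 191, 189 here correspond to -17, -65, -67 there.

```python
import gzip

# Compress the same text used in the question (Python 3).
original = gzip.compress("hello\r\nworld".encode("utf-8"))

# Simulate the Java round trip: treat the raw bytes as UTF-8 text,
# substituting U+FFFD for every invalid sequence, then encode back to bytes.
as_text = original.decode("utf-8", errors="replace")
round_tripped = as_text.encode("utf-8")

print(list(original[:4]))         # gzip header starts with the magic bytes 31, 139
print(list(round_tripped[:6]))    # 139 has turned into 239, 191, 189 (U+FFFD)
print(round_tripped == original)  # the data no longer matches the original
```

The second magic byte is destroyed by the round trip, which is why GZIPInputStream rejects the header with "Not in GZIP format".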

Actually, a GZIPInputStream can be built directly on a FileInputStream, so there is no need to get the bytes first. Just wrap the GZIPInputStream in an InputStreamReader and a BufferedReader to read the decompressed text.

Here is a solution:

import java.io.*;
import java.util.zip.GZIPInputStream;

public class GZipFile {
    public static void main(String[] args) throws Exception {
        // Decompress the raw file bytes first, then decode the result as UTF-8 text.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new FileInputStream("comp_dump")), "UTF-8"))) {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append("\r\n");
            }
            System.out.println(sb.toString());
        }
    }
}

OUTPUT:

hello
world
tianwei
  • This solution works fine. I changed my code as follows: File file = new File("comp_dump"); byte[] bs = new byte[(int)file.length()]; FileInputStream fis = new FileInputStream(file); int len = fis.read(bs); and I pass the byte array bs to GZIPInputStream to decompress. But here the input is a file, whereas I want to post the compressed data to Java and read it there as bytes. So how do I read the posted compressed data as bytes in Java, and what type should it be stored as in MySQL? Can it be stored as a BLOB? – Tamil Arasu Aug 23 '16 at 09:10
  • Yes, a BLOB is fine. Use GZIPInputStream with the BLOB's bytes as the input. – tianwei Aug 23 '16 at 11:10
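
For the posting side that the comments ask about, one possible approach is to send the compressed bytes unchanged as the body of the HTTP request. The sketch below is Python 3 and uses only the standard library. The helper name build_gzip_post and the URL http://localhost:8080/upload are hypothetical placeholders, and the application/octet-stream content type is an assumption about what the Java service expects.

```python
import gzip
import urllib.request


def build_gzip_post(url, text):
    """Build a POST request whose body is the gzip-compressed UTF-8 bytes of text."""
    body = gzip.compress(text.encode("utf-8"))
    return urllib.request.Request(
        url,
        data=body,  # raw bytes, never converted to str
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


# Hypothetical endpoint; replace it with the real URL of the Java service.
request = build_gzip_post("http://localhost:8080/upload", "hello\r\nworld")
print(request.get_method(), len(request.data), "bytes")

# Sending is left commented out because it needs a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

On the Java side, the same rule as in the answer applies: read the request body as raw bytes from its input stream rather than through a Reader or a String, then either store those bytes in the BLOB column or pass them to GZIPInputStream.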