11

My application makes an HTTP request to an API service, and that service returns a gzipped response. How can I make sure that the response really is in gzip format? I'm confused about why I didn't have to decompress it after making the request.

Below is my code:

public static String streamToString(InputStream stream) {
    BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
    StringBuilder sb = new StringBuilder();
    String line;

    try {
        while ((line = reader.readLine()) != null) {
            sb.append(line).append("\n");
        }
    } catch (IOException e) {
        logger.error("Error while streaming to string: {}", e);
    } finally {
        try { stream.close(); } catch (IOException e) { }
    }

    return sb.toString();
}

public static String getResultFromHttpRequest(String url) throws IOException { // add retries, catch all exceptions
    HttpClient httpclient = new DefaultHttpClient();
    HttpGet httpGet;
    HttpResponse httpResponse;
    InputStream stream;

    try {
        httpGet = new HttpGet(url);
        httpGet.setHeader("Content-Encoding", "gzip, deflate");
        httpResponse = httpclient.execute(httpGet);
        logger.info(httpResponse.getEntity().getContentEncoding());
        logger.info(httpResponse.getEntity().getContent());
        if (httpResponse.getStatusLine().getStatusCode() == 200) {
            stream = httpResponse.getEntity().getContent();
            return streamToString(stream);
        }
    } catch (IllegalStateException e) {
        logger.error("Error while trying to access: " + url, e);
    }

    return "";
}

Maybe it is decompressing it automatically, but I would like to see some indication of that at least.

iCodeLikeImDrunk
  • possible duplicate of [Does Apache Commons HttpClient support GZIP?](http://stackoverflow.com/questions/2777076/does-apache-commons-httpclient-support-gzip) – jtahlborn Jan 31 '14 at 14:55
  • @jtahlborn kinda but not exactly. – iCodeLikeImDrunk Jan 31 '14 at 14:59
  • If the environment allows you to do so, a very quick check could have been capturing the traffic (Wireshark, Tcpdump...) between the app and the server. As HTTP is a text based protocol, if the response has the right header and the body is composed mostly of non-readable characters, it looks like the response is compressed. – Francisco Carriedo Scher Apr 21 '16 at 20:53

4 Answers

18

I'm late, but this answer may help anyone facing the same issue. By default, the response content is decompressed automatically. To see the compressed response, you have to disable the automatic decompression and handle it yourself, using the following code:

// Assumes Apache HttpClient 4.3+ (org.apache.http.*) and java.nio.charset.StandardCharsets.
CloseableHttpClient client = HttpClients.custom()
    .disableContentCompression()   // keep the raw, still-compressed entity
    .build();

HttpGet request = new HttpGet(urlString);
request.setHeader(HttpHeaders.ACCEPT_ENCODING, "gzip");

CloseableHttpResponse response = client.execute(request);
HttpEntity entity = response.getEntity();
Header contentEncodingHeader = entity.getContentEncoding();

// Content-Encoding is still visible here because automatic decompression is off.
if (contentEncodingHeader != null) {
    for (HeaderElement encoding : contentEncodingHeader.getElements()) {
        if (encoding.getName().equalsIgnoreCase("gzip")) {
            entity = new GzipDecompressingEntity(entity);
            break;
        }
    }
}

String output = EntityUtils.toString(entity, StandardCharsets.UTF_8);
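
Once automatic decompression is disabled, another way to confirm that the body really is gzip is to look at the raw bytes: a gzip stream starts with the magic number 0x1f 0x8b. The sketch below uses only the JDK; the class name GzipCheck and its helpers are made up for illustration, and the gzip() helper only exists to produce sample input. With HttpClient, the raw bytes could be obtained with EntityUtils.toByteArray(entity) before wrapping the entity.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of HttpClient.
final class GzipCheck {

    /** Returns true if the data starts with the gzip magic number 0x1f 0x8b. */
    static boolean isGzip(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    /** Sample-data helper: gzip-compresses a string as UTF-8. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        String json = "{\"ok\":true}";
        // Compressed bytes carry the magic number.
        System.out.println(isGzip(gzip(json)));                            // true
        // Plain JSON starts with '{' (0x7b), so the check fails.
        System.out.println(isGzip(json.getBytes(StandardCharsets.UTF_8))); // false
    }
}
```

This only inspects the first two bytes, so it is a quick sanity check rather than a full validation of the stream.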
Alex Ciminian
Prabhat Kumar
2

I think you want to use DecompressingHttpClient, or the newer HttpClientBuilder, which adds the Accept-Encoding header by default as long as you don't call disableContentCompression. I don't think DefaultHttpClient supports compression by default. The client needs to send an Accept-Encoding header; Content-Encoding comes from the server response.
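
To illustrate the difference between the two headers, here is a rough, dependency-free sketch. It deliberately uses the JDK's built-in HttpServer and HttpURLConnection instead of Apache HttpClient, so it is not how HttpClient behaves internally; the class EncodingHeaderDemo, its methods, and the toy server are all assumptions made up for this example. The server compresses only when the request carries Accept-Encoding: gzip, which is how a typical server negotiates compression.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; requires Java 9+ for InputStream.readAllBytes().
final class EncodingHeaderDemo {

    /** Starts a throwaway local server that gzips only when the client asks for it. */
    static HttpServer startServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] body = "{\"ok\":true}".getBytes(StandardCharsets.UTF_8);
                String accept = exchange.getRequestHeaders().getFirst("Accept-Encoding");
                if (accept != null && accept.toLowerCase(Locale.ROOT).contains("gzip")) {
                    ByteArrayOutputStream buf = new ByteArrayOutputStream();
                    try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
                        gz.write(body);
                    }
                    body = buf.toByteArray();
                    exchange.getResponseHeaders().set("Content-Encoding", "gzip");
                }
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sends a GET with the given request header set to "gzip"; returns the response Content-Encoding. */
    static String contentEncodingFor(int port, String requestHeaderName) {
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create("http://127.0.0.1:" + port + "/").toURL().openConnection();
            conn.setRequestProperty(requestHeaderName, "gzip");
            try (InputStream in = conn.getInputStream()) {
                in.readAllBytes(); // drain the body; HttpURLConnection does not decompress it
            }
            return conn.getContentEncoding(); // null when the header is absent
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs both requests against a fresh server and returns the two Content-Encoding values. */
    static String[] run() {
        HttpServer server = startServer();
        try {
            int port = server.getAddress().getPort();
            return new String[] {
                contentEncodingFor(port, "Accept-Encoding"),  // correct request header
                contentEncodingFor(port, "Content-Encoding")  // the header used in the question
            };
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("With Accept-Encoding:  " + result[0]);
        System.out.println("With Content-Encoding: " + result[1]);
    }
}
```

In this toy setup, only the request that sends Accept-Encoding gets a compressed response; a real server may behave differently, for example by compressing unconditionally.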

Elliott Frisch
2
httpResponse.getEntity().getContentEncoding()

You can find out whether or not an entity requires decompression by examining its Content-Encoding header. This header will be rewritten (or removed) in case of automatic content decompression.
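
When automatic decompression is off, the header check described above can drive the decoding by hand. This is a minimal JDK-only sketch, assuming the raw body bytes and the Content-Encoding value have already been read from the response; the class ContentDecoding and its methods are hypothetical names, and only gzip is handled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; requires Java 9+ for InputStream.readAllBytes().
final class ContentDecoding {

    /** True if the Content-Encoding value lists gzip; null or empty means identity (no encoding). */
    static boolean isGzipEncoded(String contentEncoding) {
        if (contentEncoding == null) {
            return false;
        }
        for (String token : contentEncoding.split(",")) {
            String name = token.trim();
            if (name.equalsIgnoreCase("gzip") || name.equalsIgnoreCase("x-gzip")) {
                return true;
            }
        }
        return false;
    }

    /** Decodes the raw body as UTF-8 text, gunzipping first if the header says so. */
    static String readBody(byte[] raw, String contentEncoding) {
        try (InputStream in = isGzipEncoded(contentEncoding)
                ? new GZIPInputStream(new ByteArrayInputStream(raw))
                : new ByteArrayInputStream(raw)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sample-data helper: gzip-compresses a string as UTF-8. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        String json = "{\"ok\":true}";
        // Compressed body with a gzip header value.
        System.out.println(readBody(gzip(json), "gzip"));
        // Uncompressed body with no Content-Encoding header.
        System.out.println(readBody(json.getBytes(StandardCharsets.UTF_8), null));
    }
}
```

Both calls in main should print the same JSON text, since the second body was never compressed.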

ok2c
  • when i get the response, i did a sysout for the contentencoding, but it is coming out as empty string, does that mean it was automatic? – iCodeLikeImDrunk Jan 31 '14 at 16:23
  • @yaojiang: absence of explicit 'Content-Encoding' header implies identity encoding, that is, no encoding – ok2c Jan 31 '14 at 16:28
  • when i enter the api url on firefox, i do see that the response headers contain "content-encoding = gzip", "content-length = 278", "content-type = application/json, charset=utf-8", "vary = Accept-Encoding" – iCodeLikeImDrunk Jan 31 '14 at 16:31
  • @yaojiang: So what? You can disable automatic content decompression or execute with wire logging on to see raw HTTP message composition – ok2c Jan 31 '14 at 16:37
  • @oleg so what, what? What he means is that the response contains a content-encoding set to gzip but the http response object does not display it. I'm running into that issue as well. – Edy Bourne Aug 06 '14 at 19:29
1

Since 4.1, Apache HttpClient handles request and response compression. You can check the example in another answer here.

If you still want to check whether the response was compressed, you can print the class of the entity.

HttpResponse httpResponse = client.execute(request);
HttpEntity httpEntity = httpResponse.getEntity();
System.out.println(httpEntity.getClass().getName());

For gzip the output will be org.apache.http.client.entity.GzipDecompressingEntity, and for deflate it is org.apache.http.client.entity.DecompressingEntity.

Garry