
I'm having a strange issue with a response I get from an API. I'm using Apache HttpClient to fetch it. The response headers include the following:

Content-Type=[application/json; charset=utf-16]
Transfer-Encoding=[chunked]
X-Powered-By=[ASP.NET] // Yes, people using ASP.NET

Given these headers, the response body I get looks like this:

笀∀匀琀愀琀甀猀䌀漀搀攀∀㨀㈀ 

So I tried the following.

String body = "笀∀匀琀愀琀甀猀䌀漀搀攀∀㨀㈀";
// Tried in turn with "utf-8", "utf-16", "utf-16le", and every other combination.
String charSetString = "utf-16le";
body = new String(body.getBytes(Charset.forName(charSetString)));
body = body.replaceAll("[^\\x00-\\x7F]", "");

But no luck. I then looked at the first character. The actual response starts with {, so I converted the first character of the response to its numeric code:

(int)body.charAt(0) 

The value is 31488, whereas the ASCII value of { is 123. Since 31488 / 256 = 123, and casting that to char gives {, I did the following:

String encoded = "";
for(int i=0; i< body.length(); i++) {
    encoded += ((char) ((int)body.charAt(i) / 256 ));
}

And it worked. But this is a terrible conversion hack for a single API. What am I missing? What exactly is the charset of the response if I get 31488 for {?
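One reading that fits these numbers: 31488 is 0x7B00, which is 123 (0x7B) with its two bytes swapped. The sketch below assumes the server sends little-endian UTF-16 without a byte-order mark, and relies on Java's "UTF-16" decoder defaulting to big-endian when no BOM is present. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class ByteOrderDemo {
    // Encode text as little-endian UTF-16 (no BOM), then decode it with Java's
    // "UTF-16" charset, which assumes big-endian when it finds no BOM.
    static String garble(String text) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_16LE);
        return new String(wire, StandardCharsets.UTF_16);
    }

    // Undo the damage: recover the original bytes and decode them as UTF-16LE.
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.UTF_16BE);
        return new String(wire, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        String garbled = garble("{\"StatusCode\":2");
        System.out.println((int) garbled.charAt(0)); // 31488, i.e. 0x7B00
        System.out.println(repair(garbled));         // {"StatusCode":2
    }
}
```

Under that assumption, dividing each char by 256 works only for characters below 256; re-decoding the bytes as UTF-16LE would handle the full range.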

Update

My API call code.

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;
import org.apache.http.impl.client.HttpClientBuilder;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpMethod;
import org.springframework.http.ResponseEntity;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.util.SerializationUtils;
import org.springframework.web.client.RestTemplate;


public class HTTPClientManager {
    RestTemplate restTemplate = null;

    public void setup() {

        HttpComponentsClientHttpRequestFactory clientHttpRequestFactory =
                new HttpComponentsClientHttpRequestFactory(HttpClientBuilder.create().build());
        clientHttpRequestFactory.setReadTimeout(5 * 1000);
        clientHttpRequestFactory.setConnectTimeout(5 * 1000);
        restTemplate = new RestTemplate(clientHttpRequestFactory);
    }

    public static void main(String...strings) throws FileNotFoundException, IOException {
        HTTPClientManager ht = new HTTPClientManager();
        ht.setup();
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put(Const.METHOD, "GET");
        properties.put(Const.URL, strings[0]);
        properties.put(Const.CHAR_SET, "UTF-16LE");

        Map<String, Object> ob = ht.getResponse(properties);
        try {
            String res = ob.get(Const.RESPONSE).toString();
            System.out.println("Response ->>>>>>>>> \n " + res);
        }catch(Exception e) {
            e.printStackTrace();
        }

       try (FileOutputStream fos = new FileOutputStream("response")) {
            fos.write(SerializationUtils.serialize(ob));
       }
    }

    public static class Const {
        public static final String REQUEST = "request";
        public static final String URL = "url";
        public static final String CHAR_SET = "charSet";
        public static final String RESPONSE = "response";
        public static final String METHOD = "method";
        public static final String REQUEST_HEADER = "reqHeader";
        public static final String RESPONSE_HEADER = "resHeader";
    }



    public Map<String, Object> getResponse(Map<String, Object> properties) {
        HttpHeaders headers = new HttpHeaders();
        HttpEntity requestEntity = null;
        Map<String, Object> responseReturn = new LinkedHashMap<>();
        HttpMethod method = null;

        if (properties.get(Const.METHOD).toString().equals("GET")) {
            method = HttpMethod.GET;
            requestEntity = new HttpEntity<String>("", headers);
        } else if (properties.get(Const.METHOD).toString().equals("POST")) {
            method = HttpMethod.POST;
            requestEntity = new HttpEntity<String>(properties.get(Const.REQUEST).toString(), headers);
        }else if (properties.get(Const.METHOD).toString().equals("PUT")) {
            method = HttpMethod.PUT;
            requestEntity = new HttpEntity<String>(properties.get(Const.REQUEST).toString(), headers);
        }else if (properties.get(Const.METHOD).toString().equals("DELETE")) {
            method = HttpMethod.DELETE;
            requestEntity = new HttpEntity<String>(properties.get(Const.REQUEST).toString(), headers);
        }
        ResponseEntity<String> response = null;
        try {
            response = restTemplate.exchange(properties.get(Const.URL).toString(), method, requestEntity, String.class);
            String body = response.getBody();
            if(properties.get(Const.CHAR_SET) != null) {
                try {
                body = new String(body.getBytes(Charset.forName(properties.get(Const.CHAR_SET).toString())));
                body = body.replaceAll("[^\\x00-\\x7F]", "");
                }catch(Exception e) {
                    e.printStackTrace();
                }
            }
            responseReturn.put(Const.RESPONSE, body!=null?body:"");
            responseReturn.put(Const.RESPONSE_HEADER, response.getHeaders());
        } catch (org.springframework.web.client.HttpClientErrorException |org.springframework.web.client.HttpServerErrorException  exception) {
            exception.printStackTrace();
        }catch(org.springframework.web.client.ResourceAccessException exception){
            exception.printStackTrace();
        }catch(Exception exception){
            exception.printStackTrace();
        }

        return responseReturn;
    }

}
Pasupathi Rajamanickam
  • Start by reading the description of UTF-16 on Wikipedia to understand what is happening. Also, read this https://stackoverflow.com/q/2241348/18157 – Jim Garrison Jun 23 '18 at 01:06
  • The problem is likely in the way you are using HttpClient. Please [edit] to show that code. You shouldn't get to the point of having a String value unless it is the correct text. String is not a general data type. – Tom Blodget Jun 23 '18 at 23:48
  • @TomBlodget Updated code. – Pasupathi Rajamanickam Jun 24 '18 at 16:05
  • The line `body = new String(body.getBytes(Charset.forName(charSetString)));` is complete nonsense. A byte array can have a character set; a Java `String` instance cannot. It offers a character-based API to the outside. How it stores the characters internally is implementation dependent. It even changed with Java 9 (compact strings). So this line does not convert the encoding; instead it damages it. `ResponseEntity` will take care of the encoding. If the result is not correct, then the HTTP response was invalid in the first place and needs to be fixed server-side. – Codo Jun 24 '18 at 19:00
  • @Codo OK, I can remove the nonsense code. I was just showing what I had tried. You're saying `ResponseEntity` should take care of it, so why didn't it? (If you could name the possible reasons.) Possibly a bug in `ResponseEntity`? I don't agree with `If the result is not correct, then the HTTP response was invalid in the first place and needs to be fixed server-side`, because the Chrome browser shows a nice response string. So does SoapUI, which uses the same HttpClient. – Pasupathi Rajamanickam Jun 24 '18 at 21:28
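A possible way to act on the point in the comments that only bytes have a charset is to decode the raw body yourself, as sketched below. Everything in it is an assumption rather than something confirmed in this thread: the class and method names are hypothetical, and the little-endian fallback is a guess based on the 31488 value in the question. The raw bytes could be obtained, for example, by passing `byte[].class` instead of `String.class` to `restTemplate.exchange`; that part is not shown so the sketch needs only the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BodyDecoder {
    // Decode raw response bytes using the charset declared in Content-Type.
    // For a declared "utf-16", honour a BOM if present; otherwise guess
    // little-endian (an assumption about this server), whereas Java's own
    // "UTF-16" decoder would guess big-endian.
    static String decode(byte[] raw, String declaredCharset) {
        Charset declared = Charset.forName(declaredCharset);
        if (!declared.equals(StandardCharsets.UTF_16)) {
            return new String(raw, declared);
        }
        boolean hasBom = raw.length >= 2
                && ((raw[0] == (byte) 0xFE && raw[1] == (byte) 0xFF)
                 || (raw[0] == (byte) 0xFF && raw[1] == (byte) 0xFE));
        // With a BOM, Java's UTF-16 decoder picks the byte order and drops the BOM.
        return new String(raw, hasBom ? StandardCharsets.UTF_16 : StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] raw = "{\"StatusCode\":2}".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(decode(raw, "utf-16"));
    }
}
```

This may also explain the difference from Chrome: the WHATWG Encoding Standard, which browsers follow, maps the label utf-16 to little-endian, so a browser and Java can disagree on the same BOM-less bytes.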

1 Answer

I think your problem is the wrong assumption that the reply arrives in UTF-16, i.e. your line Content-Type=[application/json; charset=utf-16] is wrong. Try removing the charset part (Content-Type=[application/json]) or setting it to UTF-8 (Content-Type=[application/json; charset=utf-8]) and see what happens. I believe the reply you are getting is {"StatusCode":2. I'm not sure why the reply seems to be missing '}' at the end, but other than that it makes sense.

BTW, I managed to interpret your reply by converting the reply string to a Unicode escape sequence. That gave me: \u7b00\u2200\u5300\u7400\u6100\u7400\u7500\u7300\u4300\u6f00\u6400\u6500\u2200\u3a00\u3200. This suggested that forcing the response to be interpreted as UTF-16 messed up the content. When I changed the sequence to \u007b\u0022\u0053\u0074\u0061\u0074\u0075\u0073\u0043\u006f\u0064\u0065\u0022\u003a\u0032 and converted it back to a String, I got {"StatusCode":2.

BTW, if you're interested in a tool that converts any string to a Unicode escape sequence and vice versa, you can use the MgntUtils open source library (written by me). All I had to do to convert your response string was:

String result = "笀∀匀琀愀琀甀猀䌀漀搀攀∀㨀㈀";
result = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(result);
System.out.println(result);

Here is the link to the article that describes the utilities in the library and where to get it (available both on GitHub and Maven Central).

In the article, look for the paragraph "String Unicode converter" for an explanation of this feature. The library also contains a simple HTTP client feature (not described in the article, but described in its Javadoc).
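For readers who want to reproduce the escape-sequence inspection without an extra dependency, here is a minimal sketch using only the JDK. It is not the MgntUtils implementation; the class and method names are invented for illustration. It also swaps the two bytes of each char, which is the same change made by hand to the sequence above.

```java
final class UnicodeEscapes {
    // Render each UTF-16 code unit as a backslash, a 'u', and four hex digits.
    static String toEscapes(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 6);
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Swap the high and low byte of every char (0x7B00 becomes 0x007B).
    static String swapBytes(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.reverseBytes(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String garbled = swapBytes("{\"StatusCode\":2");
        System.out.println(toEscapes(garbled));  // escape form of the garbled text
        System.out.println(swapBytes(garbled));  // readable text again
    }
}
```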

Michael Gantman
  • I do not have control over the API server side, so I can't change the `content-type`. Regarding `Not sure why the reply is seemingly missing '}'`: it's a pretty big JSON, and I just copied the first few characters. But you got the conversion right. I would love to look at your library. – Pasupathi Rajamanickam Jun 24 '18 at 21:37
  • You don't have control of the server side, correct. But when you send your request to the server, you set the content type that you expect in your request header. That's what I meant for you to change. I believe that your server responds in UTF-8, but you send the request asking to receive and interpret the response as UTF-16, and I believe that this is your problem. Change your request header as I recommended in my answer and see if it helps. – Michael Gantman Jun 24 '18 at 21:41
  • Oh, sorry, in your code you have the line `properties.put(Const.CHAR_SET, "UTF-16LE");` so either remove it or change it to "UTF-8". – Michael Gantman Jun 24 '18 at 21:43
  • I think my server API is super messed up. I tried what you suggested by sending `accept: application/json; charset=utf-8`, but the server again sent the response header `Content-Type=[application/json; charset=utf-16]`. The string I posted was produced without the line `properties.put(Const.CHAR_SET, "UTF-16LE");`. – Pasupathi Rajamanickam Jun 24 '18 at 21:44
  • I agree with your point `I believe that your server side responds in UTF-8, but you send the request telling to receive and interpret your response as UTF-16`: the server says it's UTF-16 but sends UTF-8. But I'm wondering how this is handled in the Chrome browser or SoapUI, which uses the same HttpClient. – Pasupathi Rajamanickam Jun 24 '18 at 21:45
  • Well, you can try the HttpClient from my library... :). But seriously, download and install the ARC app (Advanced REST client) in your Chrome browser; just google "ARC" or "Advanced REST client". With it you can send HTTP requests to your server and easily change the headers as you like. You will be able to play with your content type and find out which one is correct. – Michael Gantman Jun 24 '18 at 21:50
  • It's working fine with `Advanced REST client`, but I'm building an API test framework, so I need to handle the encoding myself. :( – Pasupathi Rajamanickam Jun 24 '18 at 22:24
  • I did not propose ARC as a solution but rather as a testing and diagnostic tool. If it works with ARC, then look at what the "Content-Type" request header contains; that is what you need to send in your code. – Michael Gantman Jun 25 '18 at 08:18