I've got a problem that I'm finding difficult to crack.

I have a method that sends a String request to a URL and reads back the response. This normally works fine, but now I'm receiving a response containing UTF-8 encoded characters, which I'm not able to read properly.

Request:

<Request>
  <requestId>1071977</requestId>
  <datas>
    <parameter>
      <id>CATEGORY</id>
      <value>ALL</value>
    </parameter>
  </datas>
</Request>

Response (the one I'm having trouble with):

<Response>
   <ResponseId>1071977</ResponseId>
      <datas>
         <parameter>
            <id>CATEGORY</id>
            <value>ALL</value>
         </parameter>
         <parameter>
            <id>MSG</id>
            <value>رنت ما</value>
         </parameter>
      </datas>
</Response>

Method:

public static String Post(String urlString, String request) throws Exception {
        String response = null;
        OutputStreamWriter out = null;
        InputStream in = null;
        URL url = null;
        URLConnection connection = null;
        StringBuilder sb = null;
        try {
            url = new URL(urlString);
            connection = url.openConnection();
            connection.setReadTimeout(60000);
            connection.setDoOutput(true);
            connection.setDoInput(true);
            out = new OutputStreamWriter(connection.getOutputStream());
            out.write(request);
            out.flush();
            out.close();
            out = null;
            int i = -1;
            in = connection.getInputStream();
            sb = new StringBuilder();
            while ((i = in.read()) != -1) {
                sb.append((char) i);
            }
            response = sb.toString();
            in.close();
            in = null;
        } finally {
            sb = null;
            connection = null;
            url = null;
        }
        return response;
}

I know I can use something like

ByteBuffer bb = StandardCharsets.UTF_8.encode(utfstring);
String normalString = StandardCharsets.UTF_8.decode(bb).toString();

to read UTF-8 strings in Java, but I'm not sure how to do the same while reading the response from the URLConnection class. I would appreciate some help. Thanks.

seenukarthi
Phillip
  • I hope you don't think your example with `StandardCharsets.UTF_8.encode/decode` changes the encoding of a `String` because it doesn't. It's basically the same as doing `x += 1; x -=1;`, i.e. a no op. – Kayaman Sep 05 '19 at 07:19
  • How about wrapping the reading in a `new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"))`? That should solve your problem. – ItFreak Sep 05 '19 at 07:21
  • UTF-8 is a character encoding, not a character set. So, "UTF-8 character" does not make sense. That thinking led you astray. See @f1sh's [answer](https://stackoverflow.com/a/57800277/2226988). – Tom Blodget Sep 05 '19 at 22:33

1 Answer

By read()ing every byte individually and treating it as a char, you break multi-byte characters.

I would read the whole InputStream into a byte[] (here are examples of how to do that) and then create the String from it using

new String(yourByteArray, "UTF-8");
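
As a rough sketch of that approach (not the only way to do it): the hypothetical helper `readUtf8` below drains the stream into a `ByteArrayOutputStream` and decodes the result once. The class and method names are made up for illustration, and a `ByteArrayInputStream` stands in for `connection.getInputStream()`. It uses `StandardCharsets.UTF_8` instead of the `"UTF-8"` string, which avoids the checked `UnsupportedEncodingException`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class Utf8ResponseReader {

    // Hypothetical helper: reads the stream to its end, then decodes
    // all collected bytes as UTF-8 in a single step.
    public static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            bytes.write(buffer, 0, n);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for connection.getInputStream(): the MSG value from the
        // response, written with Unicode escapes and encoded as UTF-8.
        String expected = "<value>\u0631\u0646\u062A \u0645\u0627</value>";
        byte[] wire = expected.getBytes(StandardCharsets.UTF_8);

        String decoded = readUtf8(new ByteArrayInputStream(wire));
        System.out.println(decoded.equals(expected));
    }
}
```

In the `Post` method from the question, this would replace the `StringBuilder` loop, assuming the server really does send the body as UTF-8.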

Alternatively, you can keep your loop, but instead of appending chars, collect the bytes into your own byte[] and decode it once at the end.
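
To illustrate the difference, here is a minimal sketch that puts the loop from the question next to a variant that collects bytes first. The class and method names are hypothetical, and a `ByteArrayOutputStream` is used as the growable byte[] rather than managing an array by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ByteLoopDemo {

    // The loop from the question: every byte becomes its own char,
    // so a multi-byte UTF-8 sequence is split into several wrong chars.
    public static String readCastingToChar(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int i;
        while ((i = in.read()) != -1) {
            sb.append((char) i);
        }
        return sb.toString();
    }

    // Same byte-at-a-time loop, but the bytes are only collected here
    // and decoded as UTF-8 once the whole stream has been read.
    public static String readCollectingBytes(InputStream in) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i;
        while ((i = in.read()) != -1) {
            bytes.write(i);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // One Arabic letter (U+0631), which UTF-8 encodes as two bytes.
        byte[] wire = "\u0631".getBytes(StandardCharsets.UTF_8);

        String broken = readCastingToChar(new ByteArrayInputStream(wire));
        String fixed = readCollectingBytes(new ByteArrayInputStream(wire));

        System.out.println("cast to char: " + broken.length() + " chars");
        System.out.println("decoded once: " + fixed.length() + " chars");
    }
}
```

Reading one byte per call is slow on an unbuffered stream, so wrapping the input in a `BufferedInputStream` or reading into a buffer is usually worth it.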

f1sh