3

I am facing a problem similar to "How to Force a jar to uses(or the jvm runs in) utf-8 instead of the system's default encoding". I have a server and a client Java application. If I run both of them from Eclipse, everything works fine. If I build jars, a String exchanged between them gets corrupted (wrong encoding).

If I run both applications with the -Dfile.encoding=utf-8 JVM parameter, it works. But since the linked question says this is not the best solution (at the very least it requires launching the jar from a .bat file), I tried to solve the issue by specifying the encoding for the BufferedReader. That fails, and with a jar it is difficult to debug.

This code sends a request and reads a one-line reply in JSON format. The reply is confirmed to be UTF-8 encoded.

public static String sendRequest (String request) {
    if (request == null) return null;

    try {
        URL url = new URL(request);
        HttpsURLConnection con = (HttpsURLConnection)url.openConnection();
        BufferedReader inReader = new BufferedReader(new InputStreamReader(con.getInputStream(), Charset.forName("UTF-8")));
        String line = inReader.readLine();
        inReader.close();
        return line;
    } catch (Exception e) {
        e.printStackTrace(System.err);
    }

    return null;
}

This is what the line looks like:

{"response":[{"uid":123456,"first_name":"Имя","last_name":"Фамилия"}]}

Then I prepare it for use in Gson.fromJson():

int beginIndex = reply.indexOf('[');
int endIndex = reply.indexOf(']');
reply = reply.substring(beginIndex + 1, endIndex);
SocialPerson vkPerson = new Gson().fromJson(reply, SocialPerson.class);

After that, the String is sent to the server using Netty's ChannelBuffer, generated with ChannelBuffers.wrappedBuffer() and NettyUtils.writeStrings().

When I debug the Client in Eclipse with the Server running from a jar, Eclipse shows that the string looks valid right up to the point where it is handed to the framework for delivery.

When I debug the Server with the Client running from a jar, the string already looks like rubbish as soon as it is received.

At server side

    private final String username;
    private final String password;

    public SimpleCredentials(ChannelBuffer buffer)
    {
        this.username = NettyUtils.readString(buffer);
        this.password = NettyUtils.readString(buffer);
    }

Where do you think the problem could be? Sorry, I cannot post all the code here.

UPD: username is generated from firstName and lastName:

ChannelBuffer buffer = ChannelBuffers.wrappedBuffer(opCode, NettyUtils.writeStrings(userId, userName, refKey));
Nikolay Kuznetsov

2 Answers

3

When you read from the network stream, you need to re-encode your strings manually if the automatic conversion fails. It is possible that the library you are using ignores the content encoding, or that the encoding is missing from the HTTP response.

Somewhere in your code there will be a byte array, which you can decode with the String constructor:

String xxx = new String(bytes, "utf-8");

If you already receive the String with the wrong encoding, you can try re-encoding it like this:

String rightEncoded = new String(wrongEncodedString.getBytes("Cp1252"), "utf-8");
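
To make the re-encoding trick above concrete, here is a minimal sketch. It is not taken from the asker's code: the class and method names are made up for illustration, and it assumes the wrong decoding step used ISO-8859-1 rather than Cp1252. ISO-8859-1 assigns a character to every byte value, so the round trip is lossless; Cp1252 leaves a few byte values unassigned, so the same trick may not recover every input there.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from the question.
class MojibakeDemo {
    // Simulates a receiver that decodes UTF-8 bytes with the wrong single-byte charset.
    static String decodeWrongly(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes for the three Cyrillic letters of the first name in the
        // question, so the result does not depend on the source file's encoding.
        String original = "\u0418\u043C\u044F";
        String garbled = decodeWrongly(original);
        System.out.println(garbled.length());                 // 6: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

This only works when you know which charset was wrongly applied, which is why fixing the encoding at the point where the bytes are written and read is usually the more robust route.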
rekire
  • Well, I already get it as Strings from framework. Should I get bytes from String and construct a new String with specified encoding? Btw, the downvote was not from me :) – Nikolay Kuznetsov Nov 28 '12 at 06:37
  • Can I get to know the encoding of the String I get? I am not sure it is cp1252 – Nikolay Kuznetsov Nov 28 '12 at 06:46
  • After some quick googling I found that this encoding may be the default encoding. If this doesn't work, try some others such as us-ascii. – rekire Nov 28 '12 at 06:47
  • [Default encoding in Java](http://stackoverflow.com/questions/1749064/how-to-find-default-charset-encoding-in-java) this sounds as question worth studying for me – Nikolay Kuznetsov Nov 28 '12 at 06:55
  • Another possible solution is to force utf-8 as the default encoding: `System.setProperty("file.encoding", "utf-8");` – rekire Nov 28 '12 at 06:56
  • 2
    Setting a system property programmatically will affect all code running within the same JVM, which is hazardous, especially when discussing such a low-level system property. – Isaac Nov 28 '12 at 07:01
  • `System.out.println(name.getBytes().length + " : " + Arrays.toString(name.getBytes()));` gives me same result byte to byte before sending and at reception. – Nikolay Kuznetsov Nov 28 '12 at 10:18
  • That is perfectly normal; you need to set the code pages. See my line above again. My variable `wrongEncodedString` is `name` in your case. – rekire Nov 28 '12 at 10:34
  • I have tried that line (with cp1251) in my case. I applied it on the server and it worked when both applications ran as jars. Then I switched the client to Eclipse and it started to fail, because in that situation the client sends UTF-8 and the server tries to decode it as if it were cp1251. – Nikolay Kuznetsov Nov 28 '12 at 11:49
  • @Isaac, regarding "Setting a system property programmatically will affect all code running within the same JVM, which is hazardous": on my system that does not happen. I use `System.getProperty("file.encoding")` and it gives me cp1252 at startup, even though it was previously set to utf-8 by another instance of the application. Who resets file.encoding back to its original value? – Nikolay Kuznetsov Nov 28 '12 at 15:34
1

You shouldn't be using the file.encoding system property.

The best way to avoid such encoding issues is to never assume anything about default platform encodings and always provide an encoding when constructing readers or when converting bytes to Strings and vice versa.
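
As a rough illustration of that rule, the sketch below names the charset at every point where bytes become characters or the other way round. The class and method names are hypothetical and only exist for this example; the charset-aware overloads of `String.getBytes`, the `String` constructor and `InputStreamReader` are standard JDK APIs.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing explicit charsets at each conversion boundary.
class ExplicitCharsetDemo {
    // String -> bytes: never call getBytes() without a charset.
    static byte[] toWire(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // bytes -> String: never call new String(bytes) without a charset.
    static String fromWire(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Stream -> characters: pass the charset to InputStreamReader.
    static String readFirstLine(byte[] utf8Bytes) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = "\u0418\u043C\u044F"; // Cyrillic first name from the question
        System.out.println(fromWire(toWire(name)).equals(name));               // true
        System.out.println(readFirstLine(toWire(name + "\nrest")).equals(name)); // true
    }
}
```

With every conversion pinned to UTF-8, the result no longer depends on whether the JVM was started from Eclipse or from a jar.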

Your sendRequest method seems to be OK with respect to handling encodings: it reads characters from the input, explicitly mentioning that it expects the stream to be encoded in UTF-8.

However, we can't see the other end of the client/server sequence. Quoting you:

After that the String is being sent to server using Netty's ChannelBuffer generated with ChannelBuffers.wrappedBuffer() and NettyUtils.writeStrings()

You also mentioned that you can't attach the entire code here, which is understandable; therefore, I'd advise that you look into how exactly you're sending those strings out, and whether or not you're explicitly specifying an encoding while doing so.

EDIT as per OP's update: I'm sorry that I am not familiar with Netty, but I'll take a shot here anyway. Doesn't NettyUtils.writeStrings(), or any of the code that calls it, accept a character encoding? I can't find the JavaDoc for any NettyUtils online. Work with me here. :-)
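
To show what "explicitly specifying an encoding" could look like for this kind of string exchange, here is a sketch. It is not the actual NettyUtils code, which I have not seen: the method names only mirror the ones in the question, the length-prefixed layout is an assumption, and a plain `java.nio.ByteBuffer` stands in for Netty's ChannelBuffer so the example is self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the framework helper; layout and names are assumptions.
class LengthPrefixedStrings {
    // Writes each string as a 4-byte length followed by its UTF-8 bytes.
    static ByteBuffer writeStrings(String... values) {
        byte[][] encoded = new byte[values.length][];
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8); // explicit charset
            total += Integer.BYTES + encoded[i].length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(total);
        for (byte[] bytes : encoded) {
            buffer.putInt(bytes.length);
            buffer.put(bytes);
        }
        buffer.flip(); // switch the buffer from writing to reading
        return buffer;
    }

    // Reads one string written by writeStrings, decoding with the same charset.
    static String readString(ByteBuffer buffer) {
        int length = buffer.getInt();
        byte[] bytes = new byte[length];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8); // explicit charset
    }

    public static void main(String[] args) {
        String userName = "\u0418\u043C\u044F \u0424\u0430\u043C\u0438\u043B\u0438\u044F";
        ByteBuffer buffer = writeStrings("123456", userName, "refKey");
        System.out.println(readString(buffer));                  // 123456
        System.out.println(readString(buffer).equals(userName)); // true
        System.out.println(readString(buffer));                  // refKey
    }
}
```

The point is that the writer and the reader agree on one charset in code, so neither side falls back to the platform default.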

Isaac
  • [NettyUtils.java](https://github.com/menacher/java-game-server/blob/master/jetserver/src/main/java/org/menacheri/jetserver/util/NettyUtils.java) – Nikolay Kuznetsov Nov 28 '12 at 06:51
  • It has turned out not to be a Netty thing but a class written by the author of the framework, so I might need to contact him as well – Nikolay Kuznetsov Nov 28 '12 at 06:53
  • I agree. I looked at the code for `NettyUtils` and it ends up calling `StringEncoderWrapper` to actually do the serialization. I'm sure that it (or some code that ends up being called *by* it) ends up making assumptions about the default platform encoding. – Isaac Nov 28 '12 at 06:58