17

I have configured the Apache HttpClient like so:

    HttpProtocolParams.setContentCharset(httpParameters, "UTF-8");
    HttpProtocolParams.setHttpElementCharset(httpParameters, "UTF-8");

I also include the HTTP header "Content-Type: application/json; charset=UTF-8" in all HTTP POST and PUT requests.

I am trying to send HTTP POST/PUT requests with a JSON body that contains special characters (e.g. Chinese characters entered via the Google Pinyin keyboard, symbols, etc.). The characters appear as gibberish in the logs, but I think this is because DDMS does not support UTF-8, as described in this issue.

The problem is that when the server receives the request, it sometimes doesn't see the characters at all (especially the Chinese characters), or they become meaningless garbage when we retrieve them through a GET request.

I also tried putting 250 non-ASCII characters in a single field, because that particular field should be able to take up to 250 characters. However, it fails validation on the server side, which claims that the 250-character limit has been exceeded. 250 ASCII characters work just fine.
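
One possible reason for the length failure, sketched with JDK classes only: if the server counts bytes, or decodes a UTF-8 body with a single-byte charset, then 250 Chinese characters look like 750. Whether the server in question does either is an assumption and is not confirmed anywhere in this thread; `LengthCheckDemo` and its helper names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class LengthCheckDemo {
    // Hypothetical helper: builds a string of n copies of c.
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Number of bytes the string occupies on the wire when sent as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Character count seen by a receiver that (wrongly) decodes the
    // UTF-8 bytes as ISO-8859-1, where every byte becomes one character.
    static int lengthIfMisreadAsLatin1(String s) {
        byte[] wire = s.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1).length();
    }

    public static void main(String[] args) {
        String ascii = repeat('a', 250);
        String chinese = repeat('\u4e2d', 250); // U+4E2D, a common CJK character

        System.out.println(ascii.length() + " chars, " + utf8ByteLength(ascii) + " bytes");
        System.out.println(chinese.length() + " chars, " + utf8ByteLength(chinese) + " bytes");
        System.out.println("misread as Latin-1: " + lengthIfMisreadAsLatin1(chinese) + " chars");
    }
}
```

Both strings are 250 characters long on the client, but only the ASCII one stays at 250 under a byte-based or mis-decoded length check.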

The server dudes claim that they support UTF-8. They even tried simulating a post request that contains Chinese characters, and the data was received by the server just fine. However, the guy (a Chinese guy) is using a Windows computer with the Chinese language pack installed (I think, because he can type Chinese characters on his keyboard).

I'm guessing that the charsets being used by the Android client and the server (made by Chinese guys btw) are not aligned. But I do not know which one is at fault since the server dudes claim that they support UTF-8, and our rest client is configured to support UTF-8.

This got me wondering what charset Android uses by default for all text input, and whether it can be changed to a different one programmatically. I tried to find resources on how to do this for input widgets, but I did not find anything useful.

Is there a way to set the charset for all input widgets in Android? Or maybe I missed something in the rest client configuration? Or maybe, just maybe, the server dudes are not using UTF-8 at their servers and used Windows charsets instead?

Makoto
avendael
  • Regarding the default character set for the [Java String class](http://docs.oracle.com/javase/7/docs/api/java/lang/String.html): "A String represents a string in the UTF-16 format ...". Also discussed [here](http://stackoverflow.com/questions/88838/how-to-convert-strings-to-and-from-utf8-byte-arrays-in-java) and [here](http://stackoverflow.com/questions/9075603/convert-utf-16-unicode-characters-to-utf-8-in-java). – JJD Aug 14 '12 at 14:38

4 Answers

49

Apparently, I forgot to set the StringEntity's charset to UTF-8. These lines did the trick:

    httpPut.setEntity(new StringEntity(body, HTTP.UTF_8));
    httpPost.setEntity(new StringEntity(body, HTTP.UTF_8));

So there are at least two places where the charset has to be set in the Android client when sending an HTTP POST with non-ASCII characters:

  1. The REST client itself
  2. The StringEntity
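
To illustrate why the entity's charset matters independently of the client parameters, here is a minimal JDK-only sketch. It assumes that the legacy `StringEntity(String)` constructor falls back to ISO-8859-1 when no charset is given (my understanding of the HttpCore version bundled with Android, not something verified here). `EntityCharsetDemo` and `encode` are hypothetical names that only mimic the string-to-bytes step an entity performs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EntityCharsetDemo {
    // Hypothetical stand-in for what an entity does internally:
    // turn the request body into bytes using a given charset.
    static byte[] encode(String body, Charset charset) {
        return body.getBytes(charset);
    }

    public static void main(String[] args) {
        // JSON body containing two Chinese characters (U+4E2D U+6587).
        String body = "{\"name\":\"\u4e2d\u6587\"}";

        // ISO-8859-1 cannot represent the Chinese characters,
        // so each one is replaced by a single '?' byte.
        byte[] latin1 = encode(body, StandardCharsets.ISO_8859_1);
        // UTF-8 encodes each of them as three bytes.
        byte[] utf8 = encode(body, StandardCharsets.UTF_8);

        // What a server that decodes the body as UTF-8 would see:
        System.out.println(new String(latin1, StandardCharsets.UTF_8)); // {"name":"??"}
        System.out.println(new String(utf8, StandardCharsets.UTF_8));   // original body
        System.out.println(latin1.length + " vs " + utf8.length + " bytes");
    }
}
```

Under that assumption, the "Content-Type: ...; charset=UTF-8" header alone only describes the body; the bytes themselves still depend on the charset the entity was created with.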

UPDATE: As Samuel pointed out in the comments, the modern way to do it is to use a ContentType, like so:

    final StringEntity se = new StringEntity(body, ContentType.APPLICATION_JSON);
    httpPut.setEntity(se);
avendael
  • It isn't clear what httpPut is and how it is connected to the post. – ılǝ Nov 11 '12 at 01:17
  • httpPut and httpPost are HttpPut and HttpPost instances from the Apache HTTP library of Android. http://developer.android.com/reference/org/apache/http/client/methods/package-summary.html – avendael Nov 11 '12 at 17:52
  • Let me rephrase my question. Are you using both HttpPut and HttpPost in your method? If yes - why, it seems redundant. – ılǝ Nov 12 '12 at 06:11
  • No, I'm not. I'm simply illustrating that HttpPut and HttpPost objects should explicitly set the encoding of their string entities. Generally, all HTTP methods that send data to the server (PUT, PATCH, POST, etc.) should explicitly set their StringEntity encoding to avoid the problems I experienced in my original question. – avendael Nov 17 '12 at 04:08
  • Gotcha. Thanks! I thought that for some reason you were using both methods for one operation and was confused. – ılǝ Nov 17 '12 at 10:45
  • Your solution is absolutely correct and saved me a lot of time, but it is now deprecated. I suggest a replacement: set the ContentType instead, i.e. final StringEntity se = new StringEntity(query, ContentType.APPLICATION_JSON); – Samuel EUSTACHI Jan 02 '15 at 13:56
  • Thanks! You're right. This was written a long time ago. I'll edit my answer to reflect your suggestion. – avendael Jan 02 '15 at 16:55
  • After some days of suffering, I found your answer! ContentType.APPLICATION_JSON works perfectly for me! Thanks a lot @avendael – Eduardo Fabricio Jan 10 '17 at 19:09
13

I know this post is a bit old but nevertheless here is a solution:

Here is my code for posting UTF-8 strings (it doesn't matter whether they are XML, SOAP, or JSON) to a server. I tried it with Cyrillic, hash values, and some other special characters, and it works like a charm. It is a compilation of many solutions I found on the forums.

    // Classes below come from the legacy Apache HttpClient bundled with Android
    // (org.apache.http.*), plus java.io.* and java.util.*.

    // Use UTF-8 for the request content and for HTTP elements such as headers.
    HttpParams httpParameters = new BasicHttpParams();
    HttpProtocolParams.setContentCharset(httpParameters, HTTP.UTF_8);
    HttpProtocolParams.setHttpElementCharset(httpParameters, HTTP.UTF_8);

    HttpClient client = new DefaultHttpClient(httpParameters);
    client.getParams().setParameter("http.protocol.version", HttpVersion.HTTP_1_1);
    client.getParams().setParameter("http.socket.timeout", Integer.valueOf(2000));
    client.getParams().setParameter("http.protocol.content-charset", HTTP.UTF_8);
    httpParameters.setBooleanParameter("http.protocol.expect-continue", false);
    HttpPost request = new HttpPost("http://www.server.com/some_script.php?sid=" + String.valueOf(Math.random()));
    request.getParams().setParameter("http.socket.timeout", Integer.valueOf(5000));

    List<NameValuePair> postParameters = new ArrayList<NameValuePair>();
    // you get this later in php with $_POST['value_name']
    postParameters.add(new BasicNameValuePair("value_name", "value_val"));

    // Encode the form body explicitly as UTF-8.
    UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(postParameters, HTTP.UTF_8);
    request.setEntity(formEntity);
    HttpResponse response = client.execute(request);

    // Decode the response with an explicit charset, assuming the server replies in UTF-8.
    BufferedReader in = new BufferedReader(
            new InputStreamReader(response.getEntity().getContent(), HTTP.UTF_8));
    StringBuilder sb = new StringBuilder();
    String line;
    String lineSeparator = System.getProperty("line.separator");
    while ((line = in.readLine()) != null) {
        sb.append(line);
        sb.append(lineSeparator);
    }
    in.close();
    String result = sb.toString();

I hope that someone will find this code helpful. :)
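
The response side deserves the same care as the request side: an `InputStreamReader` created without a charset uses the platform default. As far as I know that default is UTF-8 on Android, so this mostly bites when the same code runs on a desktop JVM. Below is a JDK-only sketch of reading a stream with an explicit charset; `ResponseDecodingDemo` and `readAll` are hypothetical names, and a `ByteArrayInputStream` stands in for the real response stream.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResponseDecodingDemo {
    // Reads the whole stream line by line, decoding bytes with the given charset.
    // Every line is terminated with '\n' in the result.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Cyrillic sample text, written with escapes to stay source-encoding neutral.
        String text = "\u043f\u0440\u0438\u0432\u0435\u0442";
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the text survives the round trip.
        System.out.println(readAll(new ByteArrayInputStream(wire), StandardCharsets.UTF_8));
        // Mismatched charset: the same bytes decode to garbage.
        System.out.println(readAll(new ByteArrayInputStream(wire), StandardCharsets.ISO_8859_1));
    }
}
```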

JJD
5

You should set the charset of your string entity to UTF-8:

    StringEntity stringEntity = new StringEntity(urlParameters, HTTP.UTF_8);
savepopulation
2

You can eliminate the server as the problem by using curl to send the same data. If it works with curl, use --trace to check the output.

Ensure you are sending the content body as bytes. Compare the HTTP request from Android with the output from the successful curl request.
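
To make that comparison easier on the Android side, the body can be logged as hex before it is sent. This is a minimal sketch using only the JDK; `BodyHexDump` and `toHex` are hypothetical names, and the sample body is made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class BodyHexDump {
    // Formats bytes as lowercase two-digit hex values separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 3);
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample JSON body with one Chinese character (U+4E2D).
        String body = "{\"q\":\"\u4e2d\"}";
        // In UTF-8 the character should show up as the three bytes e4 b8 ad.
        System.out.println(toHex(body.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the dump shows a single `3f` (a question mark) where a multi-byte sequence is expected, the characters were lost on the client before the request left the device.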

Mullins