
The string Pépé is 6 characters long in Chrome but 4 characters long in Safari. To check this, I open the console in both browsers and enter the following code:

"Pépé".length

This difference is giving me trouble on the server side.

I am using jQuery's $.ajax to send a POST request whose data contains the string Pépé. When that data reaches the server, the two values are treated differently: I can retrieve the data when the request comes from Chrome but not when it comes from Safari.

In the ajax request I am setting the parameter contentType: "application/json; charset=utf-8".

On the server side the string looks like Pépé when the POST request comes from Safari and Pépé when it comes from Chrome, yet the server treats them as different values.

Any clue why there is a difference between browsers?

1 Answer


You are probably running into different Unicode normalizations: an accented character such as é can be encoded either as a single precomposed code point (NFC) or as a base letter followed by a combining accent (NFD), and both are valid UTF-8.

There is a really nice discussion here in the answers:

What is normalized UTF-8 all about?

That question is tagged PHP, but Java has similar ways of manipulating Unicode. The browsers will send whichever form they send; you cannot really control that. On the server side you probably just need to normalize all your data to either NFD or NFC, as in the sketch below.
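For example, here is a minimal Java sketch (class and variable names are mine) showing how the same visible string reports different lengths depending on the form, which matches the 4-vs-6 difference you are seeing:

import java.text.Normalizer;

public class LengthDemo {
    public static void main(String[] args) {
        // Precomposed form (NFC): é is the single code point U+00E9
        String nfc = "P\u00e9p\u00e9";
        // Decomposed form (NFD): é becomes 'e' + U+0301 COMBINING ACUTE ACCENT
        String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);

        System.out.println(nfc.length());      // 4
        System.out.println(nfd.length());      // 6
        System.out.println(nfc.equals(nfd));   // false, even though both render as "Pépé"
    }
}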

I would just force everything to NFC on the server side. If you are in Java, something like this can do it:

http://docs.oracle.com/javase/6/docs/api/java/text/Normalizer.html
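A minimal sketch of that approach, assuming Java 6 or later (the Nfc class and toNfc method are just illustrative names, not part of any framework); run every incoming string through it before storing or comparing:

import java.text.Normalizer;

public final class Nfc {
    private Nfc() {
    }

    // Normalize an incoming value to NFC before it is stored or compared.
    public static String toNfc(String s) {
        if (s == null) {
            return null;
        }
        // Skip the extra work when the value is already composed.
        if (Normalizer.isNormalized(s, Normalizer.Form.NFC)) {
            return s;
        }
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }
}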

Edit: in all cases byte length and character length will depend on the normalization, as will strict comparisons -- regardless of programming language.
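To illustrate that point, a small sketch (again with made-up names): the UTF-8 byte counts differ between forms as well, and a strict equals only succeeds once both sides are normalized to the same form.

import java.nio.charset.Charset;
import java.text.Normalizer;

public class CompareDemo {
    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");

        String nfc = "P\u00e9p\u00e9";                                // precomposed
        String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);  // decomposed

        // UTF-8 byte counts differ: é is 2 bytes in NFC,
        // while 'e' + combining accent is 3 bytes in NFD.
        System.out.println(nfc.getBytes(utf8).length);   // 6
        System.out.println(nfd.getBytes(utf8).length);   // 8

        // A strict comparison only succeeds once both sides share a form.
        String normalized = Normalizer.normalize(nfd, Normalizer.Form.NFC);
        System.out.println(nfc.equals(normalized));      // true
    }
}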
