13

On my website there is a form with a simple textarea for people to post comments. The problem is that sometimes I receive the data in UTF-8 and sometimes in ISO. Is it possible to control that?

Maybe I am doing something wrong, but is it possible that the browser changes the encoding of the data it sends?

Peter Mortensen
Raul Leaño Martinet

5 Answers

20

If you want to be sure which character set you are accepting, set it on your form with the accept-charset attribute:

<form method="post" action="/your/url/" accept-charset="UTF-8">
</form>

You can see all the acceptable character sets here: Character Sets
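
Since accept-charset is only a request to the browser, it can help to confirm on the server that the submitted text really is UTF-8. A minimal sketch, assuming the textarea is named `comments` and that the mbstring extension is available; `is_valid_utf8` is a hypothetical helper name, not part of PHP:

```php
<?php
// Hypothetical helper: true when $value is a well-formed UTF-8 string.
function is_valid_utf8(string $value): bool
{
    return mb_check_encoding($value, 'UTF-8');
}

// Assumed field name 'comments'; adjust it to match the form.
$comment = $_POST['comments'] ?? '';
if (!is_valid_utf8($comment)) {
    // Reject (or convert) here rather than storing bytes of unknown encoding.
    $comment = '';
}
```

Discarding the comment is only one possible reaction; converting it from a known fallback encoding is another.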

Jeremy B.
  • 1
    I believe it is the opposite that does not work. If I remember correctly, IE will respect UTF-8, but if you set your charset to ISO it will fail. This only applies to newer versions. You can also set the charset of the page, as ayush mentioned, to be sure. – Jeremy B. Feb 04 '11 at 20:25
4

You can always force UTF-8. Then you can send, receive, and store data in UTF-8 and cover most human languages without having to change character sets.

<meta http-equiv="Content-type" content="text/html; charset=utf-8"/>
Peter Mortensen
ayush
2

But check whether the string is already UTF-8 before encoding it; otherwise you double-encode it.

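function str_to_utf8($string) {
    // Strict mode: returns false when $string is not valid UTF-8.
    if (mb_detect_encoding($string, 'UTF-8', true) === false) {
        // Assumes any non-UTF-8 input is ISO-8859-1.
        $string = utf8_encode($string);
    }

    return $string;
}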

Or use

$string = utf8_encode(utf8_decode($string));

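This way the string is not double-encoded. Note, however, that utf8_decode replaces any character that has no ISO-8859-1 equivalent with ?, so the round trip only preserves Latin-1 text.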
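
The same check-then-convert idea can be written without utf8_encode, which is deprecated as of PHP 8.2. A minimal sketch, assuming the mbstring extension is loaded and that any input which is not valid UTF-8 is ISO-8859-1 (an assumption the code cannot verify); `to_utf8` is a hypothetical name:

```php
<?php
// Hypothetical helper: returns $string as UTF-8.
// Assumption: anything that is not valid UTF-8 is ISO-8859-1.
function to_utf8(string $string): string
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string; // already UTF-8, leave untouched
    }
    return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
}

$fromUtf8  = to_utf8("caf\xC3\xA9"); // "café" sent as UTF-8
$fromLatin = to_utf8("caf\xE9");     // "café" sent as ISO-8859-1
```

Both calls should yield the same UTF-8 bytes, so the value can be stored without tracking which encoding the browser used.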

Peter Mortensen
0

I solved this problem by setting mbstring.http_input = pass in my php.ini file.

-3

You could encode the $_POST data into UTF-8 using PHP's utf8_encode function.

Something like:

$_POST['comments'] = utf8_encode( $_POST['comments'] );
voidstate
  • 1
    This would double-encode content that has already been sent as UTF-8. – cweiske Nov 22 '12 at 08:46
  • Urgh. True. If you want to use PHP to encode, then this (http://stackoverflow.com/questions/910793/detect-encoding-and-make-everything-utf-8) might help... although detecting encoding is very tricky. – voidstate Nov 23 '12 at 10:04
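
To make the double-encoding problem concrete, here is a small sketch of what happens to a single "é" that arrives as UTF-8 and is encoded again. It uses mb_convert_encoding from ISO-8859-1 as a stand-in for utf8_encode (which performs the same Latin-1 to UTF-8 conversion) and assumes the mbstring extension is loaded:

```php
<?php
// "é" as UTF-8 is the two bytes C3 A9.
$utf8 = "\xC3\xA9";

// Treat those two bytes as ISO-8859-1 characters and convert them to
// UTF-8 again, which is what encoding an already-UTF-8 string does.
$twice = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";  // c3a9
echo bin2hex($twice), "\n"; // c383c2a9
```

The doubled value is four bytes long and displays as "Ã©" instead of "é", which is the typical symptom of encoding unconditionally.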