1

Here is the data I entered in a text box in my form. Text box name: quiz_optionA

value  = ÉÉÉabcd.

I get the data in my PHP function like this:

$this->_data = JRequest::get('post');
$string = $this->_data['quiz_optionA'];

I used the method below to convert the French characters into English:

$normalizeChars = array(
'Š'=>'S', 'š'=>'s', 'Ð'=>'Dj','Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss','à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a',
'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'ƒ'=>'f'
);


echo strtr($string, $normalizeChars);die;

Output:

A�A�A�abcd

The plain English letters came through unchanged, but the French characters were not converted.

The output should be EEEabcd. Could you help me do this?

Mat
ram
  • Is your PHP file saved in the same encoding that your browser displays it in? Do you have any headers specifying the encoding? – Jon Apr 09 '12 at 07:42
  • you have to use multibyte string functions http://stackoverflow.com/questions/9986584/dealing-with-non-ascii-string-as-array-and-character – max Apr 09 '12 at 07:42
  • My editor was using the "cp1252" character encoding. It showed me: "Some characters cannot be mapped using "cp1252" character encoding. Either change the encoding or remove the characters which are not supported by the "cp1252" character encoding". When I saved the file as UTF-8, it worked well. Is there any other way to convert a character to UTF-8 in PHP through code? – ram Apr 09 '12 at 14:06

2 Answers

0

I answered a similar question today. Try adding this HTML to your page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Also make sure that the .php file containing $normalizeChars is saved with UTF-8 encoding.
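
One quick way to check the file encoding is to dump the bytes of a literal É typed into that same file. This is only a rough sketch: it assumes the check is placed in the file that defines $normalizeChars, and the helper name sourceLiteralIsUtf8 is made up for illustration.

```php
<?php
// Hypothetical helper: reports whether a literal "É" typed into THIS file
// is stored as the UTF-8 byte pair C3 89.
function sourceLiteralIsUtf8()
{
    // bin2hex() shows the raw bytes PHP read from the file for the literal.
    return bin2hex('É') === 'c389';
}

// A file saved as cp1252 would store the single byte C9 instead.
echo sourceLiteralIsUtf8()
    ? "literal is stored as UTF-8\n"
    : "literal is NOT stored as UTF-8\n";
```

If the second message appears, re-save the file as UTF-8 in your editor before relying on the table.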

Community
Valeriy Gorbatikov
0

Your line

echo strtr($string, $normalizeChars);

will only convert the characters listed in $normalizeChars. The ones that are not translated, namely É (note: your question does not say which encoding that character is in), have no matching entry in $normalizeChars.

If you want those characters translated as well, you need to add them to the $normalizeChars array. It looks like the É is in fact A� (if you add a hex dump, we can say more precisely what it is).

I'd assume the following:

The browser sends the input to your application in UTF-8 encoding. You process it in some single-byte (non-UTF-8) encoding, which is why it does not change.
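
That assumption can be reproduced with explicit byte escapes, so the result does not depend on how the snippet itself is saved. A minimal sketch, assuming the form data arrives as UTF-8 and the file holding the table was saved as cp1252; the variable names are illustrative only.

```php
<?php
// Assumption: the browser submitted "ÉÉÉabcd" as UTF-8, where É is the two bytes C3 89.
$input = "\xC3\x89\xC3\x89\xC3\x89abcd";

// Assumption: the file with the table was saved as cp1252, so each key is ONE byte.
$cp1252Table = array(
    "\xC3" => 'A', // the byte that cp1252 shows as 'Ã'
    "\xC9" => 'E', // the byte that cp1252 shows as 'É'
);

// strtr() works on bytes: the lead byte C3 is replaced by 'A', the byte 89 is left alone.
$result = strtr($input, $cp1252Table);

echo bin2hex($result), "\n"; // 41894189418961626364
```

The leftover byte 89 is not valid UTF-8 on its own, which would explain why the browser shows a replacement character after every A.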

Edit:

É; cp1252 #201; LATIN CAPITAL LETTER E WITH ACUTE; U+00C9

In UTF-8, that is encoded within a PHP string as "\xC3\x89". To encode nearly any character into UTF-8, you first need to find the character in your encoding and its Unicode code point. With your example:

Character: É
Codepoint: LATIN CAPITAL LETTER E WITH ACUTE (U+00C9)

The codepoint can be converted to UTF-8 with a small PHP function:

/**
 * @see Unicode 6.0.0 Ch2 General Structure, rfc3629
 * @param int|string $codepoint e.g. 0xC9 / "U+00C9"
 * @return string
 */
function unicodeCodePointToUTF8($codepoint)
{
    is_string($codepoint) && sscanf($codepoint, 'U+%x', $codepoint);
    if ($codepoint < 0) {
        throw new InvalidArgumentException('Lower than 0x00.');
    }
    if ($codepoint > 0x10FFFF) {
        throw new InvalidArgumentException('Larger than 0x10FFFF.');
    }
    if (0xD800 <= $codepoint && $codepoint <= 0xDFFF) {
        throw new InvalidArgumentException(sprintf('High and low surrogate halves are invalid unicode codepoints (U+D800 through U+DFFF, is U+%04X).', $codepoint));
    }
    if ($codepoint <= 0x7F) {
        return chr($codepoint);
    }
    if ($codepoint <= 0x7FF) {
        return chr(0xC0 | $codepoint >> 6 & 0x1F) . chr(0x80 | $codepoint & 0x3F);
    }
    if ($codepoint <= 0xFFFF) {
        return chr(0xE0 | $codepoint >> 12 & 0xF) . chr(0x80 | $codepoint >> 6 & 0x3F) . chr(0x80 | $codepoint & 0x3F);
    }
    return chr(0xF0 | $codepoint >> 18 & 0x7) . chr(0x80 | $codepoint >> 12 & 0x3F) . chr(0x80 | $codepoint >> 6 & 0x3F) . chr(0x80 | $codepoint & 0x3F);
}

Usage:

echo bin2hex(unicodeCodePointToUTF8(0x00C9)), "\n"; # c389

The hexadecimal output can be written in string form in PHP by prefixing it with \x in a double-quoted string:

$binary = "\xC3\x89";

That way of writing is immune to the encoding of the actual PHP file.
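
Applied to the translation table, the same idea might look like the sketch below. It lists only a handful of characters for illustration, and it assumes the input really is UTF-8; the names $utf8Table and $input are not from the question.

```php
<?php
// Keys are UTF-8 byte sequences written as escapes, so the encoding of this
// file does not matter. Only a few characters are listed; a full table would
// need an entry for every character to be translated.
$utf8Table = array(
    "\xC3\x89" => 'E', // U+00C9 LATIN CAPITAL LETTER E WITH ACUTE
    "\xC3\xA9" => 'e', // U+00E9 LATIN SMALL LETTER E WITH ACUTE
    "\xC3\xA8" => 'e', // U+00E8 LATIN SMALL LETTER E WITH GRAVE
    "\xC3\x87" => 'C', // U+00C7 LATIN CAPITAL LETTER C WITH CEDILLA
    "\xC3\xA7" => 'c', // U+00E7 LATIN SMALL LETTER C WITH CEDILLA
);

// Assumption: the input is UTF-8, as a browser would send it from a UTF-8 page.
$input = "\xC3\x89\xC3\x89\xC3\x89abcd"; // the bytes of "ÉÉÉabcd"

// strtr() prefers the longest matching key, so each two-byte sequence is replaced whole.
$result = strtr($input, $utf8Table);

echo $result, "\n"; // EEEabcd
```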

Community
hakre
  • My editor was using the "cp1252" character encoding. It showed me: "Some characters cannot be mapped using "cp1252" character encoding. Either change the encoding or remove the characters which are not supported by the "cp1252" character encoding". When I saved the file as UTF-8, it worked well. Is there any other way to convert a character to UTF-8 in PHP through code? – ram Apr 09 '12 at 14:06
  • @ram: I extended the answer. To check whether something is valid UTF-8, see this related question: [Fast way to strip all characters not displayable in browser from utf8 string](http://stackoverflow.com/a/7635283/367456) (you might not need this). – hakre Apr 10 '12 at 14:24
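
For a quick validity check without following the link, one common trick is to let PCRE do the work. This is a sketch rather than the approach from the linked answer, and the helper name looksLikeUtf8 is hypothetical.

```php
<?php
// Hypothetical helper: true when $bytes is a well-formed UTF-8 string.
// With the /u modifier, preg_match() returns false for subjects that are not valid UTF-8.
function looksLikeUtf8($bytes)
{
    return preg_match('//u', $bytes) === 1;
}

var_dump(looksLikeUtf8("\xC3\x89abcd")); // bool(true): C3 89 is a complete sequence
var_dump(looksLikeUtf8("A\x89abcd"));    // bool(false): 89 is a lone continuation byte
```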