
I have used rawurlencode on a UTF-8 word. For example:

$tit = 'தேனின் "வாசம்"';
$t = rawurlencode($tit);

When I click the link for the UTF-8 word ($t), I am sent to another page via .htaccess, where I read the word with $_GET['word'].

The word displays as தேனினà¯_"வாசமà¯" instead of the actual word. How can I get the actual UTF-8 word? I have already set charset=utf-8 in the header.

nithi

3 Answers


Did you use rawurldecode($_GET['word'])? And is your PHP file itself saved with UTF-8 encoding?
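
As a quick sanity check, the encode/decode round trip can be tested on its own, away from .htaccess and the browser. This is only a minimal sketch: the variable names `$encoded` and `$decoded` are illustrative, the file is assumed to be saved as UTF-8, and the mbstring extension is assumed to be loaded.

```php
<?php
// Sketch only: $encoded and $decoded are illustrative names,
// not taken from the asker's code. Save this file as UTF-8.
$tit = 'தேனின் "வாசம்"';

// Percent-encode every byte that is not URL-safe (quotes and space included).
$encoded = rawurlencode($tit);

// PHP normally URL-decodes $_GET values itself; rawurldecode()
// stands in for that step here.
$decoded = rawurldecode($encoded);

// The round trip should give back exactly the same bytes.
var_dump($decoded === $tit);

// Check that the result is well-formed UTF-8 (needs mbstring).
var_dump(mb_check_encoding($decoded, 'UTF-8'));
```

If both checks pass here but the page still shows garbage, the problem is more likely in how the value is received or displayed than in rawurlencode itself.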

shybovycha
  • Yes, I have used rawurldecode. If the word doesn't contain quotes, it is displayed properly. I have issues only if the word contains quotes. I have used in header – nithi Oct 12 '11 at 09:50
  • Is `magic_quotes` off? It would be weird if it were still on in 2011. But you should check and do stripslashes _first_. – ontrack Oct 12 '11 at 09:57
  • @nithi I was not asking about the header but about the encoding the script file was saved with (nice sentence, hehe =) ). I created a simple two-file test (the first file sets a rawurlencoded GET value and redirects to the second one), and it worked fine since I saved both files as UTF-8. – shybovycha Oct 12 '11 at 09:58
  • @ontrack Errr... what do the `magic_quotes` option or the `stripslashes` function have to do with this question? – shybovycha Oct 12 '11 at 09:59
  • @shybovycha Because nithi says in the comment above mine: "If the word doesn't contain quotes it is displayed properly. I have issues only if the word contains quotes." – ontrack Oct 12 '11 at 10:08
  • @ontrack Sorry, I mixed it up with `short_tag` =P – shybovycha Oct 12 '11 at 10:13
  • @shybovycha and ontrack: I have used stripslashes and it works. Thank you. – nithi Oct 12 '11 at 11:25

This was a comment first, but it should have been an answer:

Is magic_quotes off? It would be weird if it were still on in 2011. But you should check, and if it is on, apply stripslashes to the value.
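
A rough sketch of that check follows. `undo_magic_quotes` is a hypothetical helper, not a PHP built-in, and `addslashes()` is used here only to simulate what magic quotes did to incoming values. `get_magic_quotes_gpc()` was removed in PHP 8.0, so the call is guarded, and the `@` is an assumption made to silence the deprecation notice on PHP 7.4.

```php
<?php
// Hypothetical helper (not part of PHP): removes the backslashes that
// magic_quotes_gpc added in front of quotes in request data.
function undo_magic_quotes($value, $magicQuotesOn)
{
    return $magicQuotesOn ? stripslashes($value) : $value;
}

// get_magic_quotes_gpc() no longer exists in PHP 8, so guard the call.
$magicQuotesOn = function_exists('get_magic_quotes_gpc')
    && @get_magic_quotes_gpc();

$tit = 'தேனின் "வாசம்"';

// Simulate a value that arrived with magic quotes switched on.
$received = addslashes($tit);

// Strip the added backslashes before using the value.
$word = undo_magic_quotes($received, true);
var_dump($word === $tit);
```

In a real page the second argument would be `$magicQuotesOn`, and the value passed in would be `$_GET['word']`.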

ontrack
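<?php
// $s1: the garbled text exactly as it was displayed on the page.
$s1 = <<<EOD
தேனினà¯_"வாசமà¯"
EOD;
// $s2: the text the asker expected to see.
$s2 = <<<EOD
தேனின் "வாசம்"
EOD;

// Undo the mis-decoding: map each displayed character back to the
// single Windows-1252 byte it was read from.
$s1 = mb_convert_encoding($s1, "WINDOWS-1252", "UTF-8");

// Compare the raw bytes of both strings, then print the strings themselves.
echo bin2hex($s1), "\n";
echo bin2hex($s2), "\n";
echo $s1, "\n", $s2, "\n";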

Output:

e0aea4e0af87e0aea9e0aebfe0aea9e0af5f22e0aeb5e0aebee0ae9ae0aeaee0af22
e0aea4e0af87e0aea9e0aebfe0aea9e0af8d2022e0aeb5e0aebee0ae9ae0aeaee0af8d22
தேனின��_"வாசம��"
தேனின் "வாசம்"

You are probably displaying the data as ISO-8859-1 (or a similar single-byte encoding) rather than as UTF-8.
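
One way to rule this out is to declare UTF-8 both in the HTTP header and in the markup, and to escape the value for HTML with the charset named explicitly. The following is only a sketch: `$word` stands in for the decoded `$_GET['word']`, and the surrounding markup is made up for illustration.

```php
<?php
// Declare UTF-8 in the HTTP header; header() must run before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Stand-in for the decoded $_GET['word'] (illustrative value).
$word = 'தேனின் "வாசம்"';

// Escape quotes and markup characters, naming the charset explicitly.
$safe = htmlspecialchars($word, ENT_QUOTES, 'UTF-8');

// Declare UTF-8 in the markup as well, then print the escaped word.
echo '<meta charset="utf-8">', "\n";
echo '<p>', $safe, '</p>', "\n";
```

If the browser still shows the garbled form, its encoding indicator (or the response headers in the developer tools) will show which charset it actually applied.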

Artefacto