I am fetching an Arabic translation from Google Translate. This is my code:

header('Content-Type: text/html; charset=UTF-8');
$page=file_get_contents("http://www.google.com/translate_t?langpair=en|ar&text=hello",FILE_TEXT);
$page=substr($page,strpos($page,"TRANSLATED_TEXT")+strlen("TRANSLATED_TEXT")+2);
$page=substr($page,0,strpos($page,"';INPUT_TOOL_PATH"));
echo mb_detect_encoding($page); // edited 2015/05/26
echo mb_convert_encoding($page, 'UTF-8', 'ISO-8859-6');

If you open the URL passed to file_get_contents in a browser, you will see this word: مرحبا

But if you run the code, you get: كرحبا

As you can see, the last (or first) character is different!

What am I doing wrong?

stramin
  • You probably do not need to change the charset of the Arabic characters. You need to set the proper charset/encoding on the page that displays them. Using `UTF-8` will work. This should be helpful: http://stackoverflow.com/questions/279170/utf-8-all-the-way-through – Ejaz May 17 '15 at 21:35
  • The page is Google's, so I can't modify it. I also think file_get_contents uses its own charset. – stramin May 17 '15 at 22:01
  • BTW, why are you not using the Google Translate REST API? https://cloud.google.com/translate/v2/using_rest – Ejaz May 17 '15 at 22:17
  • It's a paid service, and I only want to translate "hello" :) – stramin May 18 '15 at 01:39
  • What encoding do you get if you check with echo mb_detect_encoding($str);? – Armand May 23 '15 at 19:27
  • It says UTF-8. I'm going to edit the main post to add the mb_detect_encoding call. – stramin May 26 '15 at 03:43

1 Answer

Replace the last line with:

echo iconv('WINDOWS-1256', 'UTF-8', $page);

I think the problem is that you're decoding with the wrong source encoding: if you check the charset in the content-type meta tag returned by the page, you'll see that it is windows-1256, not ISO-8859-6.
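
To illustrate why only one letter changes, here is a minimal sketch that works on hard-coded sample bytes instead of Google's live response. The `detectCharset` helper, its regular expression, and the sample HTML are assumptions made for this illustration, not part of the original code. Windows-1256 and ISO-8859-6 happen to assign the same bytes to ر, ح, ب and ا, but byte 0xE3 is م in Windows-1256 and ك in ISO-8859-6.

```php
<?php
// Hypothetical helper: pull the charset out of an HTML <meta> tag,
// falling back to a caller-supplied default when none is declared.
function detectCharset(string $html, string $default = 'UTF-8'): string
{
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)/i', $html, $m)) {
        return strtoupper($m[1]);
    }
    return $default;
}

// Sample data (an assumption, not Google's actual response): the word from
// the question as Windows-1256 bytes, preceded by a charset declaration.
$word = "\xE3\xD1\xCD\xC8\xC7";
$html = '<meta http-equiv="content-type" content="text/html; charset=windows-1256">' . $word;

$charset = detectCharset($html);                 // charset declared by the page
$wrong   = iconv('ISO-8859-6', 'UTF-8', $word);  // byte 0xE3 is decoded as kaf
$right   = iconv($charset, 'UTF-8', $word);      // byte 0xE3 is decoded as meem

echo $charset, PHP_EOL, $wrong, PHP_EOL, $right, PHP_EOL;
```

Reading the charset from the response rather than hard-coding it means the conversion keeps working if the page ever declares a different encoding, provided the declared name is one that iconv recognises on your system.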

wesamly