2

I want to fetch the HTML from a URL that contains Arabic characters, such as

http://www.example.com/2013/07/31/الاختبار.html

using PHP. I tried

file_get_html("http://www.example.com/2013/07/31/الاختبار.html")

but it gives the following error:

Warning: file_get_contents(http://www.example.com/2013/07/31/الاختبار.html) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.0 404 Not Found in filename.php

Please help.

http://www.example.com/2013/07/31/الاختبار.html

is for reference only; it doesn't exist.

laalto
Mobi

1 Answer

5

URLs can't contain non-ASCII characters.

Where they appear to, the browser is in fact silently converting those characters into percent-encoded (URL-escaped) ones in the background.

When you paste this URL into your browser:

http://www.example.com/2013/07/31/الاختبار.html

the address that is actually requested looks like this:

http://www.example.com/2013/07/31/%D8%A7%D9%84%D8%A7%D8%AE%D8%AA%D8%A8%D8%A7%D8%B1.html

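PHP doesn't silently convert characters like this; you'll have to do it manually. To do that, run PHP's urlencode() over the non-ASCII part of the URL (here, the Arabic file name) before making the call. Don't run it over the whole URL, or the `://` and the slashes will be escaped as well and the request will fail.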
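As a rough sketch of that manual step, the snippet below percent-encodes only the bytes outside printable ASCII and leaves the scheme, host, and slashes alone. The helper name `encode_non_ascii()` is made up for this example, and it assumes the URL string is UTF-8 (for instance, a source file saved as UTF-8). It uses `rawurlencode()`, a close relative of `urlencode()` that encodes a space as `%20` rather than `+`.

```php
<?php
// Hypothetical helper (not part of PHP): percent-encode every byte that is
// not printable ASCII, leaving "://", "/", "." and existing %XX escapes as-is.
// Assumes $url is a UTF-8 string.
function encode_non_ascii(string $url): string
{
    // No /u modifier, so the pattern matches one byte at a time; each byte
    // of a multi-byte UTF-8 character becomes its own %XX escape.
    $encoded = preg_replace_callback(
        '/[^\x21-\x7E]/',
        function (array $m): string {
            return rawurlencode($m[0]);
        },
        $url
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to
    // the input in that case.
    return $encoded === null ? $url : $encoded;
}

$url = "http://www.example.com/2013/07/31/الاختبار.html";
echo encode_non_ascii($url), "\n";
// prints http://www.example.com/2013/07/31/%D8%A7%D9%84%D8%A7%D8%AE%D8%AA%D8%A8%D8%A7%D8%B1.html
```

The encoded string can then be passed to `file_get_contents()` or `file_get_html()` in place of the original URL.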

Community
Pekka
  • Thanks, I tried to open the URL using urlencode() but it's still giving an error: `[function.file-get-contents]: failed to open stream: No error in filename.php` – Mobi Jul 31 '13 at 07:15
  • Can you edit the full code of your request into your question, including the `urlencode()` call? – Pekka Jul 31 '13 at 07:22
  • `$url = "http://www.example.com/2013/07/31/الاختبار.html"; $url = urlencode($url); $html1 = file_get_contents($url);` – Mobi Jul 31 '13 at 07:25
  • @Mobi you need to limit the `urlencode` to the Arabic script, otherwise the `://` will get encoded as well – Pekka Jul 31 '13 at 12:05