1

I have a simple question. A user supplies a URL to my PHP script, which fetches the page at that URL, parses it, and shows a snippet to the user. I want to sanitize, or better, escape the URL so that it is safe to fetch with file_get_contents().

My simplified code looks like this:

$url = $_POST['url'];
$html = file_get_contents($url);

The first thing that came to mind was to use a regex to catch malicious URLs, but I don't think that is efficient, and escaping the whole URL seems better. Which PHP function can I use to escape a URL for use in file_get_contents()?

Frodik

2 Answers

2

You could simply require the URL to start with http:// or https://.

Luckily, PHP is smart enough not to follow redirects to a file:// URL.
However, it does follow redirects to ftp:// URLs, so you had better make sure your server cannot access any internal FTP servers without authentication.
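
A minimal sketch of that scheme check, assuming PHP 7.1 or later. The function names is_allowed_url() and fetch_page() are hypothetical, and disabling redirects through the follow_location context option is my addition rather than something the answer requires. This only restricts the scheme; it does not stop requests to internal hosts.

```php
<?php
// Hypothetical helper: accept only absolute http:// or https:// URLs with a host.
function is_allowed_url(string $url): bool
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }
    // Schemes are case-insensitive, so normalize before comparing.
    return in_array(strtolower($parts['scheme']), ['http', 'https'], true);
}

// Hypothetical wrapper: returns the page body, or null if the URL is
// rejected or the request fails.
function fetch_page(string $url): ?string
{
    if (!is_allowed_url($url)) {
        return null;
    }
    // Assumption: refuse redirects entirely so a remote server cannot
    // bounce the request to another scheme such as ftp://.
    $context = stream_context_create([
        'http' => ['follow_location' => 0, 'timeout' => 5],
    ]);
    $html = file_get_contents($url, false, $context);
    return $html === false ? null : $html;
}
```

In the question's code, `$html = fetch_page($_POST['url'] ?? '');` would replace the direct file_get_contents() call, with a null result treated as a rejected or failed request.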

ThiefMaster
0

And if you want to use a regex, take a look here:

Stackoverflow: What is the best regular expression to check if a string is a valid URL?

Community
Gert Van de Ven