I'm trying to get the title of a website that is entered by the user.
Text input: website link, entered by user is sent to the server via AJAX.
The user can input anything: an actual existing link, or just single word, or something weird like 'po392#*@8'
Here is a part of my PHP script:
// Make sure the url is on another host
if(substr($url, 0, 7) !== "http://" AND substr($url, 0, 8) !== "https://") {
$url = "http://".$url;
}
// Extra confirmation for security
if (filter_var($url, FILTER_VALIDATE_URL, FILTER_FLAG_HOST_REQUIRED)) {
$urlIsValid = "1";
} else {
$urlIsValid = "0";
}
// Make sure there is a dot in the url
if (strpos($url, '.') !== false) {
$urlIsValid = "1";
} else {
$urlIsValid = "0";
}
// Retrieve title if no title is entered
if($title == "" AND $urlIsValid == "1") {
function get_http_response_code($theURL) {
$headers = get_headers($theURL);
if($headers) {
return substr($headers[0], 9, 3);
} else {
return 'error';
}
}
if(get_http_response_code($url) != "200") {
$urlIsValid = "0";
} else {
$file = file_get_contents($url);
$res = preg_match("/<title>(.*)<\/title>/siU", $file, $title_matches);
if($res === 1) {
$title = preg_replace('/\s+/', ' ', $title_matches[1]);
$title = trim($title);
$title = addslashes($title);
}
// If title is still empty, make title the url
if($title == "") {
$title = $url;
}
}
}
However, there are still errors occuring in this script.
It works perfectly if an existing url as 'https://www.youtube.com/watch?v=eB1HfI-nIRg' is entered and when a non-existing page is entered as 'https://www.youtube.com/watch?v=NON-EXISTING', but it doesn't work when the users enters something like 'twitter.com' (without http) or something like 'yikes'.
I tried literally everthing: cUrl, DomDocument...
The problem is that when an invalid link is entered, the ajax call never completes (it keeps loading), while it should $urlIsValid = "0" whenever an error occurs.
I hope someone can help you - it's appreciated.
Nathan