
I'm trying to build an IMDb rating checker for my website. So far I can use part of the URL to check the rating and print it on my page using cURL, PHP, and Ajax/jQuery. However, it only works when it is given the last part of the IMDb link, which it passes to omdbapi.com to retrieve the rating.

If the user enters http://www.imdb.com/title/tt2268016/ it does not work, but if the user enters only tt2268016 the function works like a charm. My problem is that I don't really know how to remove http://www.imdb.com/title/ so that only tt2268016 is left and can be used.

The code that handles the URL is this:

 
    // Look up a title on omdbapi.com and return the decoded JSON as an array.
    function get_movie_ratings($name)
    {
        $url = "https://omdbapi.com/?i=".urlencode($name);
        // send request
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        $curlData = curl_exec($curl);
        curl_close($curl);

        return json_decode($curlData, true);
    }

    if (empty($_POST['imdbnum'])) {
        $imdb = 'none';
        $imdblink = 'none';
    } else {
        $imdb = unesc($_POST['imdbnum']);
        $imdb2 = get_movie_ratings($imdb);
        $imdbrate = $imdb2["imdbRating"];
        $imdblink = unesc($_POST['imdbnum']);
    }

How do I solve this in a clean way?

Best Regards!

conny
  • Scraping in this manner is a violation of the IMDB terms. http://stackoverflow.com/a/1966526/1902010 – ceejayoz Dec 24 '15 at 22:53

1 Answer


The simplest way is to use str_replace to remove the extra characters:

if (empty($_POST['imdbnum'])){
    $imdb = 'none';
    $imdblink = 'none';
} else {
    $imdb = unesc($_POST['imdbnum']);
    $imdb = str_replace('http://www.imdb.com/title/', '', $imdb);
    $imdb = str_replace('/', '', $imdb);
}
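If the input may also use https or omit the scheme or the www prefix, one possible extension is to pass str_replace an array of search strings, which are applied in order. This is only a sketch: the helper name strip_imdb_prefix is hypothetical, and anything after the ID (such as a query string) would be left in place.

```php
<?php
// Hypothetical helper (not from the original post): strips common IMDb URL
// prefixes with a single str_replace call.
function strip_imdb_prefix(string $input): string
{
    // Search strings are applied in array order, each to the result of the
    // previous replacement, so the scheme and "www." go before the path.
    $prefixes = ['https://', 'http://', 'www.', 'imdb.com/title/', '/'];

    return str_replace($prefixes, '', trim($input));
}

echo strip_imdb_prefix('https://imdb.com/title/tt2268016/'), "\n"; // prints tt2268016
```

Because every listed fragment is removed wherever it appears, this approach is only suitable when the input is known to be a bare ID or a plain title URL.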

To account for a wider variety of URL variations, you can use preg_match:

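if (empty($_POST['imdbnum'])){
    $imdb = 'none';
    $imdblink = 'none';
} else {
    $imdb = unesc($_POST['imdbnum']);
    // Pick out the first "tt" followed by digits, wherever it appears in the input
    if (preg_match('/tt\d+/', $imdb, $search)) {
        $imdb = $search[0];
    } else {
        // No ID found: fall back to the same defaults as for empty input
        $imdb = 'none';
        $imdblink = 'none';
    }
    echo $imdb; // debug output only; remove it once the extraction works
}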
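The same match can be wrapped in a small function so it is easy to try against different inputs. This is a minimal sketch: the name extract_imdb_id and the null return for "no ID found" are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical helper (not from the original post): returns the first
// "tt" + digits sequence found in the input, or null if there is none.
function extract_imdb_id(string $input): ?string
{
    if (preg_match('/tt\d+/', $input, $match)) {
        return $match[0];
    }

    return null;
}

var_dump(extract_imdb_id('https://imdb.com/title/tt2268016/?ref=x')); // string(9) "tt2268016"
var_dump(extract_imdb_id('not an imdb link'));                        // NULL
```

Returning null instead of the string 'none' lets the caller tell a missing ID apart from real input, but either convention works as long as it is checked before calling the API.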
John C
  • Is there any way to remove https://imdb.com/title/ , www.imdb.com/title/ , imdb.com/title.com too? I tried to extend the str_replace with $imdb = unesc($_POST['imdbnum']); $imdb = str_replace('http://www.imdb.com/title/', '', $imdb); $imdb = str_replace('www.imdb.com/title/', '', $imdb); $imdb = str_replace('imdb.com/title/', '', $imdb); but that did not do the trick. Any suggestions? – conny Dec 25 '15 at 17:56
  • @conny I wondered if that would be an issue - I've added a more advanced search using regex - note that it assumes you are always searching for an ID starting with "tt" followed by some numbers – John C Dec 26 '15 at 01:17
  • Thank you. Now my users can enter any URL they want, as long as it contains the tt ID. I had to remove the echo $imdb; line because it printed the ID in the handling file, but it works like a charm now! – conny Dec 26 '15 at 18:13