
If you open https://www.google.com/movies?near=02215&q=revenant in a browser, you get results that can be parsed. However, when I upload a small script like this to my server:

<?php 

$string = file_get_contents("https://www.google.com/movies?near=02215&q=revenant");

echo $string;

?>

The output is something along the lines of "No results found".

Any ideas?

labago
  • When I tested that URL in Chrome, I got "No hits". – M. Eriksson Jan 20 '16 at 16:44
  • `Your query - revenant - did not match any movie reviews, showtimes or theaters.` - looks like it's the URL you're hitting – iamgory Jan 20 '16 at 16:44
  • I get this http://i.imgur.com/f6291fa.png – labago Jan 20 '16 at 16:45
  • The problem still stands though, no matter WHAT the query, it never returns results when done from the server – labago Jan 20 '16 at 16:45
  • @MagnusEriksson [I got hits](http://i.stack.imgur.com/HJ3xY.png) – MonkeyZeus Jan 20 '16 at 16:46
  • @jlane09 It's very possible that Google is blocking scraping attempts. – MonkeyZeus Jan 20 '16 at 16:48
  • I need to change the location to my zip code to get results. Google then also adds `&stok=xxxxx` to the URL. When I remove the `&stok=xxxxx` from the URL, I get no results anymore. – M. Eriksson Jan 20 '16 at 16:48
  • Seems to be some type of CSRF token or something. – M. Eriksson Jan 20 '16 at 16:49
  • @jlane09 Looks like you are not the first: http://stackoverflow.com/questions/9379555/google-api-movie-showtimes-documentation – MonkeyZeus Jan 20 '16 at 16:49
  • I'm in Costa Rica ~ I get no results for the original query, but I do get results for https://www.google.com/movies?near=14139&q=daddy which is upstate NY. – Tarek Adam Jan 20 '16 at 16:51
  • @MonkeyZeus How would they block that? How would they know? Also, that question was similar but didn't seem to be having the issues I am. It was asking for documentation. On that note, does anyone know what happens if your server gets "flagged" by Google? – labago Jan 20 '16 at 16:53

1 Answer

Google is picky about who it sends HTTP responses to. Tell it you're a browser and see if this works:

$context = stream_context_create(array(
    'http' => array(
        'method' => "GET",
        'header' => "" .
            "Accept: text/html" . "\r\n" .
            "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0" . "\r\n"
    )
));

$string = file_get_contents( "https://www.google.com/movies?near=02215&q=revenant", false, $context );
Rainner
  • @jlane09 Damn, I was getting blank results at first, but it worked for me after I added the headers. Try using cURL. – Rainner Jan 20 '16 at 17:14
  • Maybe my server has already been put on some list, but I will try with cURL too. – labago Jan 20 '16 at 17:17
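
For anyone wanting to try the cURL route suggested in the comments, here is a minimal sketch that sends the same two headers as the answer. The names `BROWSER_UA`, `browser_headers()` and `fetch_as_browser()` are made up for illustration, the timeout and redirect settings are assumptions, and nothing here guarantees that Google will actually return results to a server-side request.

```php
<?php
// Same User-Agent string the answer uses; any current browser string could be substituted.
const BROWSER_UA = 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0';

// Hypothetical helper: extra request headers, one header line per array element.
function browser_headers(): array
{
    return ['Accept: text/html'];
}

// Hypothetical helper: returns the response body, or null on a
// transport error or any HTTP status other than 200.
function fetch_as_browser(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects (assumed to be wanted)
        CURLOPT_TIMEOUT        => 10,    // seconds; arbitrary choice
        CURLOPT_USERAGENT      => BROWSER_UA,
        CURLOPT_HTTPHEADER     => browser_headers(),
    ]);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status !== 200) {
        return null;
    }
    return $body;
}
```

Calling `fetch_as_browser("https://www.google.com/movies?near=02215&q=revenant")` and checking for `null` before echoing makes it easier to tell a blocked or failed request apart from a page that merely says there are no results.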