
I thought this would be fairly simple, but it's proving challenging. Google uses https:// now, and Bing redirects to strip out the http://.

How can I grab the top 5 URLs for a given search term?

I've tried several methods (including loading results into an iframe), but keep hitting brick walls with everything I try.

I wouldn't even need a proxy, as I'm talking about a very small number of results to harvest, and I'll only use it for 20-30 terms once every few months. Hardly enough to trigger a backlash from the search giants.

Any help would be much appreciated!

Here's one example of what I've tried:

```php
$query = urlencode("test");

// Note: $query is already encoded above, so don't urlencode() it a second
// time in the URL (the original snippet double-encoded it).
preg_match_all(
    '/<a title=".*?" href="(.*?)"/',
    file_get_contents("http://www.bing.com/search?q=" . $query),
    $matches
);

echo implode("<br>", $matches[1]);
```
  • [Wouldn't you prefer an HTML Parser instead?](http://stackoverflow.com/a/1732454/102937) – Robert Harvey Nov 15 '13 at 22:14
  • For such a small amount of data, wouldn't a paper and pencil suit you? –  Nov 15 '13 at 22:15
  • I have http://sourceforge.net/projects/simplehtmldom/ but can't seem to use it properly. All I really need is the `` tags from Bing's SERP. – Casey Dwayne Nov 15 '13 at 22:16
  • @MikeW The point is to make it automated, so I don't have to manually retrieve the top 5 or so URLs for each of the 20-30 terms. Work hard now, work easy later. – Casey Dwayne Nov 15 '13 at 22:17
  • 1
    Take a look here http://stackoverflow.com/questions/22657548/is-it-ok-to-scrape-data-from-google-results – John Jan 13 '17 at 00:20

2 Answers


There are three main ways to do this. The first is to use the official API for the search engine you're targeting: Google has one, and most of the others do too. These APIs are usually volume-limited, but for the numbers you're talking about, you'll be well within the limits.
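As a rough sketch of the API route, here's what a call to the Google Custom Search JSON API could look like in PHP. The `YOUR_API_KEY` and `YOUR_SEARCH_ENGINE_ID` values are placeholders you'd get from Google's consoles, and the response here is a stub so the parsing is self-contained (a real call would fetch `$url` instead):

```php
<?php
// Placeholders -- substitute your own credentials from Google.
$apiKey = "YOUR_API_KEY";
$cx     = "YOUR_SEARCH_ENGINE_ID";
$term   = "test";

// Build the Custom Search request URL; num=5 asks for the top 5 results.
$url = "https://www.googleapis.com/customsearch/v1?" . http_build_query([
    "key" => $apiKey,
    "cx"  => $cx,
    "q"   => $term,
    "num" => 5,
]);

// In real use: $json = file_get_contents($url);
// Stubbed response so this sketch runs without credentials:
$json = '{"items":[{"link":"http://example.com/one"},{"link":"http://example.com/two"}]}';

$data  = json_decode($json, true);
$items = isset($data["items"]) ? $data["items"] : [];
$links = array_map(function ($item) { return $item["link"]; }, $items);

echo implode("<br>", $links);
```

The response puts each result's URL in the `link` field of the `items` array, so extracting the top 5 is just a map over that array.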

The second way is to use a scraper program to visit the search page, enter a search term, and submit the associated form. Since you've specified PHP, I'd recommend Goutte. Internally it uses Guzzle and Symfony Components, so it must be good! The README at the above link shows you how easy it is. Selection of HTML fragments is done using either XPath or CSS, so it is flexible too.
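To illustrate the kind of XPath selection Goutte performs, here's a dependency-free sketch using PHP's built-in `DOMDocument` and `DOMXPath`. The HTML is a stub standing in for a downloaded results page, and the `b_algo` class name is just illustrative; the real markup will differ and tends to change over time:

```php
<?php
// Stub HTML standing in for a fetched results page (real markup differs).
$html = '<html><body>'
      . '<li class="b_algo"><a href="http://example.com/a">A</a></li>'
      . '<li class="b_algo"><a href="http://example.com/b">B</a></li>'
      . '</body></html>';

$doc = new DOMDocument();
@$doc->loadHTML($html);   // @ suppresses warnings on messy real-world HTML
$xpath = new DOMXPath($doc);

// Select the href attribute of each result link via XPath.
$links = [];
foreach ($xpath->query('//li[@class="b_algo"]//a/@href') as $href) {
    $links[] = $href->nodeValue;
}

echo implode("<br>", $links);
```

An XPath (or CSS) selector like this is far more robust than a regex against raw HTML, since it tolerates attribute reordering and whitespace changes.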

Lastly, given the low volume of required scrapes, consider downloading a free software package from Import.io. This lets you build a scraper using a point-and-click interface, and it learns how to scrape various areas of the page before storing the data in a local or cloud database.


You can also use a third-party service like SerpApi to get Google results.

It should be pretty easy to integrate:

```php
$query = [
    "q" => "Coffee",
    "google_domain" => "google.com",
];

$serp = new GoogleSearchResults();
// PHP method calls use ->, not the Ruby-style "." in the original snippet.
$json_results = $serp->json($query);
```

GitHub project.
