I noticed that when I run wget https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=foo and similar queries, I get the Google homepage instead of the search results.

There seems to be some redirect within the Google page. Does anyone know how to make wget work here?

2 Answers

You can use these curl commands to pull Google query results:

curl -sA "Chrome" -L 'http://www.google.com/search?hl=en&q=time' -o search.html

For an https URL:

curl -k -sA "Chrome" -L 'https://www.google.com/search?hl=en&q=time' -o ssearch.html

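The -A option sets the user-agent string sent to Google to "Chrome". Of the other flags, -s suppresses progress output, -L follows redirects, and -o names the output file. -k skips TLS certificate verification and can usually be dropped.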

anubhava

#q=foo is your hint, as that's a fragment ID, which never gets sent to the server. I'm guessing you just took this URL from your browser URL-bar when using the live-search function. Since it is implemented with a lot of client-side magic, you cannot rely on it to work; try using Google with live search disabled instead. A URL pattern that seems to work looks like this: http://www.google.com/search?hl=en&q=foo.
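To make the fragment point concrete, here is a minimal sketch that splits the URL from the question at the first #. The variable names are only illustrative; the part before the # is what an HTTP client such as wget actually requests, and the part after it stays on the client.

```shell
#!/bin/sh
# URL copied from the question. Single quotes keep the shell from
# treating & as a command separator.
url='https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=foo'

# Strip everything from the first '#' onward: this is what gets requested.
sent="${url%%\#*}"
# Strip everything up to and including the first '#': this is the fragment.
fragment="${url#*\#}"

printf 'requested: %s\n' "$sent"
printf 'fragment:  %s\n' "$fragment"
```

The query q=foo ends up only in the fragment, so the server never sees it and answers with the plain homepage.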

However, I do notice that Google returns 403 Forbidden when called naïvely with wget, which suggests that they don't want automated clients fetching search results. You can easily get past it by setting some other user-agent string, but do consider all the implications before doing so on a regular basis.
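As a rough sketch of the user-agent approach with wget itself, something like the following should work, assuming Google still accepts the URL pattern above. The helper names search_url and fetch_results are hypothetical, and the string "Chrome" is just the value used in the other answer; -U (--user-agent) and -O are standard wget options.

```shell
#!/bin/sh
# Hypothetical helper: build the plain search URL for one query term.
# The term is inserted as-is, so it must already be URL-safe (no spaces or &).
search_url() {
    printf 'http://www.google.com/search?hl=en&q=%s' "$1"
}

# Hypothetical helper: fetch results with a non-default user-agent.
# -U sets the User-Agent header, -O names the output file.
# The URL is quoted so the shell does not split the command at &.
fetch_results() {
    wget -U 'Chrome' -O "$2" "$(search_url "$1")"
}
```

Calling fetch_results foo search.html would then save the response for the query foo to search.html. Whether Google answers with results or an error can change over time.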

Dolda2000