3

I wish to return only the number of google search results for a particular keyword in the fastest manner possible, avoiding (keeping to minimum) the use of third party libraries. I have already considered xgoogle.

subiet
  • 1,399
  • 1
  • 12
  • 18

2 Answers2

2

Take a look at Alex Martelli's example.

If you search for something vague like "cars", data will look something like the following. Notice that it isn't very long; you only get the top few hits, and a link to "moreResultsUrl". Therefore, it should be reasonably fast to make this query and look in data['cursor']['estimatedResultCount'] for the estimated number of hits.

{'cursor': {'currentPageIndex': 0,
            'estimatedResultCount': '168000000',
            'moreResultsUrl': 'http://www.google.com/search?oe=utf8&ie=utf8&source=uds&start=0&hl=en&q=cars',
            'pages': [{'label': 1, 'start': '0'},
                      {'label': 2, 'start': '4'},
                      {'label': 3, 'start': '8'},
                      {'label': 4, 'start': '12'},
                      {'label': 5, 'start': '16'},
                      {'label': 6, 'start': '20'},
                      {'label': 7, 'start': '24'},
                      {'label': 8, 'start': '28'}]},
 'results': [ <<list of 4 dicts>> ]}
Community
  • 1
  • 1
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
-1

You can use urllib for downloading the site and HTMLParser to parse out the <div id="resultStats">....</div> values. Here is an example:

How can I use the python HTMLParser library to extract data from a specific div tag?

Community
  • 1
  • 1
zoli2k
  • 3,388
  • 4
  • 26
  • 36