
I have been trying to use xgoogle to search for PDFs on the internet. The problem is that if I search for "Medicine:pdf", the first result it returns is not the first result Google returns when I actually use Google in a browser. I don't know what's wrong. Here is my code:

     from xgoogle.search import GoogleSearch, SearchError

     results = []  # defined before the try so the loop below always has a list
     try:
         page = 0
         gs = GoogleSearch(searchfor)
         gs.results_per_page = 100
         while page < 2:
             gs.page = page
             results += gs.get_results()
             page += 1
     except SearchError, e:
         print "Search failed: %s" % e
     for res in results:
         print res.desc

If I run the query on the Google website, the first result Google displays is:
Title: Medicine - British Council
Desc: United Kingdom medical training has a long history of excellence and of ... Leaders in medicine throughout the world have received their medical education.
Url: http://www.britishcouncil.org/learning-infosheets-medicine.pdf
But if I use my Python xgoogle search, the output is:
Desc: UCM175757.pdf
Title: Medicines in My Home: presentation for students - Food and Drug ...
Url: http://www.fda.gov/downloads/Drugs/ResourcesForYou/Consumers/BuyingUsingMedicineSafely/UnderstandingOver-the-CounterMedicines/UCM175757.pdf

galaxyan
Slipstream

1 Answer


I have noticed the same difference between using xgoogle and using Google in a browser. I have no idea why it happens, but you could try the Google Custom Search API. It may give you results closer to the browser's, and there is no risk of being banned by Google (if you use xgoogle too many times in a short period, you get an error back instead of search results).

First, you have to register and enable your custom search engine in Google to get a key and a cx: https://www.google.com/cse/all

The API format is:

'https://www.googleapis.com/customsearch/v1?key=yourkey&cx=yourcx&alt=json&q=yourquery'

  • customsearch is the Google service you want to use; in your case it is customsearch
  • v1 is the version of the API
  • yourkey and yourcx are provided by Google; you can find them on your dashboard
  • yourquery is the term you want to search for; in your case it is "Medicine:pdf"
  • json is the return format
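The pieces listed above can be assembled programmatically. Here is a minimal sketch in Python 3 (the rest of this answer uses Python 2). The helper name `build_search_url` and its `page`/`per_page` arguments are hypothetical, introduced only for illustration; `num` and `start` are the paging parameters used in the example further down.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_search_url(key, cx, query, page=0, per_page=10):
    """Hypothetical helper: return the request URL for one 0-based result page."""
    params = [
        ("key", key),                    # your API key
        ("cx", cx),                      # your custom search engine id
        ("alt", "json"),                 # return format
        ("q", query),                    # urlencode escapes the query for us
        ("num", per_page),               # results per request
        ("start", page * per_page + 1),  # 'start' is 1-based: 1, 11, 21, ...
    ]
    return API_ENDPOINT + "?" + urlencode(params)

# Placeholder credentials; substitute the key and cx from your dashboard.
print(build_search_url("yourkey", "yourcx", "Medicine:pdf"))
```

Letting `urlencode` do the escaping avoids having to quote the query by hand before substituting it into the URL.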

Example that fetches the first 3 pages of Google Custom Search results:

    import urllib
    import urllib2
    import simplejson

    def googleAPICall():
        userInput = urllib.quote("global warming")
        KEY = "##################"  # get yours
        CX = "###################"  # get yours

        pages = []
        for i in range(0, 3):
            index = i * 10 + 1  # 'start' is 1-based: 1, 11, 21
            url = ('https://www.googleapis.com/customsearch/v1?'
                   'key=%s'
                   '&cx=%s'
                   '&alt=json'
                   '&q=%s'
                   '&num=10'
                   '&start=%d') % (KEY, CX, userInput, index)

            request = urllib2.Request(url)
            response = urllib2.urlopen(request)
            results = simplejson.load(response)
            pages.append(results)  # keep the parsed JSON of each page
        return pages
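
Each parsed response carries its hits in an `items` list, where every entry has `title`, `link` and `snippet` fields, which correspond to the Title, Url and Desc the question prints. A rough Python 3 sketch of pulling those out follows; the helper name `extract_hits` is hypothetical and the sample response is invented purely so the snippet runs without a key.

```python
import json

def extract_hits(response_text):
    """Hypothetical helper: return (title, link, snippet) tuples from one JSON response."""
    data = json.loads(response_text)
    # 'items' is absent when a page has no results, so default to an empty list.
    return [(item.get("title"), item.get("link"), item.get("snippet"))
            for item in data.get("items", [])]

# Invented sample standing in for a real API response.
sample = json.dumps({
    "items": [
        {"title": "Example title",
         "link": "http://example.com/a.pdf",
         "snippet": "Example snippet"},
    ]
})

for title, link, snippet in extract_hits(sample):
    print("Title: %s" % title)
    print("Url: %s" % link)
    print("Desc: %s" % snippet)
```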
galaxyan