Print google search results using selenium in Python

Question

I have this code to print some search results to the console:

import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

browser = webdriver.Chrome('/Users/Downloads/chromedriver')
browser.get('http://www.google.com')
search = browser.find_element_by_name('q')
search.send_keys("youtube")
search.send_keys(Keys.RETURN)
print(browser)
time.sleep(10)
browser.quit()

The output is incorrect. Why?

Try help(browser) for a list of appropriate methods and data contained in the browser object. It sounds like you can't cast browser to a string. There is likely another embedded object that contains the results you desire. — h0r53, Aug 17 '17 at 14:20
What exactly is incorrect about the result? Please read [ask] and [mre] and https://ericlippert.com/2014/03/05/how-to-debug-small-programs/. — Karl Knechtel, Aug 08 '22 at 01:33
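
To illustrate the first comment: printing an object shows only a description of the object itself, not the data it holds, so the data has to be read from its attributes. Below is a minimal sketch that uses a hypothetical stand-in class, `FakeDriver`, instead of a real Selenium driver, so it runs without a browser. The helper `describe` is also made up for illustration. A real WebDriver exposes the loaded page through attributes such as `title` and `page_source`.

```python
class FakeDriver:
    """Hypothetical stand-in for a WebDriver; not part of Selenium."""

    def __init__(self, title, page_source):
        self.title = title
        self.page_source = page_source


def describe(driver):
    """Return a summary built from the driver's attributes."""
    return '{}: {} characters of HTML'.format(driver.title, len(driver.page_source))


driver = FakeDriver('youtube - Google Search', '<html><body>results</body></html>')
print(driver)            # default representation: class name and memory address
print(describe(driver))  # data read from the object's attributes
```

With a real driver, the equivalent step would be to print `browser.title`, `browser.page_source`, or the text of elements found on the page, rather than `browser` itself.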

score 3 · Answer 1 · edited Mar 15 '18 at 03:16

I wrote a simple class that you can use; you only need to change the path to the webdriver. It was written for PhantomJS (available as a separate download), but if you want to use Chrome (or any other webdriver), replace the line self.driver = webdriver.PhantomJS(self.path) with self.driver = webdriver.Chrome(self.path). Example code is below:

import time
from urllib.parse import quote_plus
from selenium import webdriver


class Browser:

    def __init__(self, path, initiate=True, implicit_wait_time=10, explicit_wait_time=2):
        self.path = path
        self.implicit_wait_time = implicit_wait_time    # http://www.aptuz.com/blog/selenium-implicit-vs-explicit-waits/
        self.explicit_wait_time = explicit_wait_time    # http://www.aptuz.com/blog/selenium-implicit-vs-explicit-waits/
        if initiate:
            self.start()
        return

    def start(self):
        self.driver = webdriver.PhantomJS(self.path)
        self.driver.implicitly_wait(self.implicit_wait_time)
        return

    def end(self):
        self.driver.quit()
        return

    def go_to_url(self, url, wait_time=None):
        if wait_time is None:
            wait_time = self.explicit_wait_time
        self.driver.get(url)
        print('[*] Fetching results from: {}'.format(url))
        time.sleep(wait_time)
        return

    def get_search_url(self, query, page_num=0, per_page=10, lang='en'):
        query = quote_plus(query)
        url = 'https://www.google.hr/search?q={}&num={}&start={}&hl={}'.format(query, per_page, page_num*per_page, lang)
        return url

    def scrape(self):
        # The XPath might change in the future
        links = self.driver.find_elements_by_xpath("//h3[@class='r']/a[@href]")  # all links inside h3 tags with class "r"
        results = []
        for link in links:
            d = {'url': link.get_attribute('href'),
                 'title': link.text}
            results.append(d)
        return results

    def search(self, query, page_num=0, per_page=10, lang='en', wait_time=None):
        if wait_time is None:
            wait_time = self.explicit_wait_time
        url = self.get_search_url(query, page_num, per_page, lang)
        self.go_to_url(url, wait_time)
        results = self.scrape()
        return results




path = '<YOUR PATH TO PHANTOMJS>/phantomjs-2.1.1-windows/bin/phantomjs.exe'  # set your path to phantomjs
br = Browser(path)
results = br.search('site:facebook.com inurl:login')
for r in results:
    print(r)

br.end()
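
The URL-building step in get_search_url can be checked on its own, without starting a browser. Below is a minimal sketch that uses only the standard library. `build_search_url` is a hypothetical standalone helper, not part of the class above, and `hl` is assumed to be the language parameter Google accepts in its query string, which may change.

```python
from urllib.parse import urlencode


def build_search_url(query, page_num=0, per_page=10, lang='en'):
    """Build a search URL; urlencode escapes each value the way quote_plus does."""
    params = {
        'q': query,
        'num': per_page,
        'start': page_num * per_page,  # offset of the first result on the page
        'hl': lang,                    # assumed interface-language parameter
    }
    return 'https://www.google.hr/search?' + urlencode(params)


print(build_search_url('site:facebook.com inurl:login', page_num=2))
```

Passing a dict to `urlencode` keeps the escaping in one place, so the query text and the numeric parameters are encoded the same way.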

Shubham Jain · Answer 2 · 2017-08-17T14:31:27.500

In Java it would look something like this:

List<WebElement> print = driver.findElements(By.xpath("//div[@class='sbqs_c']"));
System.out.println(print.size());
for (WebElement we : print) {
    System.out.println(we.getText());
}

I am not a Python person, but it may look like this:

    from selenium import webdriver

    browser = webdriver.Chrome('/Users/Downloads/chromedriver')
    browser.get('http://www.google.com')
    search = browser.find_element_by_name('q')
    search.send_keys("youtube")
    # Collect the suggestion entries and print the text of each one
    ids = browser.find_elements_by_xpath("//div[@class='sbqs_c']")
    for ii in ids:
        print(ii.text)

Source: Iterate a list with indexes in Python

Hope it will help you :)

This question is asked about Python code and you say that your answer is based on one that was also written for Python... so why show any Java code? — Karl Knechtel, Aug 08 '22 at 01:34

score -1 · Answer 3 · answered Aug 17 '17 at 15:09

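from bs4 import BeautifulSoup

# html is assumed to hold the page source, e.g. browser.page_source
soup = BeautifulSoup(html, 'html.parser')
for link in soup.find_all('a'):
    print(link.get('href'))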

I found the answer to my own question using Beautiful Soup.
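
The same link extraction can be sketched with the standard library alone, which makes it easy to try the idea on a fixed string. This is an illustration, not the code from the answer: `LinkCollector`, `extract_links`, and the sample HTML are hypothetical, and `html.parser` stands in for Beautiful Soup.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href is not None:
                self.links.append(href)


def extract_links(html):
    """Parse an HTML string and return the collected href values in order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


sample = '<h3><a href="https://www.youtube.com/">YouTube</a></h3><a name="top">no href</a>'
for link in extract_links(sample):
    print(link)
```

Unlike the `link.get('href')` loop, which prints None for anchors without an href, this sketch skips them.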

Javaman
