
I'm trying to gather some information from certain webpages using Selenium and Python. I have working code for a single tab. But now I have a situation where I need to open 50 tabs in Chrome at once and process the data on each page.

1) Open 50 tabs at once - I already have the code for this.
2) Switch control between tabs, process the information from the page, close the tab, move to the next tab, and do the same there.

from selenium import webdriver 
from selenium.webdriver.common.by import By 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 
from selenium.common.exceptions import TimeoutException
import psycopg2
import os
import datetime

final_results=[]
positions=[]
saerched_url=[]

options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
#options.add_argument('--headless')
options.add_argument("--incognito")
browser = webdriver.Chrome(executable_path='/users/user_123/downloads/chrome_driver/chromedriver', options=options)
browser.implicitly_wait(20)

#def db_connect():
try:
     DSN = "dbname='postgres' user='postgres' host='localhost' password='postgres' port='5432'"
     TABLE_NAME = 'staging.search_url'
     conn = psycopg2.connect(DSN)
     print("Database connected...")
     cur = conn.cursor()
     cur.execute("SET datestyle='German'")
except (Exception, psycopg2.Error) as error:
     print('database connection failed')
     quit()

def get_products(url):
    browser.get(url)
    names = browser.find_elements_by_xpath("//span[@class='pymv4e']")
    upd_product_name_list=list(filter(None, names))
    product_name = [x.text for x in upd_product_name_list]
    product = [x for x in product_name if len(x.strip()) > 2]
    upd_product_name_list.clear()
    product_name.clear()
    return product


links = ['https://www.google.com/search?q=Vitamin+D',
'https://www.google.com/search?q=Vitamin+D3',
'https://www.google.com/search?q=Vitamin+D+K2',
'https://www.google.com/search?q=D3',
'https://www.google.com/search?q=Vitamin+D+1000']

for link in links:
    # optional: we can wait for the new tab to open by comparing window handles count before & after
    tabs_count_before = len(browser.window_handles)

    # open a link
    control_string = "window.open('{0}')".format(link)
    browser.execute_script(control_string)

    # optional: wait for windows count to increment to ensure new tab is opened
    WebDriverWait(browser, 1).until(lambda browser: tabs_count_before != len(browser.window_handles))

    # get list of currently opened tabs
    tabs_list = browser.window_handles
    print(tabs_list)
    # switch control to newly opened tab (the last one in the list)
    last_tab_opened = tabs_list[-1]
    browser.switch_to.window(last_tab_opened)

    # now you can process data on the newly opened tab
    print(browser.title)


for lists in tabs_list:
    last_tab_opened = tabs_list[-1]
    browser.switch_to.window(last_tab_opened)
    filtered=[]
    filtered.clear()
    filtered = get_products(link)
    saerched_url.clear()
    if not filtered:
        # retry with 'kaufen' appended to the search query when nothing was found
        new_url=link+'+kaufen'
        filtered = get_products(new_url)
        print('Modified URL: '+new_url)

    if filtered:
        print(filtered)
        positions.clear()
        for x in range(1, len(filtered)+1):
            positions.append(str(x))
            saerched_url.append(link)

        gobal_position=0
        gobal_position=len(positions)
        print('global position first: '+str(gobal_position))
        print("\n")

        company_name_list = browser.find_elements_by_xpath("//div[@class='LbUacb']")
        company = []
        company.clear()
        company = [x.text for x in company_name_list]
        print('Company Name:')
        print(company, '\n')


        price_list = browser.find_elements_by_xpath("//div[@class='e10twf T4OwTb']")
        price = []
        price.clear()
        price = [x.text for x in price_list]
        print('Price:')
        print(price)
        print("\n")

        urls=[]
        urls.clear()
        find_href = browser.find_elements_by_xpath("//a[@class='plantl pla-unit-single-clickable-target clickable-card']")
        for my_href in find_href:
            url_list=my_href.get_attribute("href")
            urls.append(url_list)

        print('Final Result: ')
        result = zip(positions,filtered, urls, company,price,saerched_url)
        final_results.clear()
        final_results.append(tuple(result))
        print(final_results)
        print("\n")


        print('global position end: '+str(gobal_position))
        i=0
        try:
            for d in final_results:
                while i < gobal_position:
                    print(d[i])
                    cur.execute("""INSERT into staging.pla_crawler_results(position, product_name, url,company,price,searched_url) VALUES (%s, %s, %s,%s, %s,%s)""", d[i])
                    print('Inserted successfully')
                    conn.commit()
                    i=i+1
        except (Exception, psycopg2.Error) as error:
            print(error)
            pass


    browser.close()

1 Answer


Ideally you shouldn't attempt to open 50 tabs at once: every extra tab adds memory and CPU overhead, and Selenium can only interact with one tab (the one that currently has focus) at a time, so the other 49 would just sit idle.


Solution

If you have a list of URLs as follows:

['https://selenium.dev/downloads/', 'https://selenium.dev/documentation/en/']

You can iterate over the list and open the URLs one by one, each in a fresh browser session, scrape the page, and then quit that session before moving on:

  • Code Block:

    from selenium import webdriver
    
    links = ['https://selenium.dev/downloads/', 'https://selenium.dev/documentation/en/']
    options = webdriver.ChromeOptions() 
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    for link in links:
        driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
        driver.get(link)
        print(driver.title)
        print("Perform webscraping here")
        driver.quit()
    print("End of program")
    
  • Console Output:

    Downloads
    Perform webscraping here
    The Selenium Browser Automation Project :: Documentation for Selenium
    Perform webscraping here
    End of program
    
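If launching a fresh Chrome instance for every URL turns out to be too slow, the same single-tab idea can be kept while reusing one driver and simply navigating it from link to link. A minimal sketch, assuming the same links, options and chromedriver path as in the code block above:

    from selenium import webdriver

    links = ['https://selenium.dev/downloads/', 'https://selenium.dev/documentation/en/']
    options = webdriver.ChromeOptions()
    options.add_argument("start-maximized")
    # one driver instance, reused for every URL
    driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    for link in links:
        driver.get(link)  # navigate the existing tab instead of starting a new browser
        print(driver.title)
        print("Perform webscraping here")
    driver.quit()
    print("End of program")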

  • @Sandeep Check out the answer update and let me know the status. – undetected Selenium Jan 24 '20 at 11:30
  • Many thanks for this effort :) I have checked this code: it opens each link, processes it, closes the browser, and then opens a new Chrome instance for the next link until the end. But that is not the result we want. We need to open 50 tabs at once, and that part is fine; the issue is that after opening those 50 tabs it only processes the last active tab, and at the end, when I call "browser.close()", I get a "NoSuchWindowException". – Sandeep Jan 24 '20 at 11:44
  • So I need a solution for that: close the last active window, move control to the next tab, process it, close it, and move on to the next tab in the same way. Any suggestions on what I need to do to achieve this? – Sandeep Jan 24 '20 at 11:45 (a sketch of this handle-switching pattern is added after this thread)
  • Try using multiprocessing or multithreading. – Jawad Ahmad Khan Nov 27 '21 at 08:23 (a multiprocessing sketch is added after this thread)
  • @DebanjanB He was asking for simultaneously controlling all the tabs, which can only be done with a multithreaded or multiprocessing approach, while the answer focuses on a single-tab approach which is irrelevant to the question asked. – Jawad Ahmad Khan Nov 27 '21 at 08:44
  • @JawadAhmadKhan Did you go through the complete answer at least once? Didn't you find the rationale for why I suggested a single tab instead of nearly 50 odd tabs? I think my suggestion was backed by some solid reasoning. Your thoughts please... – undetected Selenium Nov 27 '21 at 08:51
  • I understand it, but I believe it's better to let someone else answer using the multiprocessing approach, which solves the exact problem someone is having, instead of an answer which doesn't help at all. I am writing an answer shortly; it may help. – Jawad Ahmad Khan Nov 27 '21 at 08:55
  • @JawadAhmadKhan Isn't _multiprocessing_ just another approach similar to a single tab iterating over multiple tabs? In the question I don't even see that _multiprocessing_ was a mandatory requirement. – undetected Selenium Nov 27 '21 at 09:00
  • Yes, but it works simultaneously, without any need to focus on a tab or switch between them. – Jawad Ahmad Khan Nov 27 '21 at 09:04
  • So you mean to say _Selenium_ works even without focus on specific tabs? Strange enough !!! – undetected Selenium Nov 27 '21 at 09:06
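For the workflow described in the comments above (open all the tabs first, then visit each one, scrape it, and close it), the NoSuchWindowException can be avoided by iterating over browser.window_handles, switching to a handle before reading the page, and switching back to a still-open handle after every close(). A minimal sketch, assuming the browser, links list, implicit wait and XPath from the question (Selenium 3 API):

    # open every link in its own tab first
    for link in links:
        browser.execute_script("window.open('{0}')".format(link))
        # (the WebDriverWait handle-count check from the question can be kept here)

    # the slice materialises a snapshot of the handles; index 0 is the original blank tab
    for handle in browser.window_handles[1:]:
        browser.switch_to.window(handle)                      # focus the tab before reading it
        names = browser.find_elements_by_xpath("//span[@class='pymv4e']")
        products = [x.text for x in names if len(x.text.strip()) > 2]
        print(browser.current_url, products)
        browser.close()                                       # close the tab just processed
        browser.switch_to.window(browser.window_handles[0])   # keep the driver on a live window

    browser.quit()

Switching back to window_handles[0] after every close() is what keeps the driver pointed at a live window and prevents the NoSuchWindowException; the implicitly_wait(20) already set in the question gives each tab time to render before its elements are read.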
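For the multiprocessing approach suggested in the comments, each worker process owns its own Chrome instance, so there is no tab focus to manage at all. A minimal sketch only: the scrape() helper, the chromedriver path and the pool size are assumptions, and the XPath is carried over from the question:

    from multiprocessing import Pool

    from selenium import webdriver

    def scrape(url):
        # every worker creates and owns its own browser, so nothing is shared between processes
        options = webdriver.ChromeOptions()
        options.add_argument('--headless')
        driver = webdriver.Chrome(executable_path='/path/to/chromedriver', options=options)  # placeholder path
        try:
            driver.get(url)
            names = driver.find_elements_by_xpath("//span[@class='pymv4e']")
            return url, [x.text for x in names if len(x.text.strip()) > 2]
        finally:
            driver.quit()

    if __name__ == '__main__':
        links = ['https://www.google.com/search?q=Vitamin+D',
                 'https://www.google.com/search?q=Vitamin+D3']
        with Pool(processes=2) as pool:  # keep the pool well below 50 to limit memory use
            for url, products in pool.map(scrape, links):
                print(url, products)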