I'm trying to extract the names of 1268 companies from the website of an industrial fair, which has published its exhibitor list on this page.

Unfortunately, Selenium does not seem to find the elements inside the specific inner part of the page that contains the companies' names and has its own scrollbar.

Here's my code:

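import time

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

g = webdriver.Chrome()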
g.get("https://www.ecomondo.com/elenco-espositori/espositori-ecomondo")
g.maximize_window()
time.sleep(2)
cookie = WebDriverWait(g,15).until(
    EC.presence_of_element_located((By.XPATH, '//*[@id="c-p-bn"]'))
)
cookie.click()
time.sleep(5)
element = g.find_element_by_class_name('sc-1aq2rfp-0 sc-li856a-3 eqfJYB euBeDv')
ActionChains(g).move_to_element_with_offset(element, 0, 0).perform()

company_name = g.find_elements_by_xpath('//*[@id="__next"]/div[3]/div/div/div/div/div/a/div/div/span[1]')
print(company_name)

I also tried finding the element by xpath but the result is the same:

Message: no such element: Unable to locate element: {"method":"css selector","selector":".sc-1aq2rfp-0 sc-li856a-3 eqfJYB euBeDv"}
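
A side note on this message: `find_element_by_class_name` expects a single class name. Given a space-separated string, the driver ends up with the CSS selector shown above, where only the first token has a leading dot and the remaining tokens are read as descendant tag names, so nothing can match. Below is a minimal sketch of building a compound selector instead. The helper name `classes_to_css` is hypothetical, generated class names like these may change between site builds, and a correct selector alone would still not find an element that sits inside an iframe.

```python
def classes_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into one compound CSS selector."""
    # Each class gets its own leading dot and there are no spaces in between,
    # so the selector targets a single element that carries all the classes.
    return "".join("." + name for name in class_attr.split())


selector = classes_to_css("sc-1aq2rfp-0 sc-li856a-3 eqfJYB euBeDv")
print(selector)  # .sc-1aq2rfp-0.sc-li856a-3.eqfJYB.euBeDv

# With a live driver (not executed here):
# element = g.find_element(By.CSS_SELECTOR, selector)
```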

After finding the element, I will need to scroll the sidebar down to make all 1268 company names visible and eventually extract them, but that is a separate problem.

Any hints?

1 Answer

The desired elements are within an <iframe> so you have to:

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for the desired elements to be visible.

  • You can use the following Locator Strategies:

    driver.get('https://www.ecomondo.com/elenco-espositori/espositori-ecomondo')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='ACCETTA TUTTI I COOKIE E CONTINUA']"))).click()
    WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[starts-with(@src, 'https://ecomondo.app.swapcard.com/widget/event/ecomondo-and-key-energy-2020/exhibitors')]")))
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='infinite-scroll-component ']//a//following::span[1]")))])
    
  • Console Output:

    ['2LNG', 'D5/080', '3M ITALIA SRL', 'D3/011', '3META SRL', 'C6/002', '3U VISION SRL', 'A3/178', '3V GREEN EAGLE SPA', 'C1/105', '4 ESSE SRL', 'B4/007', '4SERVICE EUROPE S.R.L.', 'A3/027', '9-TECH SRL', 'SUD/064', 'A.B.C. BILANCE SRL', 'A2/068', 'A.C.R. DI REGGIANI ALBERTINO SPA', 'C1/152', 'A.E.C. SRL', 'A7/014', 'A.I.D.P.I. ASSOCIAZIONE IMPRESE DISINFESTAZIONEPROFESSIONALI ITALIANE', 'A5-C5/002 F', 'A.I.R.E.C. - ASSOCIAZIONE ITALIANA DEL RECUPERO ENERGETICO DA COMBUSTIBILI SOLIDI SECONDARI', 'B1/013', 'A.M.S. SPA - ATTREZZATURE MECCANICHE SPECIALI', 'C7/003', 'A.T. RICAMBI SRL', 'B1/164', 'A.U. ESSE SRL', 'A5/036', 'A2A ENERGIA SPA', 'B5/040', 'A2A ENERGY SOLUTIONS SRL', 'B5/040', 'A2A SPA', 'B5/040', 'AB ENERGY SPA', 'B5-D5/005', "ABICERT L' ENTE DI CERTIFICAZIONE", 'C4/009', "ACCIAI DI QUALITA' SPA", 'A1/058', 'ACEA SPA', 'D1/160', 'ACQUEDOTTO PUGLIESE SPA', 'D2/002', 'ACR+', 'B5/008', 'ACTA SRL', 'C1/002', 'ADAMBÍ - ADGENERA SRL', 'A5/143', 'ADAMOLI SRLS', 'C5/140', 'ADDAX MOTORS NV', 'A6/022', 'ADEME AGENCE DE LA TRANSITION ECOLOGIQUE', 'ADICOMP S.R.L.', 'D5/133', 'ADRIAECO', 'ADRIATICA ACQUE SRL', 'B4/029', 'AEBI SCHMIDT ITALIA SRL', 'A7/002', 'AEBIG - ASOCIACION ESPANOLA DE BIOGAS', 'ÆVOLUTION MATEUSZ WIELOPOLSKI CONSULTING', 'B5/045', 'AFFILOR SRL', 'A1/001', 'AGECO DUE SPA', 'C1/186', 'AGENCE NATIONALE DES DECHETS', 'B5/008', 'AGENDA SRL - EDIZIONI ACQUAGENDA, GASAGENDA, WATERGAS.IT', 'D3/192', "AGENZIA REGIONALE PER LA TUTELA DELL'AMBIENTE", 'D4/032', 'AGER PUGLIA', 'D2/002', 'AGRICOLUS SRL', 'SUD/063', 'AGRICOM SRL', 'D5/157', 'AGRIGARDEN AMBIENTE SRL A SOCIO UNICO', 'C1/056', 'AGRIPLAST SRL', 'C1/139', 'AGRITECH SRL', 'D4/011', 'AGROTEL GMBH', 'D5/029', 'AIMAG SPA', 'B3/109', 'AIR CLEAN SRL', 'A3/101']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
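
For the follow-up step mentioned in the question (scrolling until all names are loaded), one possible approach is to keep scrolling the last matched element into view and stop once a round yields no new text. This is only a sketch under assumptions: the function name `collect_while_scrolling` and its parameters are hypothetical, it has not been checked against the live page, and it assumes the driver has already switched into the iframe as above. It passes the string "xpath", which is the value of `By.XPATH`, so the block itself needs no Selenium import.

```python
import time


def collect_while_scrolling(driver, xpath, max_rounds=200, pause=1.0):
    """Collect unique element texts, scrolling until no new ones appear."""
    seen = set()
    ordered = []
    for _ in range(max_rounds):
        # "xpath" is the string value of selenium's By.XPATH.
        elems = driver.find_elements("xpath", xpath)
        new_count = 0
        for elem in elems:
            text = elem.text
            # De-duplicate by text: fine for names, but values shared by
            # several rows (e.g. a common stand code) are kept only once.
            if text and text not in seen:
                seen.add(text)
                ordered.append(text)
                new_count += 1
        if not elems or new_count == 0:
            break  # nothing new was loaded in this round, assume the end
        # Bring the last loaded item into view to trigger further loading.
        driver.execute_script("arguments[0].scrollIntoView();", elems[-1])
        time.sleep(pause)  # crude wait; may stop early if loading is slow
    return ordered


# Usage with a live driver (not executed here):
# names = collect_while_scrolling(
#     driver, "//div[@class='infinite-scroll-component ']//a//following::span[1]")
```

The fixed pause is the weakest part of this design: if the page loads the next batch more slowly than `pause`, the loop ends too early, so a larger value or an explicit wait on the element count may be needed.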
    

  • This answer is very easy to understand and professional, thank you. But even though I am now able to get the companies' names, my next task is to scroll down the – Leonardo Acquaroli Mar 12 '22 at 11:10
  • UPDATE: Today the code you provided also raises a TimeoutException, generated by WebDriverWait(g, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='infinite-scroll-component ']//a//following::span[1]"))) – Leonardo Acquaroli Mar 12 '22 at 12:33
  • And this happens when I try to manually scroll the iframe element pausing the programme with an input() command. – Leonardo Acquaroli Mar 12 '22 at 12:48