I'm scraping a webpage and, for some reason, it correctly returns the text of the first 12 elements but not the remaining 24, out of the 36 shown on the page.

search_names = driver.find_elements_by_class_name('offerList-item-description-title')
names = []
for name in search_names:
    names.append(name.text)

search_names has a length of 36, but names ends up containing the following (sample):

[1 , 2, 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , '', '', ... , '']

Any idea on why this might be happening?

Here's a snippet of the source code: Source Code Sample

undetected Selenium
Fox Wox

1 Answer

To extract the text from all of the elements with the class offerList-item-description-title using Selenium, you have to induce WebDriverWait for visibility_of_all_elements_located(). You can use any of the following Locator Strategies:

  • Using CLASS_NAME:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "offerList-item-description-title")))])
    
  • Using CSS_SELECTOR:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.offerList-item-description-title")))])
    
  • Using XPATH:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='offerList-item-description-title']")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
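
A likely reason for the empty strings, consistent with the visibility wait above, is that Selenium's `.text` returns only rendered text, so an element that is not displayed yields `''`. As a minimal sketch, assuming you still want the text of such elements, you could fall back to the DOM `textContent` property. The helper name `texts_with_fallback` is hypothetical and not part of Selenium; it relies only on the `.text` attribute and `get_attribute()` method that WebElement provides.

```python
def texts_with_fallback(elements):
    """Return one string per element, preferring the rendered text.

    `elements` is any iterable of objects exposing `.text` and
    `.get_attribute(name)`, such as Selenium WebElements.
    """
    results = []
    for element in elements:
        # .text is empty when the element is not currently displayed.
        text = element.text
        if not text:
            # textContent is read from the DOM regardless of visibility;
            # get_attribute may return None, so default to "".
            text = element.get_attribute("textContent") or ""
        results.append(text.strip())
    return results
```

With the question's locator, this might be called as `names = texts_with_fallback(driver.find_elements(By.CLASS_NAME, "offerList-item-description-title"))`. Unlike the visibility wait, this does not wait for anything to load, so it only helps when the elements are already in the DOM.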
    
undetected Selenium
  • Thanks for the help, but I'm getting this error: Traceback (most recent call last): print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "offerList-item-description-title")))]) File "D:\Programas\Python\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: – Fox Wox Nov 24 '20 at 12:48
  • I have changed `get_attribute("innerHTML")` to `.text`. Please retest with xpath and css and let me know the status. – undetected Selenium Nov 24 '20 at 12:50
  • Got it working DebanjanB. Thank you so much! Would you recommend always using this method for storing elements in a list? Or just in case of invisible elements? – Fox Wox Nov 24 '20 at 13:51
  • @FoxWox Ideally, you have to keep the webdriver instance and the browser instance in sync at all times when performing actions with Selenium. Hence you need to induce the proper type of [WebDriverWait](https://stackoverflow.com/questions/52603847/how-to-sleep-webdriver-in-python-for-milliseconds/52607451#52607451) – undetected Selenium Nov 24 '20 at 13:54