
I'm building a web scraper that can handle the "show more" feature of the NY Times search results, for a research project I'm taking part in.

I'm currently working on a driver that clicks the "show more" button until all the results are loaded; after that I will scrape each link with Beautiful Soup.

The problem is that I can't get the Selenium driver to click the "show more" button.

When I run the code below, the browser opens and then closes, and the only output is "No more Show more button".

I have checked that the CSS selector doesn't change with each click, and I have also tried using an XPath.

I'd really appreciate any insight you might have:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time

chrome_path = r"/Users/user_name/Desktop/chromedriver_mac64/chromedriver"
service = Service(chrome_path)
driver = webdriver.Chrome(service=service)
driver.get("https://www.nytimes.com/search?dropmab=false&endDate=20230321&query=covid-19&sort=best&startDate=20200101")
time.sleep(3)
while True:
    try:       
        show_more = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.css-vsuiox button'))).click()
        driver.execute_script("arguments[0].click();", show_more)
        print("Show more button clicked")
        time.sleep(2)
    except:
        print("No more Show more button")
        break

I expected the script to open the browser and click the "show more" button until every page was loaded.

The output I got was:

No more Show more button
undetected Selenium
canurdon

1 Answer


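You were close. To click the Show More button, induce WebDriverWait for element_to_be_clickable() instead of visibility_of_element_located(), and use the locator strategy below. Note also that click() returns None, so in your code show_more is None; the execute_script() call that follows then most likely raises an exception, which the bare except swallows, and that would explain why "No more Show more button" is printed without a single "Show more button clicked". Catching only TimeoutException keeps such errors visible: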

  • Code Block:

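    # Needed in addition to the imports from the question
    from selenium.common.exceptions import TimeoutException

    driver.get("https://www.nytimes.com/search?dropmab=false&endDate=20230321&query=covid-19&sort=best&startDate=20200101")
    while True:
        try:
            # Wait up to 10 seconds for the button to become clickable, then click it
            WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'button[data-testid=search-show-more-button]'))).click()
            print("Show more button clicked")
        except TimeoutException:
            # The button did not become clickable in time: assume all results are loaded
            print("No more Show more button")
            break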
    
  • Console Output:

    Show more button clicked
    Show more button clicked
    Show more button clicked
    Show more button clicked
    Show more button clicked
    ...
    ...
    ...
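If you want to reuse the loop above, one possible way to structure it is sketched below. This is not part of Selenium: `click_until_gone`, `wait_for_button`, `stop_on`, and `max_clicks` are hypothetical names chosen for illustration. The idea is to keep the wait separate from the click (so nothing depends on the None returned by click()), to stop only on the one exception type you expect, and to cap the number of clicks so a search with a very large number of results cannot loop indefinitely.

```python
def click_until_gone(wait_for_button, stop_on, max_clicks=1000):
    """Click a button repeatedly until waiting for it raises `stop_on`.

    wait_for_button: callable that returns a clickable element or raises `stop_on`
    stop_on:         exception type meaning "the button is gone" (for example
                     selenium.common.exceptions.TimeoutException)
    max_clicks:      safety cap (an assumption of this sketch, not a Selenium setting)

    Returns the number of clicks performed. Any other exception propagates,
    so unrelated errors are not silently reported as "no more button".
    """
    clicks = 0
    while clicks < max_clicks:
        try:
            button = wait_for_button()
        except stop_on:
            break
        button.click()
        clicks += 1
    return clicks


# Possible use with the driver and imports from the code block above
# (left as comments so this sketch does not require a browser to run):
#
# locator = (By.CSS_SELECTOR, 'button[data-testid=search-show-more-button]')
# total = click_until_gone(
#     lambda: WebDriverWait(driver, 10).until(EC.element_to_be_clickable(locator)),
#     TimeoutException,
# )
# print(f"Clicked Show more {total} times")
```

Because the helper only needs a callable and an exception type, it can be exercised with stand-in objects before pointing it at the live site.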
    
undetected Selenium