1

I'm scraping some products on StockX. There is a popup element called sales history: I click the text link to open it, then loop through all the sales history by clicking the "Load More" button.

My problem is that this mostly works fine as I loop through URLs, but occasionally it hangs for a really long time on a page where the button is present but not clickable (and the bottom hasn't been reached either), so I believe it just stays in the loop. Any help with either breaking out of this loop or some workaround in Selenium would be awesome, thank you!


This is the function that I use to open the sales history information:

from bs4 import BeautifulSoup
from selenium import webdriver

url = "https://stockx.com/adidas-ultra-boost-royal-blue"
driver = webdriver.Firefox()
driver.get(url)
content = driver.page_source
soup = BeautifulSoup(content, 'lxml')


def get_sales_history():
    """ get sales history data from sales history table interaction """
    sales_hist_data = []

    try:
        # click 'View All Sales' text link
        View_all_sales_button = driver.find_element_by_xpath(".//div[@class='market-history-sales']/a[@class='all']")
        View_all_sales_button.click()

        # log in
        login_button = driver.find_element_by_id("nav-signup")
        login_button.click()

        # add username
        username = driver.find_element_by_id("email-login")
        username.clear()
        username.send_keys("email@email.com")

        # add password
        password = driver.find_element_by_name("password-login")
        password.clear()
        password.send_keys("password")
    except Exception:
        # Best effort: skip if the link or the login form is not present
        pass

    while True:
        try:
            # If 'Load More' Appears Click Button
            sales_hist_load_more_button = driver.find_element_by_xpath(
                ".//div[@class='latest-sales-container']/button[@class='button button-block button-white']")
            sales_hist_load_more_button.click()
        except Exception:
            # Button not found or not clickable: assume the bottom was reached
            break

    content = driver.page_source
    soup = BeautifulSoup(content, 'lxml')

    div = soup.find('div', class_='latest-sales-container')

    for td in div.find_all('td'):
        sales_hist_data.append(td.text)

    return sales_hist_data
Andre

2 Answers

1

You can wait for the button to become clickable using an explicit wait.

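from selenium.common.exceptions import StaleElementReferenceException, TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

while True:
    try:
        # If 'Load More' appears, click the button
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, ".//div[@class='latest-sales-container']/button[@class='button button-block button-white']"))).click()
    except StaleElementReferenceException:
        # Button was re-rendered; retry on the next iteration
        pass
    except TimeoutException:
        # No clickable button within 20 seconds; stop
        break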

Also, note that I have used two different exception handlers. If you get a stale element (possible, since you are clicking the same button after the page content refreshes), the loop ignores it and tries to click the same button again. If no clickable button is found within 20 seconds, a TimeoutException is raised and the loop breaks.
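If the button stays clickable but clicking it does nothing, the loop above never times out. Here is a minimal sketch of one way to bound it, assuming progress can be measured by counting the rows loaded so far. The helper name `click_until_exhausted` and its parameters are hypothetical, and it takes two callables so that it does not depend on any particular locator:

```python
import time


def click_until_exhausted(click_once, count_rows, max_stalls=3, pause=1.0, max_clicks=500):
    """Click 'Load More' until it disappears or stops producing new rows.

    click_once: callable that clicks the button and returns True, or returns
                False when the button can no longer be found or clicked.
    count_rows: callable returning how many rows are currently loaded.
    Returns why the loop ended: 'exhausted', 'stalled' or 'max_clicks'.
    """
    stalls = 0
    rows = count_rows()
    for _ in range(max_clicks):
        if not click_once():
            return "exhausted"      # button gone: all rows are loaded
        time.sleep(pause)           # give the page time to append rows
        new_rows = count_rows()
        if new_rows > rows:
            rows, stalls = new_rows, 0
        else:
            stalls += 1             # click had no visible effect
            if stalls >= max_stalls:
                return "stalled"    # give up on this page instead of hanging
    return "max_clicks"             # hard upper bound as a last resort
```

With Selenium, `click_once` could wrap the explicit wait and return False on TimeoutException, and `count_rows` could return `len(driver.find_elements(By.XPATH, "//div[@class='latest-sales-container']//tr"))`; that row locator is an assumption about the page markup. When the result is "stalled", the caller can skip or reload that URL rather than loop forever.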

rahul rai
  • what if the element exists like the image above but when clicked doesn't do anything? – Andre Sep 10 '20 at 18:02
  • As per your loop it will just pass and click again – rahul rai Sep 10 '20 at 18:04
  • Share the URL of the page and I will try to load the complete table. – rahul rai Sep 10 '20 at 18:05
  • url = "https://stockx.com/adidas-ultra-boost-royal-blue". The issue is that it will go through, say, 50-100 URLs and then all of a sudden stall on one of these pages, where the button exists but the loop never ends. – Andre Sep 10 '20 at 18:22
  • I tried the above URL and was able to click every Load More. The only change I can suggest is to move the mouse to Load More before clicking, like ```loadmore = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, ".//div[@class='latest-sales-container']/button[@class='button button-block button-white']"))); ActionChains(driver).move_to_element(loadmore).click().perform()``` – rahul rai Sep 11 '20 at 11:26
  • Do you need to assign this object to 'loadmore', though, or just do the click and perform in the loop? – Andre Sep 11 '20 at 13:48
0

To click on the element with the text View All Sales within the Last Sale block, and then click on the Load More element until all the sales history is loaded, you need to induce WebDriverWait for element_to_be_clickable() before each click and for visibility_of_all_elements_located() when reading the rows. You can use the following XPath-based locator strategies:

  • Code Block:

    import time

    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get('https://stockx.com/adidas-ultra-boost-royal-blue')
    time.sleep(20)  # time to interact with the location popup
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='last-sale-block']//a[text()='View All Sales']"))).click()
    while True:
        try:
            WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[@class='button button-block button-white' and text()='Load More']"))).click()
            print("Clicked on Load More")
            time.sleep(3)  # let the new rows render before the next click
        except TimeoutException:
            print("No more Load More")
            break
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='modal-body']//tbody//tr")))])
    
  • Console Output:

    Clicked on Load More
    Clicked on Load More
    Clicked on Load More
    No more Load More
    ['Sunday, August 2, 2020 2:16 AM EST 11 $236', 'Tuesday, June 2, 2020 7:34 AM EST 11 $262', 'Monday, April 27, 2020 11:03 AM EST 9 $143', 'Tuesday, January 7, 2020 8:54 AM EST 12.5 $137', 'Friday, December 27, 2019 12:30 PM EST 10 $307', 'Sunday, December 1, 2019 3:09 PM EST 8.5 $290', 'Tuesday, November 12, 2019 1:05 AM EST 12 $275', 'Tuesday, May 7, 2019 2:26 PM EST 8.5 $181', 'Saturday, April 27, 2019 1:04 PM EST 10 $228', 'Tuesday, March 5, 2019 12:25 AM EST 8.5 $230', 'Monday, November 5, 2018 1:35 AM EST 8 $320', 'Tuesday, August 28, 2018 7:29 PM EST 8.5 $240', 'Friday, August 24, 2018 10:26 PM EST 11 $580', 'Monday, July 16, 2018 10:02 PM EST 10.5 $255', 'Friday, July 6, 2018 2:44 PM EST 9 $260', 'Saturday, June 30, 2018 8:14 AM EST 9.5 $300', 'Tuesday, June 5, 2018 11:06 PM EST 10 $299', 'Saturday, May 12, 2018 10:48 AM EST 12 $371', 'Tuesday, March 20, 2018 1:09 AM EST 7.5 $279', 'Tuesday, March 20, 2018 11:17 PM EST 8 $250', 'Saturday, February 24, 2018 2:18 AM EST 7.5 $250', 'Monday, February 19, 2018 6:11 PM EST 7 $300', 'Sunday, February 18, 2018 2:05 PM EST 10 $400', 'Saturday, February 3, 2018 3:24 PM EST 7.5 $299', 'Thursday, January 25, 2018 11:13 PM EST 7 $190', 'Wednesday, December 27, 2017 11:09 PM EST 9 $355', 'Thursday, October 12, 2017 8:37 PM EST 8 $300', 'Friday, September 1, 2017 2:05 AM EST 12.5 $333', 'Friday, September 1, 2017 10:38 PM EST 12 $495', 'Saturday, August 5, 2017 10:53 AM EST 8 $355', 'Friday, August 4, 2017 3:28 AM EST 9.5 $325', 'Thursday, July 6, 2017 7:31 AM EST 10 $350', 'Tuesday, June 13, 2017 11:42 PM EST 9 $350', 'Monday, May 15, 2017 4:19 AM EST 11.5 $200', 'Sunday, May 14, 2017 3:42 PM EST 13 $370', 'Sunday, March 26, 2017 1:49 PM EST 11 $347', 'Sunday, August 21, 2016 7:33 PM EST 11 $250']
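Each row in the console output above is a single string. If separate fields are needed, one possible approach (a sketch, not part of the original answer) is a regular expression. The helper name `parse_sale_row` and the field names are hypothetical, and the pattern assumes every row looks like the ones printed above: weekday, date, time, timezone label, size, price.

```python
import re
from datetime import datetime

# Pattern inferred from the console output above; an assumption, not a StockX guarantee.
ROW_RE = re.compile(
    r"^(?P<date>\w+, \w+ \d{1,2}, \d{4}) "
    r"(?P<time>\d{1,2}:\d{2} [AP]M) "
    r"(?P<tz>[A-Z]+) "
    r"(?P<size>\S+) "
    r"\$(?P<price>[\d,]+)$"
)


def parse_sale_row(row):
    """Split one row string into fields; return None if it does not match."""
    match = ROW_RE.match(row.strip())
    if match is None:
        return None
    sold_at = datetime.strptime(
        f"{match['date']} {match['time']}", "%A, %B %d, %Y %I:%M %p"
    )
    return {
        "sold_at": sold_at,      # naive datetime; timezone label kept separately
        "tz": match["tz"],
        "size": match["size"],   # kept as text because half sizes such as 12.5 occur
        "price": int(match["price"].replace(",", "")),
    }


print(parse_sale_row("Sunday, August 2, 2020 2:16 AM EST 11 $236"))
```

Returning None for rows that do not match makes it easy to spot layout changes without crashing the whole scrape.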
    
undetected Selenium
  • I'm not sure this works. There is a "Load More" button in either View All Sales text element. Your console output only pulls part of the historical sales for that URL. If you exhaust the Load More button for this URL, the date range goes from August 2, 2020 to August 20, 2016. – Andre Sep 10 '20 at 19:24
  • `['11', '$236', 'Sunday, August 2, 2020', '2:16 AM EST', '11', '$262', 'Tuesday, June 2, 2020', '7:34 AM EST', '9', '$143', ....... 'Sunday, March 26, 2017', '1:49 PM EST', '11', '$250', 'Saturday, August 20, 2016', '7:33 PM EST']` – Andre Sep 10 '20 at 19:39
  • @Andre Checkout the updated answer and let me know the status – undetected Selenium Sep 10 '20 at 20:05
  • I'm trying it now with some edits, thanks Debanjan! I made some tweaks to fit it into the function I have above, which clicks the view all sales button and has the rest built in. – Andre Sep 10 '20 at 20:22
  • I just started running it, but I believe your code is only doing 1 "Load More" click even if there are more "clicks" available. Sorry man! – Andre Sep 10 '20 at 20:52
  • @Andre Checkout the updated answer and let me know the status. – undetected Selenium Sep 10 '20 at 21:08
  • The issue for me really is coming across a stale button: there is still a button available to view more results, but the button becomes inactive / stale. I've been able to load everything and scrape with the code above, but after about 50-100 URLs it just goes stale, and I'm wondering how to pass with execution or something. – Andre Sep 11 '20 at 13:46