I am trying to make a scraper that goes through a list of links, exports each guide as a PDF, and loops through all the guides in the parent folder. It works fine going in, but when I try to go backwards it throws stale element exceptions, even though I refresh the elements in the code (and have tried refreshing the page).
from selenium import webdriver
import time, bs4
browser = webdriver.Firefox()
browser.get('MYURL')
loginElem = browser.find_element_by_id('email')
loginElem.send_keys('LOGIN')
pwdElem = browser.find_element_by_id('password')
pwdElem.send_keys('PASSWORD')
pwdElem.submit()
time.sleep(3)
category = browser.find_elements_by_class_name('title')
for i in category:
    i.click()
    time.sleep(3)
    guide = browser.find_elements_by_class_name('cell')
    for j in guide:
        j.click()
        time.sleep(3)
        soup = bs4.BeautifulSoup(browser.page_source, features="html.parser")
        guidetitle = soup.find_all(id='guide-intro-title')
        print(guidetitle)
        browser.find_element_by_link_text('Options').click()
        time.sleep(0.5)
        browser.find_element_by_partial_link_text('Download PDF').click()
        browser.find_element_by_id('download').click()
        browser.execute_script("window.history.go(-2)")
        print("went back")
        time.sleep(5)
        print("waited")
        guide = browser.find_elements_by_class_name('thumb')
        print("refreshed elements")
    print("made it to outer loop")
This happens whether I move the browser back with a script or with the driver.back() method. I can see that it makes it back to the child directory, waits, and refreshes the elements, but then it can't seem to load the next element to go into the next guide. I found a similar question here on SO, but the answer just provided code tailored to that problem without explaining it, so I am still confused.
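For reference, one pattern I have been experimenting with is iterating by index and re-running the locator before each visit, so the loop never holds a WebElement reference from before the navigation. This is just a sketch, not my working code: `for_each_fresh` is a helper name I made up, and only the re-find logic matters.

```python
def for_each_fresh(find_all, visit):
    """Visit every item in a list whose references go stale after navigation.

    Instead of holding WebElement references across page loads, re-run the
    locator (find_all) before each visit so the reference is always taken
    from the current DOM.
    """
    count = len(find_all())
    for idx in range(count):
        items = find_all()    # fresh lookup on every pass
        visit(items[idx])     # e.g. click into the guide, export, go back


# With Selenium this would be called roughly like:
#   for_each_fresh(
#       lambda: browser.find_elements_by_class_name('cell'),
#       lambda cell: cell.click())
```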
I also know about WebDriverWait, but I am just using sleep for now since I don't fully understand the EC wait conditions. In any case, increasing the sleep time doesn't fix this issue.
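My understanding is that WebDriverWait is basically just polling a condition until it's truthy or a timeout expires. Here is a pure-Python stand-in I wrote to convince myself of that (`wait_until` is my own name, not Selenium's API; the commented-out lines show what I believe the real Selenium equivalent looks like):

```python
import time

def wait_until(condition, timeout=10, poll=0.5):
    """Minimal stand-in for WebDriverWait: call `condition` repeatedly
    until it returns a truthy value, or raise after `timeout` seconds."""
    end = time.time() + timeout
    while time.time() < end:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise TimeoutError("condition not met within %s seconds" % timeout)


# The real Selenium version would replace the manual sleeps with e.g.:
#   from selenium.webdriver.support.ui import WebDriverWait
#   from selenium.webdriver.support import expected_conditions as EC
#   from selenium.webdriver.common.by import By
#   WebDriverWait(browser, 10).until(
#       EC.presence_of_all_elements_located((By.CLASS_NAME, 'cell')))
```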