I am trying to retrieve all the links to the posts of an Instagram account. The structure is a bit nested: first I locate the class by XPath where all of those links live, and then I iterate over the web elements (posts) to extract the links. However, this approach throws a StaleElementReferenceException.
My question is: how should I design a loop, using WebDriverWait with By.CSS_SELECTOR, to extract the links and store them in a single list?
I have read about WebDriverWait and tried to implement it, yet I am stuck doing it properly, since none of my attempts seem to work.
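For context, this is the general explicit-wait pattern I have been trying to adapt (the driver path, URL and selector here are only placeholders, not my real values):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome('/path/to/chromedriver')  # placeholder path
browser.get("https://example.com")                   # placeholder URL

# wait until at least one matching element is present, then work with the
# freshly returned elements instead of references located earlier
elements = WebDriverWait(browser, 20).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, "a"))
)
hrefs = [el.get_attribute("href") for el in elements]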
I have searched existing questions and found two links that were very helpful; however, neither of them deals with By.CSS_SELECTOR to extract an href.
These are the links: StaleElementException when iterating with Python
My current code, which goes into an infinite loop:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

def getting_comment(instagram_page, xpath_to_links, xpath_to_comments):
    global allComments
    links = []
    scheight = .1
    posts = []
    browser = webdriver.Chrome('/Users/marialavrovskaa/desktop/chromedriver')
    browser.get(f"{instagram_page}")
    # scroll down in small steps so more posts get loaded
    while scheight < 9.9:
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight/%s);" % scheight)
        scheight += .01
    posts = browser.find_elements_by_xpath(f"//div[@class='{xpath_to_links}']")
    for elem in posts:
        # this inner loop only exits when the wait times out
        while True:
            try:
                WebDriverWait(elem, 20).until(
                    EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".a")))
                links.append(elem.find_element_by_css_selector('a').get_attribute('href'))
            except TimeoutException:
                break
The values I call it with:

instagram_page = "https://www.instagram.com/titovby/?hl=ru"
xpath_to_links = "v1Nh3 kIKUG _bz0w"
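For clarity, this is roughly the shape of the loop I think I need: re-locating the anchors with a CSS selector on every pass instead of reusing the earlier references, and waiting for them explicitly. The selector "div.v1Nh3.kIKUG._bz0w a", the helper name get_post_links and the scroll/pause parameters are just my guesses, not working code:

import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

def get_post_links(instagram_page, css_to_links, scrolls=10, pause=1.0):
    browser = webdriver.Chrome('/Users/marialavrovskaa/desktop/chromedriver')
    browser.get(instagram_page)
    links = set()
    for _ in range(scrolls):  # bounded number of scrolls instead of while True
        try:
            # re-locate the anchors on every pass so no stale reference is reused
            anchors = WebDriverWait(browser, 20).until(
                EC.presence_of_all_elements_located((By.CSS_SELECTOR, css_to_links))
            )
        except TimeoutException:
            break
        for a in anchors:
            links.add(a.get_attribute("href"))
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give newly loaded posts time to appear before re-locating
    browser.quit()
    return list(links)

# my guess at the call:
# get_post_links("https://www.instagram.com/titovby/?hl=ru", "div.v1Nh3.kIKUG._bz0w a")

Would this be the right direction, or is there a cleaner way to structure the wait?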