-1

I am trying to scrape a website using the following code:

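import time
from selenium import webdriver

titles = []
# chrome_options and links are assumed to be defined earlier in the notebook
driver = webdriver.Chrome('chromedriver', chrome_options=chrome_options)
for link in links:
  driver.get(link)
  time.sleep(0.5)  # fixed pause to let the page load
  # collect the question title link(s) on the page
  data = driver.find_elements_by_xpath('.//a[@class = "question-hyperlink"]')
  titles.append(data[0].text)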

I am running this code on Google Colab. The problem is that after the loop has run for some iterations, data comes back empty. If I restart the kernel and rerun the code, the earlier iterations work fine and the same issue occurs at a different iteration. I am confused about why this is happening. I have tried many things but nothing has worked. Also, links is large, so is there any way to speed things up?

Edit: Added link to full code: https://colab.research.google.com/drive/1SYIA_SUPYzlR-K9ph4grNem-LbL61uB7?usp=sharing

2 Answers

0

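Try using WebDriverWait(driver, 10).until(...) with an expected condition, such as presence_of_element_located, instead of time.sleep(0.5). This waits up to ten seconds for the element to be present and returns as soon as it is found.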
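Since the list of links is large, it may help to see why a ten-second explicit wait does not normally cost ten seconds per page. The sketch below uses only the standard library to mimic the polling idea behind an explicit wait: the condition is checked repeatedly and the wait ends at the first truthy result. `wait_until` and `element_present` are hypothetical names for illustration, not Selenium API; with Selenium you would pass a condition to `WebDriverWait(driver, 10).until(...)` instead.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() until it returns a truthy value or timeout elapses.

    Hypothetical helper that mimics the polling idea behind an explicit
    wait; it is not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # stop as soon as the condition holds
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for a page whose element shows up on the third check.
checks = {"count": 0}


def element_present():
    checks["count"] += 1
    return "title link" if checks["count"] >= 3 else None


start = time.monotonic()
found = wait_until(element_present, timeout=10.0, poll=0.05)
elapsed = time.monotonic() - start
print(found, f"(after {checks['count']} checks, {elapsed:.2f} s)")
```

The full timeout is only spent on pages where the element never appears, so the total run time depends mostly on how quickly the pages load.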

XRaycat
  • But the size of links is 7000+ so won't it take too long if it waits for 10 seconds? – Tushar Agrawal Jul 31 '20 at 20:57
  • Now, I am getting a new error which was not there before, `Message: stale element reference: element is not attached to the page document`. What should I do? – Tushar Agrawal Jul 31 '20 at 21:02
  • Does the same error appear if you move `driver = webdriver.Chrome('chromedriver',chrome_options=chrome_options)` inside the for loop? – XRaycat Jul 31 '20 at 21:12
  • https://www.selenium.dev/exceptions/#stale_element_reference – XRaycat Jul 31 '20 at 21:15
  • You could also try `from selenium.webdriver.support import expected_conditions as C` and then use `C.element_to_be_clickable`. – XRaycat Jul 31 '20 at 21:18
  • With webdriver.Chrome(...) inside the loop it takes too long, and I cannot tell whether it will still work after some iterations. I also need this to run faster. The link says to try finding the element multiple times; I added three attempts but the issue is the same. I also tried element_to_be_clickable, with the same result. Please check the link and help. I have been doing this for hours and I am tired of it now. – Tushar Agrawal Jul 31 '20 at 21:42
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/219022/discussion-between-tushar-agrawal-and-xraycat). – Tushar Agrawal Aug 01 '20 at 02:37
0

Above the loop, you can set the implicit wait time of the driver:

driver.implicitly_wait(10) # 10 seconds for any element

This is similar to @XRayCat's answer.
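One caveat with either kind of wait: a find_elements-style call returns an empty list rather than raising when nothing matches before the timeout, so data[0] can still fail with an IndexError. Below is a minimal sketch of a guard, using stand-in objects instead of a real driver. `first_text`, the `failed` list and the sample links are hypothetical names for illustration only.

```python
from types import SimpleNamespace


def first_text(elements, default=None):
    """Return the .text of the first element, or default if the list is empty."""
    return elements[0].text if elements else default


# Stand-ins for the lists a find_elements call might return.
found = [SimpleNamespace(text="How do I scrape a page?")]
missing = []

titles = []
failed = []
for link, data in [("link-1", found), ("link-2", missing)]:
    title = first_text(data)
    if title is None:
        failed.append(link)  # remember the link so it can be retried later
    else:
        titles.append(title)

print(titles, failed)
```

Keeping the failed links in a separate list means the loop can finish and only those pages need a second pass.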

Mike67