
The element I'm looking to find looks like this:

<a href="pic:/82eu92e/iwjd/" data-superid="picture-link">

Previously I found all the hrefs on the page and picked the correct one by checking which contained the text `pic:`, but I can no longer do this because some pages have scrolling galleries that cause stale elements.

August Kimo

3 Answers


You could try BeautifulSoup on top of Selenium's page source, like:

from bs4 import BeautifulSoup

text = '''<a href="pic:/82eu92e/iwjd/" data-superid="picture-link">'''
# Under your circumstance, you need to use:
# text = driver.page_source
soup = BeautifulSoup(text, "html.parser")
print(soup.find("a", attrs={"data-superid":"picture-link"}))

Result:

<a data-superid="picture-link" href="pic:/82eu92e/iwjd/"></a>
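If you also need the `href` value itself, the matched tag behaves like a dict. A small follow-on sketch using the same sample markup (not part of the original answer):

```python
from bs4 import BeautifulSoup

text = '''<a href="pic:/82eu92e/iwjd/" data-superid="picture-link">'''
soup = BeautifulSoup(text, "html.parser")

# find() returns the first matching tag, or None when nothing matches
tag = soup.find("a", attrs={"data-superid": "picture-link"})
if tag is not None:
    print(tag["href"])  # -> pic:/82eu92e/iwjd/
```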
jizhihaoSAMA
  • Thanks I tried but it returned `None`, probably due to me tampering it to make the pic value a wildcard. Got it working through other method. – August Kimo Sep 15 '20 at 13:19

You can filter by attribute:

driver.find_element_by_xpath('//a[@data-superid="picture-link"]')

Regarding the scrolling part, here is a previously asked question that can help you.
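If the gallery keeps re-rendering, one common pattern is simply to retry the lookup when it goes stale. A minimal sketch: the `retry_stale` helper is hypothetical (not from any answer here), and the commented Selenium usage assumes `StaleElementReferenceException` from `selenium.common.exceptions`:

```python
import time

def retry_stale(action, exception, attempts=3, delay=0.5):
    """Call `action()`, retrying when it raises `exception`
    (e.g. Selenium's StaleElementReferenceException)."""
    for attempt in range(attempts):
        try:
            return action()
        except exception:
            if attempt == attempts - 1:
                raise  # still failing after the last retry
            time.sleep(delay)

# With Selenium this could look like (hypothetical usage):
# from selenium.common.exceptions import StaleElementReferenceException
# href = retry_stale(
#     lambda: driver.find_element_by_xpath(
#         '//a[@data-superid="picture-link"]').get_attribute("href"),
#     StaleElementReferenceException)
```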

charbel
  • By the way I can scroll pages fine! I meant there was a gallery scrolling left/right automatically that would cause stale elements. – August Kimo Sep 15 '20 at 13:31

To extract the href value using data-superid="picture-link", use the following CSS selector or XPath.

links=driver.find_elements_by_css_selector("a[data-superid='picture-link'][href]")
for link in links:
    print(link.get_attribute("href"))

OR

links=driver.find_elements_by_xpath("//a[@data-superid='picture-link'][@href]")
for link in links:
    print(link.get_attribute("href"))
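The same attribute filter can be sanity-checked offline with BeautifulSoup's CSS selector support. A sketch, assuming `bs4` is installed (this is an addition, not part of the original answer):

```python
from bs4 import BeautifulSoup

html = '''<a href="pic:/82eu92e/iwjd/" data-superid="picture-link"></a>
<a href="/other" data-superid="other-link"></a>'''
soup = BeautifulSoup(html, "html.parser")

# Same selector as the Selenium call: anchors carrying both attributes
for link in soup.select("a[data-superid='picture-link'][href]"):
    print(link["href"])  # -> pic:/82eu92e/iwjd/
```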
KunduK