I'm using Selenium and Python to test a website, and I'm trying to get the links to the files on the site as follows. First, divs = find_elements_by_css_selector("div.answer") gets the posts on the page; this works fine. Then I call divs[i].find_element_by_xpath("//figure/a[1]").get_attribute("href") on each of the resulting elements. The site has this structure:

<html>
<div class="answer">
<blockquote class="message">
<figure class="thumb">
<a href="cdn.xyz.net/img1.jpg">
<img class="file-data" src="cdn.xyz.net/img1.jpg">
</a>
</figure>
</blockquote>
</div>
<!-- ... more identical divs with different thumbnails ... -->
</html>

The problem is that divs[i].find_element_by_xpath("//figure/a[1]").get_attribute("href") returns the first URL on the page (here cdn.xyz.net/img1.jpg) in every iteration of the loop, whereas I want the link inside each individual div. This code reproduces the problem:

try:
    elements = driver.find_elements_by_css_selector('div.answer')
    for el in elements: #For every reply
        embedLink = el.find_element_by_xpath("//figure[1]/a[1]")
        print("Found embed link: " + embedLink.get_attribute("href")) #this returns the first link every time
except:
    print("error")

What am I doing wrong here?

The amateur programmer

2 Answers

An XPath that starts with // searches from the document root, even when you call it on an element. To start from the current context (the element itself), begin the expression with .//

el.find_element_by_xpath('.//figure[1]/a[1]')

You can also skip the per-div loop and locate all the links with one complete XPath evaluated from the driver:

elements = driver.find_elements_by_xpath('//div[@class="answer"]//figure[1]/a[1]')
for el in elements: #For every reply
    print('Found embed link: ' + el.get_attribute('href'))
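
The scoping difference can be reproduced offline without a browser. The following is only a minimal sketch using Python's standard xml.etree.ElementTree in place of Selenium, so the API differs from the question's: ElementTree does not accept an absolute `//` path on an element, so the document-wide search is modelled by calling find on the root. The second div and img2.jpg are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the page in the question.
# The second div (img2.jpg) is a hypothetical addition for this example.
PAGE = """
<html>
  <div class="answer"><blockquote class="message"><figure class="thumb">
    <a href="cdn.xyz.net/img1.jpg"><img class="file-data" src="cdn.xyz.net/img1.jpg"/></a>
  </figure></blockquote></div>
  <div class="answer"><blockquote class="message"><figure class="thumb">
    <a href="cdn.xyz.net/img2.jpg"><img class="file-data" src="cdn.xyz.net/img2.jpg"/></a>
  </figure></blockquote></div>
</html>
""".strip()

root = ET.fromstring(PAGE)
answers = root.findall(".//div[@class='answer']")

# Document-wide search repeated once per div: always hits the first link.
global_hrefs = [root.find(".//figure/a").get("href") for _ in answers]

# Search relative to each div: one link per div.
relative_hrefs = [div.find(".//figure/a").get("href") for div in answers]

# Single path from the root that walks through every div.answer.
all_hrefs = [a.get("href") for a in root.findall(".//div[@class='answer']//figure/a")]

print(global_hrefs)
print(relative_hrefs)
print(all_hrefs)
```

The first list repeats cdn.xyz.net/img1.jpg for both divs, which is the behaviour described in the question. The second and third lists each contain img1.jpg followed by img2.jpg.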
Guy
    Thanks. Just as I thought, this was a very small error in my code. I could not find this information online, since everyone seems to use the document-wide search without the leading `.` when using `xpath`. – The amateur programmer Dec 25 '19 at 09:51

The line

embedLink = el.find_element_by_xpath("//figure[1]/a[1]")

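searches the whole document rather than the subtree of el. //figure[1] matches every figure element that is the first figure child of its parent, anywhere on the page, and /a[1] takes the first a child of each. find_element then returns the first match in document order, which is always the link in the first div.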

Solution:
Add a dot in front of the // to start searching for elements at the current node.

embedLink = el.find_element_by_xpath(".//figure[1]/a[1]")

See also this SO answer: "What is the difference between .// and //* in XPath?".

zx485