3

I am new to web scraping and I am trying to open a link with Selenium.

In Google Chrome, I used Inspect on the button I want to open and got the following markup:

<a href="/c#candidates?id=a6b0e325a499&amp;candidateFilter=4af15d8991a8" data-tn-link="true" data-tn-element="view-unread-candidates"><span class="jobs-u-font--bold">(4 awaiting review)</span></a>

I want to find all the links with this structure and open each one so I can access its data.

(I have several buttons with the same structure but different href values that I need to visit.)

Also under Properties I can see a for the same button.

However, I want to be more precise than just matching every <a> tag, as the call below does, since I only want the particular links mentioned above:

elements = driver.find_elements_by_tag_name("a")

Can anyone advise?

Solal
  • Can you show an additional link or two so we can see what remains fixed and what changes between candidates? For example, is it only unread you care about? It sounds like you really want to be able to match all qualifying a tags and then click - will you be passing a list of ids for example? – QHarr May 11 '19 at 03:49
  • Thanks I added image so you can see the structure. Actually candidateFilter is not the issue here. – Solal May 12 '19 at 21:05
  • This solved now? – QHarr May 12 '19 at 21:06
  • not really didn't make it work I think I get the elements with xpath: ```xpath = "/html/body/div[@id='wrapper']/div[@id='page_frame']/div[@id='page_content']/div[@class='page-wrapper']/div[@id='mc']/div[@id='plugin_container_MainContent']/div[@class='plugin-hadesinternal']/div/div/div[@class='jobs-JobsTab-main']/table[@class='jT cSST']/tbody/tr[@class='job-row'][3]/td[@class='candidates']/div/a[1]" selected_option = driver.find_element_by_xpath(xpath)``` But can't open in a new tab or open and return to my previous page or open with BeautifulSoup... – Solal May 12 '19 at 21:10
  • can you share the url? – QHarr May 12 '19 at 21:11
  • It's url after login... It's my personal account. How can I share it with you differently ? – Solal May 12 '19 at 21:12
  • no, you shouldn't share your personal details. – QHarr May 12 '19 at 21:12
  • Ok @supputuri solution worked with xpath. I can open the link now the struggle is opening in a new tab but this is another topic. Thanks for your – Solal May 12 '19 at 21:16
  • KunduK has done what I would have done except it looks like relative links that need a domain concatenated as prefix. Then you can loop that list and driver.get each url. – QHarr May 12 '19 at 21:17
  • you can open in a new tab easily with execute_script and pass javascript command for new tab – QHarr May 12 '19 at 21:17
  • Like this: https://stackoverflow.com/a/42417215/6241235 where you concatenate the new url from your list as you loop – QHarr May 12 '19 at 21:18
  • @QHarr when I do: ```selected_option = driver.find_element_by_xpath(xpath) selected_option.click()``` that works fine however ```selected_option = driver.find_element_by_xpath(xpath) selected_option.send_keys(Keys.COMMAND + 't')``` nothing will happen. I don't see how to handle it with `driver.execute_script("window.open('');")` – Solal May 12 '19 at 21:24
  • driver.execute_script("window.open('" + url + "');") – QHarr May 12 '19 at 21:25
  • Thanks @QHarr. Working well. I just had to add an intermediary step in between. I will show full code so it's clearer ```selected_option = driver.find_element_by_xpath(xpath) url = selected_option.get_attribute("href") driver.execute_script("window.open('" + url + "');") ``` – Solal May 12 '19 at 21:32
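
The new-tab approach worked out in the comments above can be sketched roughly as follows. This is a minimal sketch, not the exact code anyone in the thread ran: BASE_URL is a placeholder domain, and absolute_url and open_in_new_tab are hypothetical helper names. Instead of concatenating the URL into the JavaScript string, it hands the URL to the script through arguments[0], which Selenium's execute_script supports and which avoids quoting problems if a URL ever contains a quote character.

```python
from urllib.parse import urljoin

# Placeholder domain (an assumption); replace it with the site you are logged in to.
BASE_URL = "https://example.com/"


def absolute_url(href, base=BASE_URL):
    """Resolve a relative href such as "/c#candidates?id=..." against the base URL."""
    return urljoin(base, href)


def open_in_new_tab(driver, url):
    """Open url in a new browser tab through JavaScript.

    driver is expected to be a Selenium WebDriver; only execute_script is used.
    """
    driver.execute_script("window.open(arguments[0]);", url)


# Usage with a live session (not executed here):
#   link = driver.find_element_by_xpath(xpath)
#   open_in_new_tab(driver, absolute_url(link.get_attribute("href")))
```

If the href is already absolute, urljoin returns it unchanged, so the extra resolution step is harmless in that case.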

4 Answers

2

You can use //a[@data-tn-element = 'view-unread-candidates'], which matches every unread-candidates link.

If you want a specific candidate, use the following XPath and set candidateId to the desired id.

candidateId = 'a6b0e325a499'
xpath = "//a[@data-tn-element = 'view-unread-candidates'][contains(@href,'id=" + candidateId + "')]"
element = driver.find_element_by_xpath(xpath)
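
The XPath above can be wrapped in a small helper so the same code serves both cases, all unread links or a single candidate. This is only a sketch; unread_candidates_xpath is a hypothetical name, and it assumes the id contains no single quote, which holds for hex-style ids like the one shown.

```python
def unread_candidates_xpath(candidate_id=None):
    """Build the XPath for unread-candidates links, optionally for one candidate id."""
    xpath = "//a[@data-tn-element='view-unread-candidates']"
    if candidate_id is not None:
        # Assumption: candidate_id has no single quote, so plain interpolation is safe.
        xpath += f"[contains(@href, 'id={candidate_id}')]"
    return xpath


# Usage with a live session (not executed here):
#   links = driver.find_elements_by_xpath(unread_candidates_xpath())
#   one = driver.find_element_by_xpath(unread_candidates_xpath("a6b0e325a499"))
print(unread_candidates_xpath("a6b0e325a499"))
```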
supputuri
  • There seems to be something off in your quotes, as `candidateId` is a different color. I think the python interpreter might get confused by it, but I'm not sure I haven't tested it – Reedinationer May 10 '19 at 22:52
  • its the variable holding the candidate id. @Reedinationer updated the answer for clear understanding. – supputuri May 10 '19 at 23:07
  • Oh durrr it's just a concatenation. I usually use `"stuff {} more stuff".format(variable)` so it threw me off! Yeah, that would work I think. Not sure if OP has the `candidateId` beforehand though. – Reedinationer May 10 '19 at 23:10
1

I would use

elem = driver.find_element_by_class_name("jobs-u-font--bold")

to get the <span>, since that looks like a unique class name (although I can't be sure from your post). Then you can reach the <a> level with

a_elem = elem.find_element_by_xpath("..")

Then you can call a_elem.click() or do whatever else you need with it.

Reedinationer
  • Hey, thanks for the reply. `elem = driver.find_elements_by_class_name("jobs-u-font--bold")` correctly returns a list of elements. I iterate in a loop and tried ```for e in elements: a_elem = e.parent a_elem.click() ``` it returns: `AttributeError: 'WebDriver' object has no attribute 'click'` – Solal May 10 '19 at 22:00
  • @Solal Yeah, I must have gotten confused with beautifulsoup. After referencing [this post](https://stackoverflow.com/questions/18079765/how-to-find-parent-elements-by-python-webdriver) I think you can leverage the same technique – Reedinationer May 10 '19 at 22:06
  • Thanks I am using xpath and an extension Xpath Helper to grep the Xpath :) Working well – Solal May 10 '19 at 22:25
1

To access the anchor tags, you can use a CSS selector on the attribute data-tn-element="view-unread-candidates"; I believe it is the same for all of these anchor tags.

elements=driver.find_elements_by_css_selector('a[data-tn-element="view-unread-candidates"]')
for ele in elements:
    print(ele.get_attribute("href"))

Or, if you want to start from the child element and then fetch its parent tag, try the XPath code below.

elements=driver.find_elements_by_xpath("//span[@class='jobs-u-font--bold']")
for ele in elements:
    print(ele.find_element_by_xpath("./parent::a").get_attribute('href'))
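
Both selection strategies can be sanity-checked without a browser. The sketch below is an offline stand-in, not Selenium: it uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath, on the anchor from the question. The wrapping <div> and the second, unrelated link are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The anchor from the question, wrapped in a <div> so it has a parent node.
# The second <a> is a hypothetical link that should NOT be matched.
SAMPLE = (
    '<div>'
    '<a href="/c#candidates?id=a6b0e325a499&amp;candidateFilter=4af15d8991a8" '
    'data-tn-link="true" data-tn-element="view-unread-candidates">'
    '<span class="jobs-u-font--bold">(4 awaiting review)</span></a>'
    '<a href="/other">unrelated link</a>'
    '</div>'
)

root = ET.fromstring(SAMPLE)

# Same idea as the CSS selector a[data-tn-element="view-unread-candidates"].
by_attribute = root.findall(".//a[@data-tn-element='view-unread-candidates']")

# Same idea as finding the <span> first and stepping up to its parent.
by_parent = root.findall(".//span[@class='jobs-u-font--bold']/..")

hrefs = [a.get("href") for a in by_attribute]
print(hrefs)
```

On this sample both queries select the same single anchor, which suggests the attribute selector alone is precise enough as long as data-tn-element is only used on these links.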
KunduK
0

I would use:

List<WebElement> elements = driver.findElements(By.xpath("//a[@data-tn-element='view-unread-candidates']"));

// Print the href of every matching link.
for (WebElement item : elements) {
    String href = item.getAttribute("href");
    System.out.println("href is " + href);
}

If you want to click the link with a particular href, add an if condition after reading the href in the code above, and click the element when the condition is met.
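
Since the question uses Python, the conditional click described above might look roughly like this. It is a sketch under assumptions: click_link_with_href is a hypothetical helper, and it calls find_elements(by, value), the form current Selenium releases use now that the find_elements_by_* helpers have been removed. The locator strategy is passed as the plain string "xpath", which is the value of By.XPATH, so the snippet has no import to satisfy.

```python
XPATH = "xpath"  # same value as selenium.webdriver.common.by.By.XPATH


def click_link_with_href(driver, fragment):
    """Click the first unread-candidates link whose href contains fragment.

    Returns the matched href, or None when no link matches.
    """
    links = driver.find_elements(XPATH, "//a[@data-tn-element='view-unread-candidates']")
    for link in links:
        href = link.get_attribute("href") or ""
        if fragment in href:
            link.click()
            # Stop here: clicking may navigate away and invalidate the other elements.
            return href
    return None


# Usage with a live session (not executed here):
#   click_link_with_href(driver, "id=a6b0e325a499")
```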