I want to get the content behind every link with id = "LinkNoticia". Currently my code opens the first link and extracts its content, but I can't get to the others.

How can I do it?

This is my code (it works for one link):

from selenium import webdriver

driver= webdriver.Chrome("/selenium/webdriver/chromedriver")
driver.get('http://www.emol.com/noticias/economia/todas.aspx')

driver.find_element_by_id("LinkNoticia").click()

title = driver.find_element_by_id("cuDetalle_cuTitular_tituloNoticia")
print(title.text)
timbre timbre
Raul Escalona

1 Answer

First of all, the fact that the page has multiple elements with the same ID is a bug in itself. The whole point of an ID is to be unique for each element on the page. According to the HTML specs:

id = name This attribute assigns a name to an element. This name must be unique in a document.

A lengthy discussion is here.

Since an ID is supposed to be unique, most (all?) implementations of Selenium only provide a function that looks for a single element with a given ID (e.g. find_element_by_id). I have never seen a function for finding multiple elements by ID. So you cannot use the ID as your locator directly. Instead, use one of the existing functions that can locate multiple elements, and treat the ID as just another attribute that selects a group of elements. Your choices are:

find_elements_by_xpath
find_elements_by_css_selector
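
As a rough sketch of the CSS selector option: the helper below is hypothetical (its name is not part of Selenium), and it uses the generic `find_elements(by, value)` call rather than the `find_elements_by_*` shortcuts, on the assumption that you may be on a newer Selenium release where those shortcuts are no longer available. The string `"css selector"` is the value of Selenium's `By.CSS_SELECTOR` constant.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def find_links_sharing_id(driver, element_id):
    """Return every <a> element whose id attribute equals element_id.

    `driver` is assumed to be a Selenium WebDriver that already has
    the listing page loaded.
    """
    # An attribute selector matches all elements carrying this id value,
    # even though the page is reusing an id that should be unique.
    selector = f"a[id='{element_id}']"
    return driver.find_elements(CSS_SELECTOR, selector)
```

With a real driver this would be called as `find_links_sharing_id(driver, "LinkNoticia")` after `driver.get(...)`.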

For example, you could change your search like this:

links = driver.find_elements_by_xpath("//a[@id='LinkNoticia']")

That gives you the whole set of links, and you need to loop through them to retrieve each actual link (href). Note that if you just click on each link, you navigate away from the page, and the element references stored in links are no longer valid. So instead you can do this:

  1. Build list of hrefs from the links:

    hrefs=[]
    for link in links:
        hrefs.append(link.get_attribute("href"))
    
  2. Navigate to each href to check its title:

    for href in hrefs:
        driver.get(href)
        title = driver.find_element_by_id("cuDetalle_cuTitular_tituloNoticia")
        # etc
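
Putting the two steps together might look something like the sketch below. The function names `collect_hrefs` and `collect_titles` are made up for illustration, and the code uses the generic `find_elements(by, value)` / `find_element(by, value)` calls with the plain strings `"xpath"` and `"id"` (the values of Selenium's `By.XPATH` and `By.ID`), assuming your Selenium version may not have the `find_element_by_*` shortcuts. Skipping links without an href is also an assumption about what you want.

```python
XPATH = "xpath"  # same string as selenium's By.XPATH
ID = "id"        # same string as selenium's By.ID


def collect_hrefs(driver, list_url, link_id="LinkNoticia"):
    """Open the listing page and return the href of every matching link."""
    driver.get(list_url)
    links = driver.find_elements(XPATH, f"//a[@id='{link_id}']")
    # Read the hrefs now, while the elements still belong to the loaded page;
    # after navigating away these element references would be invalid.
    hrefs = [link.get_attribute("href") for link in links]
    # Drop links that have no href at all.
    return [href for href in hrefs if href]


def collect_titles(driver, hrefs, title_id="cuDetalle_cuTitular_tituloNoticia"):
    """Visit each href in turn and return the text of its title element."""
    titles = []
    for href in hrefs:
        driver.get(href)
        title = driver.find_element(ID, title_id)
        titles.append(title.text)
    return titles
```

With the driver from the question, usage would be `hrefs = collect_hrefs(driver, 'http://www.emol.com/noticias/economia/todas.aspx')` followed by `print(collect_titles(driver, hrefs))`.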
    
timbre timbre