Sorry for being a bit vague in the title, I don't know the right terminology.

I would like to select the text that follows the highlighted '<source srcset' in the screenshot below:

[Screenshot: inspect source]

It doesn't work if I try to select the class css-1nfcn93, and I don't know how to select on anything deeper into the tree. I assume I want xpath, but I don't understand how that works yet.

Some code that does not work ('WebElement has no len()'):

d = webdriver.Chrome()
d.get('https://www.theguardian.com/football/2020/dec/19/burnley-wolves-match-preview-premier-league')
text = d.find_element_by_css_selector('div.css-1nfcn93')
print(len(text))
d.quit()
undetected Selenium
Andrew
    The term is `attribute`, see [this](https://stackoverflow.com/questions/30324760/how-to-get-attribute-of-element-from-selenium) post or [this blog](https://www.browserstack.com/guide/getattribute-method-in-selenium). – Thymen Dec 21 '20 at 20:18

2 Answers

To get all source elements use:

source_elements = driver.find_elements_by_css_selector('.css-1nfcn93 source')

This gives you a list of source elements. From there you can read the attribute you are looking for:

for source in source_elements:
    # get_attribute returns the attribute value as a string
    print(source.get_attribute('srcset'))
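Each srcset value is a comma-separated list of image candidates, each one a URL followed by an optional width or density descriptor. If you want the individual URLs rather than the raw string, a rough sketch like the one below may help. The helper name `parse_srcset` and the example.com URLs are made up for illustration, and the naive split assumes the URLs themselves contain no commas.

```python
def parse_srcset(srcset):
    """Split a srcset string into (url, descriptor) pairs.

    Naive sketch: assumes no URL contains a comma.
    """
    candidates = []
    for part in srcset.split(","):
        fields = part.split()
        if not fields:
            # Skip empty pieces, e.g. from an empty string or a trailing comma
            continue
        url = fields[0]
        descriptor = fields[1] if len(fields) > 1 else ""
        candidates.append((url, descriptor))
    return candidates


# Made-up srcset value, standing in for source.get_attribute('srcset')
example = (
    "https://example.com/a.jpg?width=620 620w, "
    "https://example.com/a.jpg?width=1240 1240w"
)
for url, descriptor in parse_srcset(example):
    print(descriptor, url)
```

With a live driver, each `source.get_attribute('srcset')` value could be passed to `parse_srcset` in the same way. Note that recent Selenium 4 releases no longer ship the `find_elements_by_css_selector` helper; the equivalent call there is `driver.find_elements(By.CSS_SELECTOR, '.css-1nfcn93 source')`.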
Nic Laforge

The text Burnley v Wolves is the value of the alt attribute of the img element. To extract it you have to:

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for the I'm happy element to be clickable and click on it.

  • Switch to the Default Content

  • Induce WebDriverWait for the desired element to be visible.

  • You can use either of the following Locator Strategies:

    • Using CSS_SELECTOR:

      driver.get("https://www.theguardian.com/football/2020/dec/19/burnley-wolves-match-preview-premier-league")
      WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe[id^='sp_message_iframe']")))
      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[title$='m happy']"))).click()
      driver.switch_to.default_content()
      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "picture[itemprop='contentUrl'] img"))).get_attribute("alt"))
      
    • Using XPATH:

      driver.get("https://www.theguardian.com/football/2020/dec/19/burnley-wolves-match-preview-premier-league")
      WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[starts-with(@id, 'sp_message_iframe')]")))
      WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[contains(@title, 'm happy')]"))).click()
      driver.switch_to.default_content()
      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//picture[@itemprop='contentUrl']//img"))).get_attribute("alt"))
      
  • Console Output:

    Burnley v Wolves
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
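For readers new to explicit waits, the idea behind `WebDriverWait(...).until(...)` can be sketched without a browser: call a condition repeatedly until it returns something truthy or the timeout expires. The function below is a simplified illustration, not Selenium's actual implementation. The name `wait_until` and the fake condition are hypothetical; the real class also passes the driver to the condition, ignores `NoSuchElementException` while polling, and raises its own `TimeoutException`.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.2f seconds" % timeout)
        time.sleep(poll_interval)


# Hypothetical condition that only succeeds on the third check,
# standing in for an element that appears after the page finishes loading.
attempts = {"count": 0}


def element_appears():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


print(wait_until(element_appears, timeout=5.0, poll_interval=0.1))
```

This is why the waits above are useful on this page: the consent iframe and the image are not necessarily present the moment `driver.get` returns, so a single immediate lookup can fail where a polled lookup succeeds.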
    

undetected Selenium