How do I obtain the text inside a label which is inside a span using Selenium with Python?

Question

A block of HTML on a website that I'd like to scrape using Selenium (with Python) looks like the following:

<div class="exp_date">
  <span class="uppr_sec">
    <i class="exp_clndr"></i>
    <label> 04 Jan 2021 09:30 AM - 04 Jan 2021 10:30 AM </label>
  </span>
  
  <br>
  
  <div class="clear"></div>
  
  <span class="lwr_sec">
    <i class></i>
    <label>Hosted By Some Random Person</label>
  </span>

</div>

I'd like to print the text enclosed in the <label> tags in both spans, i.e. "04 Jan 2021 09:30 AM - 04 Jan 2021 10:30 AM" and "Hosted By Some Random Person", in the Python console using Selenium. However, I am not sure how to do so, because the labels are nested in their respective spans, which are in turn nested in a div.

Can someone please help me out with the Python code needed to do so?

Have you tried anything? Any issues faced? Also, is the use of Selenium compulsory because the page is dynamic? - Sin Han Jinn, Jan 04 '21 at 08:11

score 1 · Accepted Answer · answered Jan 04 '21 at 08:32

To extract and print the text, e.g. 04 Jan 2021 09:30 AM - 04 Jan 2021 10:30 AM, using Selenium and Python, you can use either of the following locator strategies (the selectors below target the label inside span.uppr_sec):

Using css_selector and get_attribute("innerHTML"):

print([my_elem.get_attribute("innerHTML") for my_elem in driver.find_elements(By.CSS_SELECTOR, "div.exp_date > span.uppr_sec label")])

Using xpath and text attribute:

print([my_elem.text for my_elem in driver.find_elements(By.XPATH, "//div[@class='exp_date']/span[@class='uppr_sec']//label")])

Ideally, you should use WebDriverWait with visibility_of_all_elements_located() so that the elements are visible before their text is read. You can use either of the following locator strategies:

Using CSS_SELECTOR and get_attribute("innerHTML"):

print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.exp_date > span.uppr_sec label")))])

Using XPATH and text attribute:

print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='exp_date']/span[@class='uppr_sec']//label")))])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
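
The question asks for both labels, while the selectors above target only the one inside span.uppr_sec. A broader selector is one way to collect both. The following is a minimal sketch, not a definitive implementation: `label_texts` and `BOTH_LABELS_CSS` are hypothetical names introduced here, and `driver` is assumed to be a live WebDriver already on the page.

```python
def label_texts(driver, by, locator):
    """Return the stripped text of every element matching the locator.

    Hypothetical helper, not part of Selenium. `driver` is assumed to be
    any object exposing Selenium's find_elements(by, value) method.
    """
    return [elem.text.strip() for elem in driver.find_elements(by, locator)]


# Assumed selector: matches the <label> that is a direct child of either
# span (uppr_sec or lwr_sec) directly inside div.exp_date.
BOTH_LABELS_CSS = "div.exp_date > span > label"
```

With the imports listed above, calling `label_texts(driver, By.CSS_SELECTOR, BOTH_LABELS_CSS)` should return the date range and the "Hosted By" text for the markup in the question.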

Link to useful documentation:

get_attribute() method: gets the given attribute or property of the element.
text attribute: returns the text of the element.
Difference between text and innerHTML using Selenium
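
On that difference: get_attribute("innerHTML") returns the raw markup between the tags, so for the first label in the question the string keeps its leading and trailing spaces, whereas text returns the rendered, visible text. If the innerHTML route is used, the result can be tidied afterwards. This is a minimal sketch; `normalize_inner_html` is a hypothetical helper, and it assumes the element contains only text and no child tags.

```python
import html
import re


def normalize_inner_html(raw):
    """Turn a text-only innerHTML string into clean display text.

    Hypothetical helper, not part of Selenium. Assumes the element holds
    no child tags, as with the <label> elements in the question.
    """
    # Decode HTML entities such as &amp; into real characters.
    decoded = html.unescape(raw)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", decoded).strip()
```

For example, each string returned by the get_attribute("innerHTML") variants above could be passed through `normalize_inner_html` before printing.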

This is just everything I needed, thanks a ton! I've got another query though: I also have to identify a link starting with "abc.com" in the source HTML of a given webpage. I'm planning to use `wd.page_source` (wd is the webdriver object) to obtain the HTML source of the page, and then use a Python regex to search for a string starting with "abc.com". Is there a Selenium-specific way to do this without regex (something like a search mechanism)? Thanks in advance! - Pranav N, Jan 04 '21 at 10:12
@PranavN Sounds like a completely different issue altogether. Can you raise a new question for your new requirement? Stack Overflow contributors will be happy to help you out. - undetected Selenium, Jan 04 '21 at 10:14
