1

I am making a Reddit bot that looks for certain attributes in comments, uses Selenium to visit the information website, and uses driver.find_element_by... to get the value inside a tag, but it is not working.

When I use driver.find_element_by_class_name(), this is the data returned:

<selenium.webdriver.remote.webelement.WebElement (session="f454dcf92728b9db4de080a27a844bf7", element="514bd57d-99d7-4fce-a05d-3fa92f66ad49")>

When I use driver.find_elements_by_css_selector(".style-scope.ytd-video-renderer"), this is returned:

[
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2", element="6b4ee3e2-5e6b-48e2-8ec8-9083bf15baea")>, 
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2", ...
]

Suppose this is the element I'm trying to locate (the code above returned the Selenium output above for this tag):

<yt-formatted-string class="style-scope ytd-video-renderer" aria-label="Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 by Melodic Star 2 months ago 4 minutes, 18 seconds 837,676 views">Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』</yt-formatted-string>

What I want

I want Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 returned.

What could I do?

2 Answers

4

Use .text:

element = driver.find_element_by_xpath('//*[@id="container"]/h1/yt-formatted-string')
print(element.text)
frianH
0

You were pretty close. driver.find_element_by_class_name() returns the first matching WebElement. Printing that object gives:

<selenium.webdriver.remote.webelement.WebElement (session="f454dcf92728b9db4de080a27a844bf7", element="514bd57d-99d7-4fce-a05d-3fa92f66ad49")>

This is the string representation of the WebElement object itself. The desired text is inside that element and has to be read from it.

Similarly, driver.find_elements_by_css_selector(".style-scope.ytd-video-renderer") returns a list of matching WebElements, and printing that list gives:

[
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2", element="6b4ee3e2-5e6b-48e2-8ec8-9083bf15baea")>, 
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2",
  ...
]
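To make the difference concrete without opening a browser, here is a small stand-in. FakeElement is a hypothetical class written only for this illustration (it is not part of Selenium), and its repr only imitates the shape of the output above. The point is that printing the object shows a description of the object, while the .text attribute holds the string you are after.

```python
class FakeElement:
    """Hypothetical stand-in for a WebElement; NOT part of Selenium."""

    def __init__(self, element_id: str, text: str) -> None:
        self.id = element_id
        self.text = text

    def __repr__(self) -> str:
        # Imitates the shape of the output shown above; illustrative only.
        return '<FakeElement (element="{}")>'.format(self.id)


element = FakeElement(
    "514bd57d-99d7-4fce-a05d-3fa92f66ad49",
    "Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』",
)

printed = repr(element)  # what print(element) would show
title = element.text     # the text content itself

print(printed)
print(title)
```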

Solution

To extract the text Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 from the following HTML:

<yt-formatted-string class="style-scope ytd-video-renderer" aria-label="Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 by Melodic Star 2 months ago 4 minutes, 18 seconds 837,676 views">Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』</yt-formatted-string>

You can use either of the following Locator Strategies:

  • Using css_selector and get_attribute():

    print(driver.find_element_by_css_selector("yt-formatted-string.style-scope.ytd-video-renderer").get_attribute("innerHTML"))
    
  • Using xpath and text attribute:

    print(driver.find_element_by_xpath("//yt-formatted-string[@class='style-scope ytd-video-renderer']").text)
    

Ideally, to print the title text you should induce WebDriverWait for visibility_of_element_located(). You can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and get_attribute():

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "yt-formatted-string.style-scope.ytd-video-renderer"))).get_attribute("innerHTML"))
    
  • Using XPATH and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//yt-formatted-string[@class='style-scope ytd-video-renderer']"))).text)
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python


Outro

Link to useful documentation:

undetected Selenium
  • Thanks, I'd just like you to expand on that a bit more. Suppose that I have multiple ` Lorem Ipsum >` and I use `driver.find_elements_by...` instead of `driver.find_element_by...`: will it return the text as one long string, or will there be a new line for each element? (I'm storing this data and having it reply to a Reddit comment via PRAW, so if I use `comment.reply("tags: {}".format(tags))`, will it just put all the tags together in one string with no spaces, or will it give a space between each tag?) – KazutoKiritoKirigaya Sep 26 '20 at 17:55
  • @KazutoKiritoKirigaya `driver.find_elements` will always return a list. You have to iterate the list to extract the text. – undetected Selenium Sep 26 '20 at 17:57
  • That's the thing, it says that `driver.find_elements_by...` is not iterable. – KazutoKiritoKirigaya Sep 26 '20 at 17:58
  • @KazutoKiritoKirigaya True, but we can offer you an optimal solution too :) Feel free to raise a new question as per your new requirement. StackOverflow contributors will be happy to help you out. – undetected Selenium Sep 26 '20 at 18:00
  • Is there no obvious alternative solution to this problem so that I can extract the text from the `driver.find_elements_by...`? – KazutoKiritoKirigaya Sep 26 '20 at 18:12
  • @KazutoKiritoKirigaya There are solutions for that as well. But as the context is different, I'm suggesting you raise a new question so that both this question and the new one are helpful for future readers. – undetected Selenium Sep 26 '20 at 18:15
  • Done, hope you answer there! Found your answer to this quite helpful. – KazutoKiritoKirigaya Sep 26 '20 at 18:50
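
For readers who land here with the same follow-up, a minimal sketch of the "iterate the list" suggestion from the comments. It assumes only that each item returned by a find_elements call has a .text attribute. FakeElement, collect_texts and format_reply are hypothetical names made up for this example, and the stand-in class is there so the snippet runs without a browser. Note that passing the list itself to "tags: {}".format(...) would insert Python's list representation of the WebElement objects, not their text, so the texts are pulled out and joined with an explicit separator.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakeElement:
    """Hypothetical stand-in for a WebElement; only .text is modelled."""
    text: str


def collect_texts(elements: Iterable) -> List[str]:
    """Return the stripped, non-empty .text of each element, in order."""
    return [el.text.strip() for el in elements if el.text and el.text.strip()]


def format_reply(texts: List[str], separator: str = ", ") -> str:
    """Join the texts into one reply string with an explicit separator."""
    return "tags: {}".format(separator.join(texts))


# With a real driver this list would come from a find_elements call, e.g.
# driver.find_elements_by_css_selector(".style-scope.ytd-video-renderer")
elements = [
    FakeElement("first title"),
    FakeElement("   "),  # elements with no visible text are skipped
    FakeElement("second title"),
]

texts = collect_texts(elements)
reply = format_reply(texts)
print(reply)
```

The separator is a parameter so the reply can use ", ", a space, or "\n" as needed. In Selenium 4.3 and later the find_elements_by_* helpers were removed; the equivalent call there is driver.find_elements(By.CSS_SELECTOR, "..."), and the returned list can be passed to the same helper.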