
I am using selenium + chrome web driver to load a dynamic page:

self.driver.get(url)
time.sleep(3)  # not sure if I need to add this wait, so the .js loads the page?

Once the page is loaded, I want to get a list of all cards available on the page and then iterate through each card and get its title:

cards = self.driver.find_elements_by_css_selector('div.my-card')
for card in cards:
    title = card.find_element_by_css_selector('h2.title::text').get() # <-- does not work
    desc = card.find_element_by_css_selector('div.desc::text').get() # <-- does not work
    # more fields that I need within this card

find_elements_by_css_selector seems to be a driver method... I am not sure how to apply these selectors to card (the type of card is WebElement).


Sample page:

<div class='my-card'>
    <h2 class='title'>title 1</h2>
    <div class='desc'>desc 1</div>
</div>
<div class='my-card'>
    <h2 class='title'>title 2</h2>
    <div class='desc'>desc 2</div>
</div>
<div class='my-card'>
    <h2 class='title'>title 3</h2>
    <div class='desc'>desc 3</div>
</div>
Hooman Bahreini
  • Can you provide the output of cards or the html of a card please. That would help. – HedgeHog Nov 18 '20 at 06:14
  • You need a webdriver wait after driver.get to allow for page load. – Arundeep Chohan Nov 18 '20 at 06:26
  • @arundeepchohan: it's not just the title... there are multiple fields within the card that I need... that's why I am iterating the card... is there no way to use a css selector on the card element? – Hooman Bahreini Nov 18 '20 at 06:40
  • There are ways to go down into a card element, but we need the HTML element layout in order to do so. – Arundeep Chohan Nov 18 '20 at 06:41
  • @arundeepchohan: I have added a sample html... as I mentioned in the question, there are more fields... I am just showing title and description – Hooman Bahreini Nov 18 '20 at 06:45
  • That was a typo... I removed it – Hooman Bahreini Nov 18 '20 at 06:50
  • Currently what errors do you come up with once you add the wait and print(title.text)? – Arundeep Chohan Nov 18 '20 at 06:51
  • What you have there already actually works for me on the example you gave... `card.find_element_by_css_selector('h2.title')` returns the h2 element... – sytech Nov 18 '20 at 06:57
  • @sytech: thanks a lot... I updated the question... I have `title = card.find_element_by_css_selector('h2.title::text').get()` which I believe is the incorrect syntax (I mistakenly used the Scrapy syntax)... I want to get the text inside `h2` – Hooman Bahreini Nov 18 '20 at 07:04
  • I don't think the pseudo-element selector `::text` works here. Omit the `::text` part of the selector to get the element. Then you can use the `.text` attribute on the WebElement object to retrieve the text, e.g. `title = card.find_element_by_css_selector('h2.title').text` – sytech Nov 18 '20 at 07:06

2 Answers


You can combine the two lookups into a single CSS selector by using the child or descendant combinator.

If the h2 element is a direct child of the card element:

self.driver.find_elements_by_css_selector('div.my-card > h2.title')

If the h2 element is a descendant of the card element at any depth:

self.driver.find_elements_by_css_selector('div.my-card h2.title')

Find out more about CSS combinators here.
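
As a rough sketch of the combined-selector approach, assuming the sample HTML from the question: the helper name `card_titles` and its `direct_child` flag are hypothetical, and the locator string `"css selector"` is the value behind Selenium's `By.CSS_SELECTOR`, used here only so the snippet needs no imports. Note that the result is a flat list of titles, so the fields are no longer grouped per card.

```python
CSS = "css selector"  # the string value behind selenium's By.CSS_SELECTOR


def card_titles(driver, direct_child=True):
    """Return the text of every card title using one combined selector.

    `driver` is assumed to be an already-loaded Selenium WebDriver.
    """
    # '>' matches only direct children; a space matches descendants at any depth.
    selector = "div.my-card > h2.title" if direct_child else "div.my-card h2.title"
    return [elem.text for elem in driver.find_elements(CSS, selector)]
```

Calling `card_titles(self.driver)` after the page has loaded should give one string per card title; if you also need the description next to each title, iterating over the cards themselves keeps the fields together.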

idontknow
  • This returns a list of webelements with a div class of my-card and a descendant of h2 class of title. – Arundeep Chohan Nov 18 '20 at 06:34
  • @HoomanBahreini perhaps something is missing from your example? The answer given here returns a list of all three `h2` elements containing the title. Or is there something else missing from this answer? – sytech Nov 18 '20 at 06:54
  • @HoomanBahreini Yeah, it returns a `list[WebElement]`, where each of the elements in the list is the title of a card. – idontknow Nov 18 '20 at 15:54

What you have works as-is. Using your given example HTML:

In [19]: for card in cards:
    ...:     title_elem = card.find_element_by_css_selector('h2.title')
    ...:     print(title_elem.text)
    ...:
title 1
title 2
title 3

In [20]: card
Out[20]: <selenium.webdriver.remote.webelement.WebElement (session="534c4be3a233a0aa963f541550ac7861", element="b56dcaf3-9b38-4e4c-aae8-99005ac9840b")>

So your expectation that Selenium selectors can be nested is correct. Some other assumption must be throwing you off.
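
For several fields per card, the same nested lookup might be wrapped up as below. This is a minimal sketch, not a definitive implementation: the function name `scrape_cards` and the dictionary keys are hypothetical, the selectors come from the sample HTML in the question, and `"css selector"` is the string behind Selenium's `By.CSS_SELECTOR`, so no import is needed. The `.text` attribute takes the place of the Scrapy-style `::text` suffix.

```python
CSS = "css selector"  # the string value behind selenium's By.CSS_SELECTOR


def scrape_cards(driver):
    """Collect the title and description of every card on the loaded page.

    `driver` is assumed to be a Selenium WebDriver that has already
    navigated to the page and finished rendering it.
    """
    results = []
    for card in driver.find_elements(CSS, "div.my-card"):
        # Searching from the card (a WebElement) limits the match to its subtree.
        title = card.find_element(CSS, "h2.title").text
        desc = card.find_element(CSS, "div.desc").text
        results.append({"title": title, "desc": desc})
    return results
```

The two-argument `find_element` / `find_elements` form is used because it is available on both the driver and on each WebElement; the `find_element_by_css_selector` shorthand from the question behaves the same way in the Selenium releases that still provide it.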

sytech
  • Thanks a lot... I updated the question... I guess the problem was using the Scrapy syntax for getting the text inside the h2 element... do you know how I can get the inner HTML of `h2`? – Hooman Bahreini Nov 18 '20 at 07:06
  • @HoomanBahreini in the example above you could do `title_elem.get_attribute('innerHTML')` -- [see also](https://stackoverflow.com/questions/40416048/difference-between-text-and-innerhtml-using-selenium) – sytech Nov 18 '20 at 07:09