
I'm scraping TripAdvisor's review pages for individual hotels and trying to get each hotel's star rating.

TripAdvisor hotel review page

The HTML for the div that contains this information is below (I'm not including the HTML for the whole page because it's very long and I don't think it's necessary):

<div class="drcGn _R MC S4 _a H">
  <span class="fyiTP S2">
    <svg class="TkRkB d H0" viewBox="0 0 120 24" width="80" height="16" aria-label="4.0 of 5 bubbles">
            <path d="M23.37 8.518l-6.587-1.68a1.716 1.716 0 01-.479-.36L12.712.357C12.478.121 12.235.005 12 0c-.235.005-.478.121-.712.356L7.696 6.478c-.121.12-.24.238-.479.36L.63 8.518c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96zm24 0l-6.587-1.68a1.716 1.716 0 01-.479-.36L36.712.357C36.478.121 36.235.005 36 0c-.235.005-.478.121-.712.356l-3.592 6.122c-.121.12-.24.238-.479.36l-6.587 1.68c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96zm24 0l-6.587-1.68a1.716 1.716 0 01-.479-.36L60.712.357C60.478.121 60.235.005 60 0c-.235.005-.478.121-.712.356l-3.592 6.122c-.121.12-.24.238-.479.36l-6.587 1.68c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96zm24 0l-6.587-1.68a1.716 1.716 0 01-.479-.36L84.712.357C84.478.121 84.235.005 84 0c-.235.005-.478.121-.712.356l-3.592 6.122c-.121.12-.24.238-.479.36l-6.587 1.68c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 
.719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96zm24 0l-6.587-1.68a1.716 1.716 0 01-.479-.36L108.712.357C108.478.121 108.235.005 108 0c-.235.005-.478.121-.712.356l-3.592 6.122c-.121.12-.24.238-.479.36l-6.587 1.68c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96z"></path><path class="cenTm" d="M23.37 8.518l-6.587-1.68a1.716 1.716 0 01-.479-.36L12.712.357C12.478.121 12.235.005 12 0c-.235.005-.478.121-.712.356L7.696 6.478c-.121.12-.24.238-.479.36L.63 8.518c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96zm24 0l-6.587-1.68a1.716 1.716 0 01-.479-.36L36.712.357C36.478.121 36.235.005 36 0c-.235.005-.478.121-.712.356l-3.592 6.122c-.121.12-.24.238-.479.36l-6.587 1.68c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96zm24 0l-6.587-1.68a1.716 1.716 0 01-.479-.36L60.712.357C60.478.121 60.235.005 60 0c-.235.005-.478.121-.712.356l-3.592 6.122c-.121.12-.24.238-.479.36l-6.587 1.68c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 
.232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96zm24 0l-6.587-1.68a1.716 1.716 0 01-.479-.36L84.712.357C84.478.121 84.235.005 84 0c-.235.005-.478.121-.712.356l-3.592 6.122c-.121.12-.24.238-.479.36l-6.587 1.68c-.479.12-.719.601-.6.96 0 .12.121.239.121.239l4.429 5.521c.121.239.241.481.241.601l-.479 7.32c0 .484.36.841.719.841.121 0 .24 0 .36-.119l6.229-2.642c.114 0 .232-.112.351-.117.118.005.236.117.351.117l6.229 2.642c.118.119.237.119.358.119.359 0 .719-.357.719-.841l-.479-7.32c0-.119.12-.361.241-.601l4.429-5.521s.121-.119.121-.239c.12-.36-.121-.84-.6-.96z"></path>
    </svg>
  </span>
</div>

I'm getting the stars rating from the aria-label attribute of the svg element, so what I do is look for the svg element and then get the value for the aria-label attribute.
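
Once the aria-label string has been read, it still has to be turned into a number. A minimal sketch of that step, assuming the label always looks like the single sample above ("4.0 of 5 bubbles"); the helper name parse_rating is hypothetical, and other page locales may word the label differently:

```python
import re


def parse_rating(label):
    """Return (rating, scale) from a label such as '4.0 of 5 bubbles'.

    Assumes the '<number> of <number> ...' wording seen in the question.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)?)\s+of\s+(\d+(?:\.\d+)?)", label)
    if match is None:
        raise ValueError(f"unexpected aria-label format: {label!r}")
    return float(match.group(1)), float(match.group(2))


rating, scale = parse_rating("4.0 of 5 bubbles")
print(rating, scale)
```

Raising on an unexpected format, rather than returning None, makes a changed page layout show up immediately instead of silently producing missing ratings.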

My question is this: if I look for the element using the find_elements_by_class_name method, it works and I get the result I'm looking for:

stars_info = driver.find_elements_by_class_name("TkRkB.d.H0")

However, if I try to find the same element using the find_element_by_xpath method, I get a NoSuchElementException:

stars_info = driver.find_element_by_xpath('//svg[@class="TkRkB.d.H0"]')

Why does this happen? Before anyone asks, I have already tried the xpath expression in different formats (TkRkB d H0 instead of TkRkB.d.H0, swapping the single and double quotes, etc.).

undetected Selenium

1 Answer


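By.CLASS_NAME is meant to take a single class name, so passing several class names through it is not a supported use. The dotted value in your first call most likely works only because the Python bindings turn a class-name lookup into a CSS selector by prefixing a dot, which here yields .TkRkB.d.H0.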

Multiple class names are written differently depending on the locator strategy:

  • Within xpath: exactly as in the HTML DOM, i.e. separated by a single space character, for example:

    //tagname[@class='TkRkB d H0']
    
  • Within css_selector: the class names are joined by the dot . character, for example:

    tagname.TkRkB.d.H0
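
The XPath side of this list can be checked without a browser. The sketch below uses Python's standard xml.etree.ElementTree on a trimmed copy of the markup from the question. It rests on two assumptions: the xmlns attribute is written out by hand (a browser's HTML parser assigns the SVG namespace implicitly), and ElementTree's limited XPath subset stands in for the browser's XPath engine. It shows that a plain svg step matches nothing, that the dotted class string matches nothing, and that the space-separated class string matches once the namespace is taken into account.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Trimmed copy of the question's markup; the path data is omitted and the
# xmlns attribute is added explicitly for the XML parser.
MARKUP = (
    '<div class="drcGn _R MC S4 _a H">'
    '<span class="fyiTP S2">'
    f'<svg xmlns="{SVG_NS}" class="TkRkB d H0" aria-label="4.0 of 5 bubbles"/>'
    "</span></div>"
)

root = ET.fromstring(MARKUP)


def count(path):
    """Number of elements under root that match the given path."""
    return len(root.findall(path))


# A bare 'svg' step ignores the namespace, so nothing matches.
no_namespace = count(".//svg[@class='TkRkB d H0']")
# @class is compared as a literal string, so the dotted form never matches.
dotted_class = count(f".//{{{SVG_NS}}}svg[@class='TkRkB.d.H0']")
# Namespace-qualified tag plus the space-separated class string matches.
spaced_class = count(f".//{{{SVG_NS}}}svg[@class='TkRkB d H0']")

print(no_namespace, dotted_class, spaced_class)
```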
    

Solution

To print the value of the aria-label attribute you can use either of the following locator strategies:

  • Using css_selector:

    print(driver.find_element_by_css_selector("svg.TkRkB.d.H0").get_attribute("aria-label"))
    
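  • Using xpath (the svg element belongs to the SVG namespace, so a plain //svg step does not match it; test the tag name with name() instead):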

    print(driver.find_element_by_xpath("//*[name()='svg' and @class='TkRkB d H0']").get_attribute("aria-label"))
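
Newer Selenium 4 releases no longer ship the find_element_by_* helpers, so the same two lookups may need to go through the generic find_element(by, value) method. A minimal sketch, assuming driver is an already started WebDriver on the hotel page; the helper read_aria_label and the two locator constants are hypothetical names, and the strings "css selector" and "xpath" are the values behind By.CSS_SELECTOR and By.XPATH:

```python
# Locator tuples in (strategy, value) form. The strategy strings are the
# values of By.CSS_SELECTOR and By.XPATH from selenium.webdriver.common.by.
CSS_LOCATOR = ("css selector", "svg.TkRkB.d.H0")
XPATH_LOCATOR = ("xpath", "//*[name()='svg' and @class='TkRkB d H0']")


def read_aria_label(driver, locator):
    """Find one element with the given locator and return its aria-label."""
    by, value = locator
    element = driver.find_element(by, value)
    return element.get_attribute("aria-label")
```

With a live session this would be called as print(read_aria_label(driver, CSS_LOCATOR)) or print(read_aria_label(driver, XPATH_LOCATOR)). The class names on the page look auto-generated, so they may change between site releases.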
    
undetected Selenium