python - selenium scraping rotten tomatoes for audience score

Question

I'm trying to scrape the audience score from Rotten Tomatoes. I was able to get the reviews, but I'm not sure how to use Selenium to get the "audiencescore" value.

Source:

<score-board
audiencestate="upright"
audiencescore="96"
class="scoreboard"
rating="R"
skeleton="panel"
tomatometerstate="certified-fresh"
tomatometerscore="92"
data-qa="score-panel"
                >
<h1 slot="title" class="scoreboard__title" data-qa="score-panel-movie-title">Pulp Fiction</h1>
<p slot="info" class="scoreboard__info">1994, Crime/Drama, 2h 33m</p>
<a slot="critics-count" href="/m/pulp_fiction/reviews?intcmp=rt-scorecard_tomatometer-reviews" class="scoreboard__link scoreboard__link--tomatometer" data-qa="tomatometer-review-count">110 Reviews</a>
<a slot="audience-count" href="/m/pulp_fiction/reviews?type=user&amp;intcmp=rt-scorecard_audience-score-reviews" class="scoreboard__link scoreboard__link--audience" data-qa="audience-rating-count">250,000+ Ratings</a>
<div slot="sponsorship" id="tomatometer_sponsorship_ad"></div>
                </score-board>

Code:

from selenium import webdriver

driver = webdriver.Firefox()
url = 'https://www.rottentomatoes.com/m/pulp_fiction'
driver.get(url)

print(driver.find_element_by_css_selector('a[slot=audience-count]').text)

Md. Fazlul Hoque · Answer 1 · 2022-06-10T12:03:37.270

1

The audiencescore value is an attribute of the score-board element, not the value of a text node, so .text cannot return it. You have to call get_attribute() on the element after selecting it with the right locator. The following expression works.

# import
from selenium.webdriver.common.by import By

print(driver.find_element(By.CSS_SELECTOR, '#topSection score-board').get_attribute('audiencescore'))
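
To make the attribute-versus-text distinction concrete without launching a browser, here is a minimal sketch that swaps Selenium for the standard library's html.parser and feeds it a trimmed copy of the markup from the question. The ScoreBoardParser class and SNIPPET name are only illustrative and are not part of Selenium or of the page.

```python
from html.parser import HTMLParser

# Trimmed copy of the markup shown in the question.
SNIPPET = """
<score-board audiencescore="96" class="scoreboard" tomatometerscore="92">
<h1 slot="title">Pulp Fiction</h1>
<a slot="audience-count">250,000+ Ratings</a>
</score-board>
"""


class ScoreBoardParser(HTMLParser):
    """Records the attributes of <score-board> and every non-blank text node."""

    def __init__(self):
        super().__init__()
        self.attributes = {}
        self.texts = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "score-board":
            self.attributes = dict(attrs)

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


parser = ScoreBoardParser()
parser.feed(SNIPPET)
parser.close()

print(parser.texts)                        # ['Pulp Fiction', '250,000+ Ratings']
print(parser.attributes["audiencescore"])  # 96
```

The score never shows up among the text nodes, which is why .text on the audience-count link only gives "250,000+ Ratings" and the value has to be read as an attribute.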

edited Jun 10 '22 at 12:03

answered Jun 10 '22 at 11:29


Wonka · Answer 2 · 2022-06-10T11:11:41.227

0

Try this:

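1. Get the score-board element.

2. Read the audiencescore attribute from that element.

from selenium.webdriver.common.by import By  # find_element_by_css_selector was removed in Selenium 4.3

audiencescore = driver.find_element(By.CSS_SELECTOR, 'score-board').get_attribute('audiencescore')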

edited Jun 10 '22 at 11:11

answered Jun 10 '22 at 10:50


undetected Selenium · Accepted Answer · 2022-06-10T11:46:27.270

You were close. The score is stored in the audiencescore attribute (96 for this page), not in the element's text. To read it reliably, wait with WebDriverWait until visibility_of_element_located() succeeds for the score-board element, then call get_attribute("audiencescore") on it. You can use either of the following locator strategies:

Using CSS_SELECTOR:

driver.get("https://www.rottentomatoes.com/m/pulp_fiction")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "score-board.scoreboard"))).get_attribute("audiencescore"))

Using XPATH:

driver.get("https://www.rottentomatoes.com/m/pulp_fiction")
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//score-board[@class='scoreboard']"))).get_attribute("audiencescore"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Console Output:
```
96
```
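
Note that get_attribute() returns a string (or None when the attribute is missing), so the 96 above is "96" until you convert it. Below is a small sketch of a conversion helper. parse_score, read_audience_score and FakeElement are hypothetical names introduced here, and the empty-attribute case is an assumption about pages that have no score yet, not something confirmed for this site. FakeElement only stands in for a WebElement so the snippet runs without a browser.

```python
def parse_score(raw):
    """Convert an attribute value such as "96" to int; return None if unusable."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        # TypeError: raw is None (attribute absent).
        # ValueError: raw is empty or not a number.
        return None


def read_audience_score(element):
    """`element` is any object with get_attribute(), e.g. a Selenium WebElement."""
    return parse_score(element.get_attribute("audiencescore"))


class FakeElement:
    """Stand-in for a WebElement so the sketch runs without a browser."""

    def __init__(self, attrs):
        self._attrs = attrs

    def get_attribute(self, name):
        return self._attrs.get(name)


print(read_audience_score(FakeElement({"audiencescore": "96"})))  # 96
print(read_audience_score(FakeElement({"audiencescore": ""})))    # None
print(read_audience_score(FakeElement({})))                       # None
```

With a real driver you would pass the element returned by WebDriverWait(...).until(...) to read_audience_score() instead of a FakeElement.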

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
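
One caveat on the two locator strategies: the CSS selector score-board.scoreboard matches when scoreboard is one of the element's class tokens, while the XPath predicate @class='scoreboard' compares the whole attribute string. For the markup in the question they select the same element, but they would diverge if another class were ever added. Here is a browser-free sketch of that difference using xml.etree.ElementTree in place of Selenium. The second element and its extra "wide" class are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the second element carries an extra class token.
doc = ET.fromstring(
    "<div>"
    "<score-board class='scoreboard' audiencescore='96'/>"
    "<score-board class='scoreboard wide' audiencescore='80'/>"
    "</div>"
)

# XPath-style predicate: the whole class attribute must equal 'scoreboard'.
exact = doc.findall(".//score-board[@class='scoreboard']")

# CSS-style class match: 'scoreboard' only has to be one of the tokens.
by_token = [
    el for el in doc.iter("score-board")
    if "scoreboard" in el.get("class", "").split()
]

exact_scores = [el.get("audiencescore") for el in exact]
token_scores = [el.get("audiencescore") for el in by_token]

print(exact_scores)  # ['96']
print(token_scores)  # ['96', '80']
```

If you need token-style matching in XPath itself, the usual idiom is contains(concat(' ', normalize-space(@class), ' '), ' scoreboard ').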
