I have a list of film pages, all from Box Office Mojo (boxofficemojo), that I want to loop through to extract each film's genres.
An example link is the following: https://www.boxofficemojo.com/release/rl3829564929/
In the browser's inspector, the part of the page that I want to extract is structured like this:
<div class = "a-section a-spacing-none">
<span>Genres</span>
</div>
<span>
Action Adventure Thriller
</span>
When I run the following code:
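from selenium import webdriver
# Raw string so the backslashes in the Windows path are not read as escape sequences
driver = webdriver.Chrome(r"C:\SeleniumDrivers\chromedriver.exe")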
driver.get("https://www.boxofficemojo.com/release/rl3829564929/")
driver.implicitly_wait(3)
my_element = driver.find_element_by_xpath("/html/body/div[1]/main/div/div[3]/div[4]/div[7]/span[2]")
my_element.text
I get the following result:
'Action Adventure Thriller'
which is the desired result for this particular movie. However, when I go to other movies, the absolute XPath of the genres element is different, so I cannot reuse the same locator automatically.
The ideal solution would loop through the pages and extract each film's genres regardless of where the Genres element sits in that page's structure.
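
One possible direction, as a rough sketch rather than a confirmed fix: anchor the XPath on the "Genres" label text instead of on the element's absolute position. The constant `GENRES_XPATH` and the helper `extract_genres` are hypothetical names introduced here for illustration. The sketch assumes that the genre text is always in the first `<span>` that follows the label `<span>` in document order, which matches the snippet above but has not been checked against every film page.

```python
# Assumption: the genre value is the first <span> after the <span> whose text is "Genres".
# The `following` axis matches it whether it is a sibling of the label or sits
# right after the label's parent <div>.
GENRES_XPATH = "//span[normalize-space()='Genres']/following::span[1]"


def extract_genres(driver, urls):
    """Visit each URL and return {url: genre text, or None if no match}."""
    results = {}
    for url in urls:
        driver.get(url)
        # "xpath" is the string value of selenium's By.XPATH.
        # find_elements returns an empty list instead of raising when nothing matches,
        # so a page without a Genres row does not stop the loop.
        matches = driver.find_elements("xpath", GENRES_XPATH)
        results[url] = matches[0].text.strip() if matches else None
    return results
```

With the `driver` created as in the code above, this would be called as `extract_genres(driver, list_of_urls)`, where `list_of_urls` stands for the list of film pages. Pages where the label is missing come back as `None`, so they can be inspected separately.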