0

My goal is to reach each movie's IMDb page from a different website and get the movie's genres.

This is the main page: driver.get("https://sfy.ru/scripts") # main website

NOTE: You might get a "certificate is not valid" error. If so, set your computer's date to 07/12/2020.

Here is my code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

driver.get("https://sfy.ru/scripts") #main site

i = 4
driver.find_element_by_xpath("/html/body/div/div[2]/div[1]/p[{}]/a".format(i)).click()

driver.find_element_by_link_text("More info about this movie on IMDb.com").click()


genres = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//h4[.='Genres:']/following-sibling::a")))

aa = (", ".join([genre.text for genre in genres]))

print(aa)

ERROR:

---------------------------------------------------------------------------
TimeoutException                          Traceback (most recent call last)
<ipython-input-13-e29dcd0a9a9b> in <module>
     13 
     14 
---> 15 genres = WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//h4[.='Genres:']/following-sibling::a")))
     16 
     17 aa = (", ".join([genre.text for genre in genres]))

~\Anaconda3\lib\site-packages\selenium\webdriver\support\wait.py in until(self, method, message)
     78             if time.time() > end_time:
     79                 break
---> 80         raise TimeoutException(message, screen, stacktrace)
     81 
     82     def until_not(self, method, message=''):

TimeoutException: Message: 

The purpose of using this code is:

i = 4
driver.find_element_by_xpath("/html/body/div/div[2]/div[1]/p[{}]/a".format(i)).click()

The main page (https://sfy.ru/scripts) lists movies. Each movie page has a redirect link (driver.find_element_by_link_text("More info about this movie on IMDb.com").click()) that I use to reach the IMDb page and get the film's genres.

I need a loop to get the genres of all the movies, which is why I index the link with i.

What should I change in my code?


2 Answers

1

When you click the link to get more info about the movie:

driver.find_element_by_link_text("More info about this movie on IMDb.com").click()

This opens a new window in Chrome, so your driver has to switch to that new window. Try this:

driver.switch_to.window(driver.window_handles[1])

The rest of your code should work.
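
Hard-coding index 1 assumes the new window is already open and is the second handle. A minimal sketch of a more defensive switch follows, assuming the IMDb link opens exactly one new window. The helper name `switch_to_new_window` and its `timeout` and `poll` parameters are hypothetical, not part of Selenium. It only relies on `driver.window_handles` and `driver.switch_to.window()`.

```python
import time


def switch_to_new_window(driver, handles_before, timeout=10.0, poll=0.5):
    """Switch to the first window handle that is not in handles_before.

    Hypothetical helper: polls driver.window_handles until a new handle
    shows up, switches to it, and returns it. Raises TimeoutError if no
    new window appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Any handle we did not see before the click belongs to a new window.
        new_handles = [h for h in driver.window_handles if h not in handles_before]
        if new_handles:
            driver.switch_to.window(new_handles[0])
            return new_handles[0]
        if time.monotonic() >= deadline:
            raise TimeoutError("no new window appeared after the click")
        time.sleep(poll)
```

To use it, store `handles_before = driver.window_handles` just before clicking the IMDb link, then call `switch_to_new_window(driver, handles_before)` before waiting for the genres.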

0

Instead of presence_of_all_elements_located(), you have to induce WebDriverWait for visibility_of_all_elements_located(). You can use the following locator strategy:

  • Using XPATH:

    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//h4[.='Genres:']/following-sibling::a")))])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
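
Once the elements are located, their text still has to be turned into a clean list. As far as I know, `.text` returns only the rendered text of an element, so an element that is not displayed yields an empty string. A small sketch of a cleanup step follows, assuming each located element exposes a `.text` attribute as Selenium WebElements do. The name `genre_names` is hypothetical.

```python
def genre_names(elements):
    """Return the non-empty, de-duplicated genre names in page order.

    Hypothetical helper: `elements` is any iterable of objects with a
    `.text` attribute, such as the list returned by WebDriverWait(...).until(...).
    """
    names = []
    for element in elements:
        name = element.text.strip()
        # Skip elements without visible text and repeated genre links.
        if name and name not in names:
            names.append(name)
    return names
```

The result can be joined as in the question, for example `", ".join(genre_names(genres))`.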
    