I am new to web scraping, so please pardon any silly mistakes.
I am working on a project for which I need a list of movies as my data. I am trying to collect the data from Wikipedia using web scraping.
Here is my code:
def MoviesList(years, driver):
    for year in years:
        driver.implicitly_wait(150)
        year.click()
        table = driver.find_element_by_xpath('/html/body/div[3]/div[3]/div[5]/div[1]/table[2]/tbody')
        movies = table.find_elements_by_xpath('tr/td[1]/i/a')
        for movie in movies:
            print(movie.text)
        driver.back()

years = driver.find_elements_by_partial_link_text('List of Bollywood films of')
del years[:2]
MoviesList(years, driver)
I get the list of year links from this page and store it in the years variable. Then I loop through all the years and try to extract the top 10 movies of each year. See this for reference.
Output:
Tanhaji
Baaghi 3
...
...
Panga
# Top movies of the year 2020
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document (from line year.click())
Expected Output:
Tanhaji
...
...
War # First movie of the year 2019
Saaho
...
...
Vikram Urvashi # Last movie of the year 1920
# Top movies of each year from 2020 to 1920
I have already referred to this and this question, but to no avail. I have tried an explicit wait too, but it didn't work.
I understand when this error occurs, but I don't know how to handle it other than by adding an implicit or explicit wait.
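The only alternative I can think of is to stop reusing the link elements after navigating: read each link's href up front, then open the pages with driver.get() instead of click() and back(). Below is a minimal sketch of that idea. The names collect_year_urls and print_movies_by_url are hypothetical, the XPaths are carried over unchanged from my code above (I am assuming they match every year page), and this is untested against the live Wikipedia pages.

```python
# Locator strategies as plain strings; these are the values behind
# Selenium's By.PARTIAL_LINK_TEXT and By.XPATH constants.
PARTIAL_LINK_TEXT = "partial link text"
XPATH = "xpath"

# Same absolute XPath as in the code above (assumed valid for every year page).
TABLE_XPATH = '/html/body/div[3]/div[3]/div[5]/div[1]/table[2]/tbody'


def collect_year_urls(driver, link_text='List of Bollywood films of', skip=2):
    """Read the href of every year link while the elements are still attached."""
    links = driver.find_elements(PARTIAL_LINK_TEXT, link_text)
    # Plain strings stay valid after navigation; WebElement references do not.
    return [link.get_attribute('href') for link in links[skip:]]


def print_movies_by_url(driver, urls):
    """Open each year page directly, so nothing is clicked after navigating."""
    titles = []
    for url in urls:
        driver.get(url)
        table = driver.find_element(XPATH, TABLE_XPATH)
        for movie in table.find_elements(XPATH, 'tr/td[1]/i/a'):
            print(movie.text)
            titles.append(movie.text)
    return titles
```

With an already open driver on the index page, this would be called as print_movies_by_url(driver, collect_year_urls(driver)). The skip=2 default mirrors the del years[:2] in my original code.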
What am I doing wrong? How can I improve this code to get the desired output?
Any help would be much appreciated.