
I tried to automate paging through the Amazon search results for "novels" using Selenium. It works sometimes and fails at other times, and I cannot find the mistake in the code. At one point it worked and went through 13 of the 20 pages, but since then it has not worked properly. So far it has never gone through all 20 pages.

from selenium import webdriver
from time import sleep
from bs4 import BeautifulSoup


class App:

    def __init__(self, path=r'F:\Imaging'):

        self.path = path
        self.driver = webdriver.Chrome(r'F:\chromedriver')
        self.driver.get('https://www.amazon.in/s/ref=sr_pg_1?rh=i%3Aaps%2Ck%3Anovels&keywords=novels&ie=UTF8&qid=1510727563')
        sleep(1)
        self.scroll_down()
        self.driver.close()

    def scroll_down(self):

        self.driver.execute_script("window.scrollTo(0,5500);")
        sleep(1)
        load_more = self.driver.find_element_by_xpath('//span[@class="pagnRA"]/a[@title="Next Page"]')
        load_more.click()
        sleep(2)

        for value in range(2,19):

            print(self.driver.current_url)
            sleep(3)
            self.driver.execute_script("window.scrollTo(0,5500);")
            sleep(2)
            load_more = self.driver.find_element_by_xpath('//span[@class="pagnRA"]/a[@title="Next Page"]')
            load_more.click()

        sleep(3)


if __name__=='__main__':
    app=App()

The output I get from this code is:

    C:\Users\Akhil\AppData\Local\Programs\Python\Python36-32\python.exe C:/Users/Akhil/Scrape/amazon.py
https://www.amazon.in/s/ref=sr_pg_2/257-8503487-3570721?rh=i%3Aaps%2Ck%3Anovels&page=2&keywords=novels&ie=UTF8&qid=1510744188
https://www.amazon.in/s/ref=sr_pg_3?rh=i%3Aaps%2Ck%3Anovels&page=3&keywords=novels&ie=UTF8&qid=1510744197
https://www.amazon.in/s/ref=sr_pg_4?rh=i%3Aaps%2Ck%3Anovels&page=4&keywords=novels&ie=UTF8&qid=1510744204
Traceback (most recent call last):
  File "C:/Users/Akhil/Scrape/amazon.py", line 31, in <module>
    app=App()
  File "C:/Users/Akhil/Scrape/amazon.py", line 11, in __init__
    self.scroll_down()
  File "C:/Users/Akhil/Scrape/amazon.py", line 26, in scroll_down
    load_more.click()
  File "C:\Users\Akhil\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 80, in click
    self._execute(Command.CLICK_ELEMENT)
  File "C:\Users\Akhil\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 501, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\Akhil\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 308, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Akhil\AppData\Local\Programs\Python\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Element <a title="Next Page" id="pagnNextLink" class="pagnNext" href="/gp/search/ref=sr_pg_5?rh=i%3Aaps%2Ck%3Anovels&amp;page=5&amp;keywords=novels&amp;ie=UTF8&amp;qid=1510744210">...</a> is not clickable at point (809, 8). Other element would receive the click: <a href="/gp/prime/ref=nav_prime_try_btn/257-8503487-3570721" class="nav-a nav-a-2" data-ux-mouseover="true" id="nav-link-prime" tabindex="26">...</a>
  (Session info: chrome=62.0.3202.94)
  (Driver info: chromedriver=2.33.506120 (e3e53437346286c0bc2d2dc9aa4915ba81d9023f),platform=Windows NT 10.0.15063 x86_64)


Process finished with exit code 1

How can I solve this problem?

Ratmir Asanov
  • The problem is caused by two things: the delay and the selector. No scrolling is needed at all in this case. If you replace the scroll-and-click lines in your script with `driver.find_element_by_css_selector(".pagnNextArrow").click(); time.sleep(3)`, it will go through all the pages without any problem, right up to the last page. Thanks. – SIM Nov 16 '17 at 13:02

3 Answers


The error means that the Next Page element is not visible or not clickable at that moment. You can either wait for the element with an explicit wait, or put the .click() in a try/except block to detect when it fails.

The likely causes are that your target has legitimately run out of next pages (you have seen them all), that the page is still loading, or that the format of the next link has changed.
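
As a rough sketch of the try/except approach, the helper below wraps the click and hands any failure to a callback instead of letting the script crash. The names `click_or_report` and `on_failure` are hypothetical, and `FakeElement` only stands in for a Selenium element so the snippet runs without a browser.

```python
def click_or_report(element, on_failure, errors=(Exception,)):
    """Click element; on a listed error, call on_failure(exc) and return False."""
    try:
        element.click()
    except errors as exc:
        on_failure(exc)
        return False
    return True


class FakeElement:
    """Stand-in for a Selenium WebElement (assumption, for demonstration only)."""

    def __init__(self, clickable):
        self.clickable = clickable

    def click(self):
        if not self.clickable:
            raise RuntimeError("element is not clickable")


failures = []
print(click_or_report(FakeElement(True), failures.append))   # True
print(click_or_report(FakeElement(False), failures.append))  # False
print(len(failures))                                          # 1
```

With a real driver, `errors` could be narrowed to `selenium.common.exceptions.WebDriverException`, and `on_failure` could call `driver.save_screenshot("click_failed.png")` to record what the page looked like at the moment of failure.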

srowland
  • Can you please post the explicit wait code for my case? It's a bit confusing. –  Nov 15 '17 at 11:24
  • Even with the explicit wait, the same problem occurs. The Next Page element is present on every page up to the 19th; I verified that too. –  Nov 15 '17 at 11:35
  • It's telling you that another element would receive the click - any chance an advert or similar has popped up over the top? You could always take a screenshot at the point of failure and see what it shows - I find this very helpful when debugging selenium, as it can be quite hard to guess what is really happening. – srowland Nov 15 '17 at 11:54
  • Can you explain that a bit more clearly? –  Nov 15 '17 at 12:03

Try the following code:

load_more = ui.WebDriverWait(driver, timeout).until(EC.element_to_be_clickable((By.XPATH, '//span[@class="pagnRA"]/a[@title="Next Page"]')))
driver.execute_script("arguments[0].scrollIntoView(true);", load_more)
load_more.click()

Here, timeout is the maximum time in seconds to wait for the element to become clickable.

Also, import the following at the beginning of the script:

from selenium.webdriver.support import ui
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
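
For context, an explicit wait is essentially a polling loop. The sketch below is a simplified stand-in for what `WebDriverWait(...).until(...)` does, not Selenium's actual implementation; `wait_until` and its parameters are hypothetical names. Note that `element_to_be_clickable` only checks that the element is visible and enabled, so the wait can succeed even while another element overlaps the link.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # pause before checking again


# Demonstration with a condition that becomes true on its third call.
calls = {"count": 0}


def ready_on_third_call():
    calls["count"] += 1
    return "ready" if calls["count"] >= 3 else None


print(wait_until(ready_on_third_call, timeout=5.0, poll=0.01))  # ready
```

The condition is always checked at least once, even with a zero timeout, and the value it returns is passed back to the caller, which is how the element ends up in `load_more` above.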
halfer
Ratmir Asanov
  • I did this, but I am getting the same error. –  Nov 15 '17 at 13:06
  • @Ranger619, do you get the error on the second click (inside the loop)? – Ratmir Asanov Nov 15 '17 at 13:16
  • No, it occurs on random pages: once on the 4th page, once on the 10th, and so on. It has never gone through all the pages. –  Nov 15 '17 at 13:17
  • @Ranger619, check manually on different pages whether the XPath changes. Or try increasing the sleep. – Ratmir Asanov Nov 15 '17 at 13:26
  • The XPath is not changing; it is the same on all pages. But the page scrolls unevenly even though the same scroll offset is specified. I increased the sleep as well, with the same error. Whenever the page scrolls past the "Next Page" element, the code crashes. Does the page have to be scrolled exactly to the position of the element for the click to move to the next page? –  Nov 15 '17 at 13:35
  • @Ranger619, I have updated my answer. Please try it. – Ratmir Asanov Nov 15 '17 at 13:38
  • Now it crashes on the 2nd page itself. I am getting this error: selenium.common.exceptions.WebDriverException: Message: unknown error: Element ... is not clickable at point (784, 0). Other element would receive the click: (Session info: chrome=62.0.3202.94) (Driver info: chromedriver=2.33.506120 (e3e53437346286c0bc2d2dc9aa4915ba81d9023f),platform=Windows NT 10.0.15063 x86_64) –  Nov 15 '17 at 13:55
  • I got the correct result, @RatmirAsanov. I am answering my own question with the working code. Thanks for your help. I made small changes to your code. –  Nov 15 '17 at 14:18
  • @Ranger619, if my answer was helpful, please mark it as accepted. Thanks. – Ratmir Asanov Nov 15 '17 at 14:19

I finally got the correct result by making small modifications to the answer given by @RatmirAsanov.

Please see the code below. With it, the script goes through all the pages without failing.

from selenium import webdriver
from time import sleep
from bs4 import BeautifulSoup
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

class App:
    def __init__(self, path=r'F:\Imaging'):
        self.path = path
        self.driver = webdriver.Chrome(r'F:\chromedriver')
        self.driver.get('https://www.amazon.in/s/ref=sr_pg_1?rh=i%3Aaps%2Ck%3Anovels&keywords=novels&ie=UTF8&qid=1510727563')
        sleep(1)
        self.scroll_down()
        self.driver.close()

    def scroll_down(self):
        sleep(3)
        self.driver.execute_script("window.scrollTo(0,5450);")
        sleep(3)
        load_more = self.driver.find_element_by_xpath('//span[@class="pagnRA"]/a[@title="Next Page"]')
        load_more.click()
        sleep(3)
        for value in range(2,19):
            print(self.driver.current_url)
            sleep(5)
            self.driver.execute_script("window.scrollTo(0,5500);")
            sleep(3)
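            # Wait until the link is visible and enabled, then click it through
            # JavaScript so that an overlapping element (the error names a
            # navigation link) cannot receive the click instead.
            load_more = WebDriverWait(self.driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//span[@class='pagnRA']/a[@title='Next Page']")))
            self.driver.execute_script("arguments[0].click();", load_more)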
            sleep(3)
        sleep(3)


if __name__=='__main__':
    APP=App()
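
One possible way to avoid the hard-coded `range(2,19)` is to keep going until no "Next Page" link is found. The sketch below assumes a driver object that offers `current_url` and `execute_script`, as Selenium's does; `visit_all_pages`, `find_next_link`, `fake_next_link`, and `FakeDriver` are hypothetical names, and the fake driver exists only so the loop can run without a browser.

```python
def visit_all_pages(driver, find_next_link, max_pages=50):
    """Follow "next" links until none is left; return the URLs that were visited."""
    visited = [driver.current_url]
    while len(visited) < max_pages:
        link = find_next_link(driver)
        if link is None:  # no "Next Page" link: this is the last page
            break
        # Click through JavaScript, as in the answer above, so the click targets
        # the element itself rather than a screen position.
        driver.execute_script("arguments[0].click();", link)
        visited.append(driver.current_url)
    return visited


class FakeDriver:
    """Stand-in for a Selenium driver with a fixed number of pages (assumption)."""

    def __init__(self, total_pages):
        self.page = 1
        self.total_pages = total_pages

    @property
    def current_url(self):
        return f"https://example.com/s?page={self.page}"

    def execute_script(self, script, *args):
        self.page += 1  # pretend the click loaded the next page


def fake_next_link(driver):
    """Return a placeholder link while more pages remain, else None."""
    return "next" if driver.page < driver.total_pages else None


urls = visit_all_pages(FakeDriver(total_pages=4), fake_next_link)
print(len(urls))   # 4
print(urls[-1])    # https://example.com/s?page=4
```

With Selenium 3, as used in this thread, `find_next_link` could return the first item of `driver.find_elements_by_xpath(...)`, or `None` when that list is empty. In a real run, a wait for the new page to load is still needed after each click; the answer uses `sleep` for that.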
Ratmir Asanov