27

I am trying to scroll to the end of a page so that all the data becomes visible and I can extract it. I looked for a command to do this; I found one for Java (driver.executeScript) but not for Python. Right now I am making the computer press the END key a thousand times:

from selenium.webdriver.common.keys import Keys

i = 0
while i < 1000:
    driver.find_element_by_tag_name('body').send_keys(Keys.END)
    i += 1

I also tried driver.execute_script("window.scrollTo(0, document.body.scrollHeight);"), but it only scrolls to the end of the page as currently loaded, which is the same thing the END key does. Once it reaches the bottom, the next batch of content loads, but the page does not scroll again.

I assume there is a cleaner way to do this.

How do I scroll to the end of the page using selenium in Python?

jww
psr
  • See if this helps: [http://stackoverflow.com/a/27760083/4193730](http://stackoverflow.com/a/27760083/4193730) – Subh Sep 04 '15 at 06:25
  • possible duplicate of [How can I scroll a web page using selenium webdriver in python?](http://stackoverflow.com/questions/20986631/how-can-i-scroll-a-web-page-using-selenium-webdriver-in-python) – Kavan Sep 04 '15 at 06:30
  • No, this doesn't work, because `driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")` only scrolls to the end of the page as currently loaded, which is the same thing the END key does. Once it reaches the bottom, the next content loads, but the page doesn't scroll again. – psr Sep 04 '15 at 06:33
  • Is that page lazyloading content? Do you page down, it loads another chunk of content, page down, repeat? Or is it just a really long page? CTRL+END should jump to the very end of the page in one shot. – JeffC Sep 04 '15 at 18:36
  • No, `CTRL + END` does the same thing as END. – psr Sep 04 '15 at 20:02

5 Answers

30

Well I finally figured out a solution:

import time

# Scroll to the bottom and return the current page height.
scroll_js = "window.scrollTo(0, document.body.scrollHeight);var lenOfPage=document.body.scrollHeight;return lenOfPage;"

lenOfPage = driver.execute_script(scroll_js)
match = False
while not match:
    lastCount = lenOfPage
    time.sleep(3)  # give lazy-loaded content time to arrive
    lenOfPage = driver.execute_script(scroll_js)
    if lastCount == lenOfPage:
        match = True  # height did not change, so nothing more was loaded
psr
  • It's quite slow though, isn't it possible to speed it up somehow? – Sebastian Nielsen Feb 15 '17 at 18:57
  • @SebastianNielsen Probably late, but just lower the time.sleep() delay as far as you can without being so fast that the browser or the site thinks you are a bot. 0.5 sec seems to work well. – user3326078 Jul 25 '18 at 20:40
  • @user3326078 Not very practical, because internet speed varies. The lowest usable sleep time depends on the internet speed. It would be awesome if I could figure out a solution that didn't depend on sleep, something along the lines of waiting for the page to load and then scrolling again. – Sebastian Nielsen Jul 25 '18 at 20:46
  • @SebastianNielsen Yea I agree wish there was a more robust/dynamic solution :/ – user3326078 Jul 25 '18 at 20:47
  • I just got an idea. What if you scroll to the bottom of the page and wait for the DOM's height to increase? When it increases, the site must have loaded more content, so we are no longer at the bottom. This is repeated until the website takes more than x seconds to increase in height after we reach the bottom. – Sebastian Nielsen Jul 25 '18 at 20:52
  • I don't quite understand the `match=False` and friends. What part is the actual solution? – jww Oct 31 '19 at 23:06
  • @jww The actual solution is `driver.execute_script`. Each time that is executed the page will scroll down. However, I don't understand how the OP mentioned that repeated execution of this script doesn't scroll his page. But in his solution he's doing exactly that and it works for him. As for the `match=False` part, that is just a flag to keep executing the script until you find that the total height returned remains constant. – Mugen Dec 24 '19 at 02:15
  • @psr You mentioned in the OP that `driver.execute_script` doesn't work for you more than once. And yet in the solution you relentlessly call it. How come it works for you now but not earlier? – Mugen Dec 24 '19 at 02:17
  • It works because I am talking about lazy loaded pages here. If I wait, then I am waiting for the page content to load and a scroll to actually show up. That is when the `driver.execute_script` starts working. – psr Dec 26 '19 at 22:49
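
The idea raised in the comments above (wait for the page height to grow instead of sleeping for a fixed time, and stop once it has not grown for x seconds) could be sketched roughly as follows. This is only a sketch under assumptions: the function name `scroll_until_stable` and the `timeout` and `poll` parameters are made up for illustration, and the only Selenium call it relies on is `driver.execute_script`.

```python
import time


def scroll_until_stable(driver, timeout=10.0, poll=0.25):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object exposing Selenium's execute_script(js) method.
    `timeout` and `poll` are illustrative values, not tuned recommendations.
    Returns the final document height in pixels.
    """
    get_height = "return document.body.scrollHeight;"
    scroll_down = "window.scrollTo(0, document.body.scrollHeight);"

    height = driver.execute_script(get_height)
    while True:
        driver.execute_script(scroll_down)
        deadline = time.monotonic() + timeout
        new_height = height
        # Poll until the height grows or the timeout expires.
        while time.monotonic() < deadline:
            new_height = driver.execute_script(get_height)
            if new_height > height:
                break
            time.sleep(poll)
        if new_height <= height:
            # No growth within `timeout`: assume the end of the page was reached.
            return height
        height = new_height
```

On a fast connection each round finishes as soon as new content arrives, so only the final round pays the full `timeout`. Selenium's `WebDriverWait` could probably replace the inner polling loop; it is left out here to keep the sketch free of extra imports.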
20

This can be done in one line by scrolling to document.body.scrollHeight:

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
Alex
  • This would not work on pages such as Facebook, which keep increasing the document's height when you reach the bottom of the page. – Sebastian Nielsen Jul 25 '18 at 20:47
  • Fair point @SebastianNielsen, the solution is to use a while loop and break when the document height stops changing – Alex Jul 25 '18 at 21:26
15

None of these were working for me, but the below solution did:

import time

driver.get("https://www.youtube.com/user/teachingmensfashion/videos")


def scroll_to_bottom(driver):

    old_position = 0
    new_position = None

    while new_position != old_position:
        # Get old scroll position
        old_position = driver.execute_script(
                ("return (window.pageYOffset !== undefined) ?"
                 " window.pageYOffset : (document.documentElement ||"
                 " document.body.parentNode || document.body).scrollTop;"))
        # Sleep and Scroll
        time.sleep(1)
        driver.execute_script((
                "var scrollingElement = (document.scrollingElement ||"
                " document.body);scrollingElement.scrollTop ="
                " scrollingElement.scrollHeight;"))
        # Get new position
        new_position = driver.execute_script(
                ("return (window.pageYOffset !== undefined) ?"
                 " window.pageYOffset : (document.documentElement ||"
                 " document.body.parentNode || document.body).scrollTop;"))

scroll_to_bottom(driver)
user53558
4

You can use scrollingElement with scrollTop and scrollHeight to scroll to the end of a page.

driver.execute_script("var scrollingElement = (document.scrollingElement || document.body);scrollingElement.scrollTop = scrollingElement.scrollHeight;")

References :

  1. Scroll Automatically to the Bottom of the Page
  2. Document.scrollingElement - Web APIs | MDN
  3. Element.scrollHeight - Web APIs | MDN
  4. Element.scrollTop - Web APIs | MDN
0x48piraj
0

Since no link to the website was provided, I am going to assume that there is some kind of See More/Load More clickable element on the page. Here is what I like to do, and it's pretty simple.

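import time
from selenium.common.exceptions import (NoSuchElementException,
                                        StaleElementReferenceException)

count = 10000  # safety cap on the number of attempts
while count > 1:
    count -= 1
    try:
        driver.find_element_by_xpath('//*[@id="load_more"]').click()
    except StaleElementReferenceException:
        pass  # the button was re-rendered; look it up again on the next pass
    except NoSuchElementException:
        break  # assumption: no button left means everything is loaded
    time.sleep(2)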
Nitin Kumar
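
A variant of the "Load More" approach above that avoids exception handling for the missing-button case might look like the sketch below. It rests on assumptions: `click_load_more` and its `button_id`, `max_clicks`, and `pause` parameters are hypothetical names, the `load_more` id is taken from the answer and may not match a real page, and the string `"id"` is passed directly because it is the value of Selenium's `By.ID`. Stale or non-interactable buttons are not handled here.

```python
import time


def click_load_more(driver, button_id="load_more", max_clicks=10000, pause=2.0):
    """Click the button with the given id until it is no longer on the page.

    `driver` is any object exposing Selenium's find_elements(by, value) method.
    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        # find_elements returns an empty list instead of raising when nothing matches.
        buttons = driver.find_elements("id", button_id)  # "id" is the value of By.ID
        if not buttons:
            break  # assumption: no button left means everything is loaded
        buttons[0].click()
        clicks += 1
        time.sleep(pause)  # fixed pause so the next batch of content can load
    return clicks
```

Looking the button up again on every pass keeps the window for a stale reference small, though a click can still fail if the page re-renders between the lookup and the click.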