3

I'm checking a python library: requests-html. Looks interesting, easy and clear scraping. However, I'm not sure how to render a page with infinite scrolling.

From their documentation I understand that I should render a page with special attribute (scrolldown). I'm trying but I do not know how exactly. I know how to use selenium to handle infinite scroll, but I wonder whether it is possible with requests-html.

from requests_html import  HTML, HTMLSession

page1 = session.get(url1)
page1.html.render( scrolldown=5,sleep=3)
html = HTML(html=page1.text)
noticeName = html.find('h2.noticeName')
for element in noticeName:
    print(element.text)

It finds 10 elements from 13. 10 is visible without scrolling (and loading new content because of infinite scroll).

Maciek
  • 33
  • 1
  • 8

2 Answers2

0

scrolldown=5 means scroll 5 pixel down, is your monitor that small?? or vm height that small?? now give it a bigger value like height of the screen with sleep or 2000 or 5000 without sleep

And it will not give you uniquely next elements, it will give you exactly all elements from the starting.

I will add some sample code soon.

Maifee Ul Asad
  • 3,992
  • 6
  • 38
  • 86
0

I hope you've solved this already, but I'll post this for any other curious souls.

In most cases, if you want to infinite scroll, scrolldown needs to be a large value because it is based on the number of times requests_html will send a "page down" request in Chromium.

According to the docs:

scrolldown – Integer, if provided, of how many times to page down.

However, the requests_html uses the pyppeteer library which sends a page down as a key press. This means that if you are on a page that blocks the page down keys or simply doesn't infinite scroll using only the key presses, you will need a different solution.

Alternative solution (in Javascript)

Documentation: requests_html (archived)