I'm trying to scrape a website that runs JavaScript which loads additional content as the user scrolls down. I use this function to get the DOM:
library(crrri)
dump_DOM <- function(url) {
  perform_with_chrome(function(client) {
    Network <- client$Network
    Page <- client$Page
    Runtime <- client$Runtime
    Network$enable() %...>% {
      Page$enable()
    } %...>% {
      Network$setCacheDisabled(cacheDisabled = TRUE)
    } %...>% {
      Page$navigate(url = url)
    } %...>% {
      Page$loadEventFired()
    } %...>% {
      Runtime$evaluate(
        expression = 'document.documentElement.outerHTML'
      )
    } %...>% (function(result) {
      html <- result$result$value
      return(html)
    })
  },
  extra_args = '--no-sandbox')
}
website <- dump_DOM(url)
I couldn't find a way to scroll the page in headless Chrome, so I tried changing the window size instead, to no avail, by adding these lines inside the function:
Emulation <- client$Emulation
Network$enable() %...>% {
  Page$enable()
} %...>% {
  Emulation$setDeviceMetricsOverride(
    width = 1080,
    height = 10000,
    deviceScaleFactor = 0,
    mobile = FALSE,
    dontSetVisibleSize = FALSE
  )
} %...>% {
  ....
So the question is: how do I scroll the page down to the bottom? Alternatively, how do I make the 'window size' large enough that the full page loads without needing to scroll?
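For concreteness, here is a rough, untested sketch of the kind of scrolling I have in mind. It assumes the page appends content whenever the window reaches the bottom, and that `Runtime$evaluate` can wait for a JavaScript promise via the DevTools Protocol's `awaitPromise` option. The names `scroll_js`, `dump_scrolled_DOM`, `pause_ms` and `max_rounds` are made up for illustration and are not part of crrri.

```r
library(crrri)

# Hypothetical helper: builds a JavaScript snippet that scrolls to the bottom,
# pauses, and repeats until the page height stops growing (or max_rounds is hit).
# The snippet evaluates to a promise that resolves to the final HTML.
scroll_js <- function(pause_ms = 500, max_rounds = 20) {
  sprintf("
    (async () => {
      let last = 0;
      for (let i = 0; i < %d; i++) {
        window.scrollTo(0, document.documentElement.scrollHeight);
        await new Promise(resolve => setTimeout(resolve, %d));
        const height = document.documentElement.scrollHeight;
        if (height === last) break;  // nothing new was loaded
        last = height;
      }
      return document.documentElement.outerHTML;
    })()", as.integer(max_rounds), as.integer(pause_ms))
}

# Hypothetical variant of dump_DOM() that scrolls before reading the DOM.
dump_scrolled_DOM <- function(url, pause_ms = 500, max_rounds = 20) {
  perform_with_chrome(function(client) {
    Page <- client$Page
    Runtime <- client$Runtime
    Page$enable() %...>% {
      Page$navigate(url = url)
    } %...>% {
      Page$loadEventFired()
    } %...>% {
      # awaitPromise = TRUE asks Chrome to wait for the async function to finish
      Runtime$evaluate(
        expression = scroll_js(pause_ms, max_rounds),
        awaitPromise = TRUE
      )
    } %...>% (function(result) {
      result$result$value
    })
  },
  extra_args = '--no-sandbox')
}
```

With the defaults, the loop runs for at most about 10 seconds (20 rounds of 500 ms), so a slow page may need a larger `pause_ms` and a page with a lot of content may need more rounds.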