
I'm trying to scroll an auto-loading page, and while doing so I want to fetch the elements that appear (and disappear).

My code looks like this. The scrolling works great, but I can't get my Puppeteer code to detect the elements and save their values (the code does work outside the scroll functions).

async function autoScroll(page) {
    await page.evaluate(async () => {
        await new Promise((resolve, reject) => {
            let totalHeight = 0;
            let distance = 100;
            let timer = setInterval(async () => {
                let scrollHeight = document.body.scrollHeight;
                window.scrollBy(0, distance);
                totalHeight += distance;
                console.log("scrolling"); // That one never shows up 
                await getUsers(); // Trying to fetch elements on every scroll
                if (totalHeight >= scrollHeight) {
                    clearInterval(timer);
                    resolve();
                }
            }, 100);
        });
    });
}


async function getUsers() {
    let hrefs = await page.$$('div > a');
    for (let i = 0; i < hrefs.length; i++) {
        // add each link to the database
    }
}

What I want to achieve is that every time I scroll to the bottom of the page, the getUsers function fetches all the links in a specific div and adds them to the DB if they don't exist yet. But calling the function from setInterval doesn't seem to work.

How can I include my Puppeteer async function while scrolling through the page?

user23634623

1 Answer


the code does work outside the scroll functions

The getUsers function is defined in the main Node.js script, but autoScroll calls it inside page.evaluate. Code inside page.evaluate runs in the browser context (as if you had typed it into the DevTools console), where no getUsers function exists.

Since getUsers works with a database, it can only run on the Node.js side, not inside page.evaluate, so you should rewrite the scraping code.

I'd suggest first collecting the user data inside page.evaluate and, only once the page no longer scrolls, returning the data to the main context and then saving it to the database.
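A minimal sketch of that approach, under some assumptions: `page` is a Puppeteer Page object, the `div > a` selector is taken from the question, and `scrollAndCollectLinks`, `collectAndSave` and `saveLink` are hypothetical names (`saveLink` stands in for whatever database call you actually use). The links are gathered in the browser on every scroll step, so elements that later disappear are still captured, and only plain strings are returned to Node.js.

```javascript
// Runs in Node.js. `page` is assumed to be a Puppeteer Page object.
async function scrollAndCollectLinks(page, selector = 'div > a') {
    // Everything inside page.evaluate runs in the browser, so it may only
    // use DOM APIs and the serializable arguments passed after the callback.
    return page.evaluate((selector) => {
        return new Promise((resolve) => {
            const seen = new Set(); // de-duplicates links across scroll steps
            const distance = 100;   // pixels per scroll step
            let totalHeight = 0;

            const timer = setInterval(() => {
                // Collect the links currently in the DOM before they disappear.
                for (const a of document.querySelectorAll(selector)) {
                    seen.add(a.href);
                }

                window.scrollBy(0, distance);
                totalHeight += distance;

                if (totalHeight >= document.body.scrollHeight) {
                    clearInterval(timer);
                    // Only serializable values can cross back to Node.js.
                    resolve([...seen]);
                }
            }, 100);
        });
    }, selector);
}

// Hypothetical Node-side step: `saveLink` stands in for the real database call.
async function collectAndSave(page, saveLink) {
    const links = await scrollAndCollectLinks(page);
    for (const link of links) {
        await saveLink(link);
    }
    return links.length;
}
```

The "add only if it doesn't exist yet" check from the question would then live inside `saveLink`, on the Node.js side, where the database is reachable.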


console.log from page.evaluate is not shown because you need to explicitly subscribe to the page's console events to see those messages.
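For example, something along these lines should forward browser-side messages to the Node.js terminal. It relies on Puppeteer's `page.on('console', ...)` event and the `text()` method of the message object; the helper name `forwardBrowserConsole` and the `[browser]` prefix are just illustrative choices.

```javascript
// `page` is assumed to be a Puppeteer Page; call this before page.evaluate.
function forwardBrowserConsole(page, log = console.log) {
    page.on('console', (msg) => {
        // msg is a ConsoleMessage; text() returns the logged message as a string.
        log(`[browser] ${msg.text()}`);
    });
}
```

With this registered before autoScroll(page) runs, the "scrolling" message from the question should show up in the terminal.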

Vaviloff
  • I've tried declaring it like this: async function autoScroll(page) { await page.evaluate(async () => { async function getUsers() { // some code } await new Promise((resolve, reject) => { /* the promise code + call to getUsers() */ } } but it doesn't seem to work :/ How do I do that? – user23634623 Aug 12 '19 at 19:29
  • It depends on what getUsers does. Could you add sample code for it to the question? – Vaviloff Aug 12 '19 at 20:27
  • Okay, changed the answer. Please note that this question isn't about how to change the code, it was about why it didn't work in the first place. But if you want you can always open another one. – Vaviloff Aug 13 '19 at 07:13
  • You are correct! I've figured it out thanks to you :) – user23634623 Aug 13 '19 at 09:06