I managed to load all the posts on a site that has a load_more button for going to the next page, but something is missing.

I get this error:

e Error: Node is either not visible or not an HTMLElement
    at ElementHandle._clickablePoint (/Users/minghann/Documents/productnation_scraper/node_modules/puppeteer/lib/ExecutionContext.js:331:13)
    at <anonymous>
    at process._tickCallback (internal/process/next_tick.js:188:7)

This doesn't happen if I don't load all the posts. It's hard to debug because I don't know which post is missing what. The full code is below:

const browser = await puppeteer.launch({
  devtools: true
});
const page = await browser.newPage();

await page.goto("https://example.net");

await page.waitForSelector(".load_more_btn");

const load_more_exist = !!(await page.$(".load_more_btn"));

while (load_more_exist > 0) {
  await page.click(".load_more_btn");
}

const posts = await page.$$(".post");

let result = [];
for (const post of posts) {
  result = [
    ...result,
    {
      title: await post.$eval(".post_title a", e => e.innerText)
    }
  ];
}

console.log(result);

browser.close();
Md. Abu Taher
Thian Kian Phin

1 Answer

There are several ways to do this, and the best is to combine the following two.

Look for Ajax

Wait for the response instead. Whenever you click Load More, the page makes a simple Ajax request to ?ajax-request=jnews. We can use .waitForRequest or .waitForResponse for this use case. Since only a response has a status code, the example below uses .waitForResponse:

await Promise.all([
  // Start waiting before the click so the response is not missed
  page.waitForResponse(response => response.url().includes('?ajax-request=jnews') && response.status() === 200),
  page.click(".load_more_btn")
]);
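To load every page, the click-and-wait pair can be repeated in a loop that re-checks the button on each pass. The following is only a sketch: it assumes the button is removed or hidden once there are no more posts (which would explain the "Node is either not visible" error), and the function name loadAllPosts and the maxClicks safety cap are hypothetical.

```javascript
// Sketch: keep clicking "load more" until the button is gone or hidden.
// `page` is a Puppeteer Page. The selector and URL fragment are taken from
// the question and answer; `maxClicks` is a hypothetical safety cap.
async function loadAllPosts(page, maxClicks = 50) {
  let clicks = 0;
  while (clicks < maxClicks) {
    // Re-query on every pass: the button may have been removed meanwhile.
    const button = await page.$(".load_more_btn");
    if (!button) break;

    // boundingBox() resolves to null when the element is not visible,
    // which is the situation where click() would throw.
    const box = await button.boundingBox();
    if (!box) break;

    await Promise.all([
      // Start waiting before clicking so the response is not missed.
      page.waitForResponse(
        res => res.url().includes("?ajax-request=jnews") && res.status() === 200
      ),
      button.click()
    ]);
    clicks += 1;
  }
  return clicks;
}
```

Calling await loadAllPosts(page) in place of the original while loop would then return the number of extra pages that were loaded.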

Clean DOM and wait for new Element

Refer to these answers here and here.

Basically, you can remove the DOM elements you have already collected, so the next time you collect data there won't be any duplicates.

So, once you remove all current elements, such as those matched by document.querySelectorAll('.jeg_post'), you can simply call page.waitForSelector('.jeg_post') afterwards to wait for the new ones.
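As a rough illustration of this collect-then-remove idea, the helper below reads the title of every post currently in the DOM and removes each post in the same pass. It is a sketch under assumptions: collectAndClear is a hypothetical name, the '.post_title a' selector is borrowed from the question, and a post without a title link is reported as null rather than throwing, which also helps spot the post that is "missing something".

```javascript
// Sketch: collect titles of the posts currently in the DOM, then remove them
// so the next batch contains no duplicates. `page` is a Puppeteer Page.
// The default selector comes from the answer; adjust it to the real markup.
async function collectAndClear(page, postSelector = ".jeg_post") {
  // $$eval runs the callback inside the browser on all matching elements.
  return page.$$eval(postSelector, posts =>
    posts.map(post => {
      const link = post.querySelector(".post_title a");
      post.remove(); // drop the element once its data has been read
      // A post without a title link yields null instead of an error.
      return { title: link ? link.innerText : null };
    })
  );
}
```

After each Load More click, waiting with page.waitForSelector(postSelector) and calling this helper again would yield only the newly loaded posts. Whether the site still paginates correctly once old posts are removed is something to verify on the real page.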

Md. Abu Taher