I'm using Puppeteer to check for dead links on a site. A dead link's page has an h1 with "Page not found" as its content. My understanding is that page.evaluate gives access to the DOM, but when I try to use it here I get undefined. I've tried a few other ways of accessing this data (page.$, page.$eval), but so far nothing has worked.

When I enter document.querySelector('h1', el => el.textContent) in my dev tools it works correctly. I'm also setting the userAgent, so I'm fairly sure the site doesn't think I'm a bot.

const puppeteer = require('puppeteer'); 

const prepareForTests = async (page) => {
    // Note the space before "AppleWebKit": without it the two halves run together.
    const userAgent = 'Mozilla/5.0 (X11; Linux x86_64) ' +
        'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.39 Safari/537.36';
    await page.setUserAgent(userAgent);
}

(async() => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await prepareForTests(page);

    const link = await page.goto("https://www.example.com");

    console.log(await page.evaluate(() => {
        document.querySelector('h1', el => el.textContent);
    }));

    await browser.close();
    process.exit();
})();

lightfoot34

1 Answer

It seems this fragment is the issue:

console.log(await page.evaluate(() => {
  document.querySelector('h1', el => el.textContent);
}));
  1. The arrow function has a block body with no return statement, so page.evaluate() resolves to undefined.
  2. document.querySelector() accepts only one argument, so the second argument (the callback) is ignored.

Try this:

console.log(await page.evaluate(() => {
  return document.querySelector('h1').textContent;
}));
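
For the dead-link check itself, the same fix could be wrapped in a small helper. This is only a sketch under a few assumptions: isDeadLink is a hypothetical name, not part of Puppeteer; the heading is trimmed before comparison, on the guess that it may carry surrounding whitespace; and a page with no h1 at all is treated as not dead. The null check also avoids the TypeError that document.querySelector('h1').textContent throws when no h1 exists.

```javascript
// Hypothetical helper (not a Puppeteer API): resolves to true when the
// page's first <h1> reads "Page not found".
// `page` is assumed to be a Puppeteer Page, or anything with a compatible evaluate().
async function isDeadLink(page) {
  const heading = await page.evaluate(() => {
    const h1 = document.querySelector('h1');
    // Explicit return: without it, evaluate() resolves to undefined.
    // A page with no <h1> yields null instead of throwing.
    return h1 ? h1.textContent.trim() : null;
  });
  return heading === 'Page not found';
}
```

Inside the async block from the question, it would be called after page.goto() as console.log(await isDeadLink(page)).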
vsemozhebuty