I'm using Puppeteer to check for dead links on a site. These links will have an h1 with "Page not found" as the content. My understanding is that page.evaluate gives access to the DOM, but when I try to use it here I get undefined. I've tried a few different ways of accessing this data (.$, .$eval) but so far nothing has worked.

When I enter document.querySelector('h1', el => el.textContent) in my dev tools it works correctly. I'm also setting the userAgent, so I'm fairly sure the site doesn't think I'm a bot.

    const puppeteer = require('puppeteer');

    const prepareForTests = async (page) => {
        const userAgent = 'Mozilla/5.0 (X11; Linux x86_64)' +
            'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.39 Safari/537.36';
        await page.setUserAgent(userAgent);
    }

    (async () => {
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        await prepareForTests(page);
        const link = await page.goto("https://www.example.com");
        console.log(await page.evaluate(() => {
            document.querySelector('h1', el => el.textContent);
        }));
        await browser.close();
        process.exit();
    })();
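
For comparison, here is a minimal sketch of what I would expect to return the heading text. It rests on two observations: an arrow function with a block body returns undefined unless it has an explicit return, and document.querySelector only takes a selector, so the second argument in my snippet is ignored. The names getHeadingText, isDeadLink and checkLink are hypothetical helpers made up for illustration, and the "Page not found" check assumes the heading text described above.

```javascript
// Runs inside the page context: returns the first h1's text, or null if there is none.
const getHeadingText = () => {
  const h1 = document.querySelector('h1'); // querySelector accepts only a selector
  return h1 ? h1.textContent.trim() : null; // explicit return, so evaluate() resolves to this value
};

// Hypothetical helper: decides whether the heading text marks a dead link.
const isDeadLink = (headingText) =>
  typeof headingText === 'string' && headingText.includes('Page not found');

// Hypothetical helper: `page` is assumed to be a Puppeteer Page created as in the code above.
const checkLink = async (page, url) => {
  await page.goto(url);
  const headingText = await page.evaluate(getHeadingText);
  return isDeadLink(headingText);
};
```

Inside the existing async wrapper this could be called as console.log(await checkLink(page, "https://www.example.com")). page.$eval('h1', el => el.textContent) should be a shorter alternative, but it throws when no h1 matches, which is why the sketch checks for null instead.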