In the following code sample, the log statements inside the function passed as an argument to page.evaluate() are not printed to the Node console (the terminal). The log statement outside this function (at the end) prints as expected.

Another programmer suggested that maybe the page.evaluate context is the headless browser environment and not Node.js.

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.setExtraHTTPHeaders({Referer: 'https://sparktoro.com/'});

    await page.goto('https://sparktoro.com/trending');
    await page.waitForSelector('div.title > a');

    const stories = await page.evaluate(() => {
        const selection = document.querySelectorAll('div.title > a');
        console.log('selection', selection);
        const links_array = Array.from(selection);
        console.log('links_array', links_array);
        const hrefs = links_array.map(anchor => anchor.href);
        console.log('hrefs', hrefs);
        return hrefs;
    });

    console.log(stories);
    await browser.close();
})();

Is there any way to force all console.log statements to use the Node environment as their context, or is my only option to enable the browser head and read the statements from the browser console?

T.J. Crowder
Sean D

1 Answer

Another programmer suggested that maybe the page.evaluate context is the headless browser environment and not Node.js.

Right, this is one of the keys to Puppeteer: code within the evaluate callback is evaluated in the browser, not in your Node process (even though that's not at all obvious from looking at the source of code that uses Puppeteer).

You can respond to the console event of the Page object, which is raised when any console method (log, error, etc.) is called in the client-side code. From the Puppeteer documentation:

page.on('console', msg => {
  for (let i = 0; i < msg.args().length; ++i)
    console.log(`${i}: ${msg.args()[i]}`);
});

The console event receives a ConsoleMessage object, which tells you what type of call it was (log, error, etc.), what the arguments were (args()), etc.
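As a rough sketch of how that could be wired into the question's script: the helper names `formatConsoleMessage` and `forwardConsole` below are made up for illustration, and only `page.on('console', ...)`, `msg.type()`, and `msg.text()` are taken from Puppeteer's ConsoleMessage API.

```javascript
// Hypothetical helper: turn a ConsoleMessage into one line for the terminal.
// msg.type() is the console method that was called ('log', 'error', ...);
// msg.text() is the message text.
function formatConsoleMessage(msg) {
  return `[browser ${msg.type()}] ${msg.text()}`;
}

// Hypothetical helper: forward every browser console call to a Node-side
// logger. The logger defaults to console.log and can be swapped out.
function forwardConsole(page, log = console.log) {
  page.on('console', msg => log(formatConsoleMessage(msg)));
}
```

Calling `forwardConsole(page)` right after `browser.newPage()`, before `page.goto(...)`, should make the three log calls inside `page.evaluate()` show up in the terminal. Objects such as the NodeList may appear only as a short placeholder rather than with their full contents, since `msg.text()` is a plain string.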

T.J. Crowder
  • One thing to keep in mind: since the arguments need to be transferred from the browser to Node over the DevTools Protocol connection, Puppeteer needs to serialize them, and the last time I checked it failed to serialize recursive objects such as `window`. It's best to have a custom implementation of `console.log` in the browser, similar to `util.inspect`, that stringifies all arguments and then calls the native `console.log`. – Alexey Lebedev Aug 18 '18 at 11:08
  • @AlexeyLebedev - Ooh, fun fun. Thanks for flagging that up. – T.J. Crowder Aug 18 '18 at 11:09
  • If you're interested in getting JSON data for logged objects, see [this thread](https://stackoverflow.com/questions/58089425/how-do-print-the-console-output-of-the-page-in-puppeter-as-it-would-appear-in-th) – ggorlen Nov 24 '20 at 21:16
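The stringify-first idea from the first comment could look roughly like the sketch below. `safeStringify` and `stringifyArgs` are hypothetical names, not Puppeteer APIs, and the approach (a `JSON.stringify` replacer that tracks visited objects) is just one simple way to avoid failing on circular structures; it is much less thorough than `util.inspect`.

```javascript
// Hypothetical helper: JSON-encode a value without throwing on cycles.
// Any object seen a second time is replaced with '[Circular]'. Note that
// this also flags repeated references that are not true cycles.
function safeStringify(value) {
  if (typeof value === 'string') return value;
  const seen = new WeakSet();
  try {
    const json = JSON.stringify(value, (key, val) => {
      if (typeof val === 'object' && val !== null) {
        if (seen.has(val)) return '[Circular]';
        seen.add(val);
      }
      return val;
    });
    // JSON.stringify returns undefined for undefined and for functions.
    return json === undefined ? String(value) : json;
  } catch (err) {
    // For example BigInt values, which JSON.stringify rejects.
    return String(value);
  }
}

// Hypothetical helper: build one printable line from console.log-style args.
function stringifyArgs(args) {
  return args.map(safeStringify).join(' ');
}
```

Because the evaluate callback runs in the browser, these functions would have to be defined inside that callback (or otherwise injected into the page) before use, for example `console.log(stringifyArgs(['hrefs', hrefs]))`. The message then reaches the `console` event as a single plain string.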