36

I am trying out Puppeteer. Here is some sample code that you can run at https://try-puppeteer.appspot.com/

The problem is that this code returns an array of empty objects:

[{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{},{}]

Am I making a mistake?

const browser = await puppeteer.launch();

const page = await browser.newPage();
await page.goto('https://reddit.com/');

let list = await page.evaluate(() => {
  return Promise.resolve(Array.from(document.querySelectorAll('.title')));
});

console.log(JSON.stringify(list))

await browser.close();
Grant Miller
Abdullah Alsigar
  • `Promise.resolve` isn't doing anything here, in addition to the DOM nodes not being JSON serializable. – ggorlen Mar 15 '22 at 15:05

3 Answers

55

The values returned from the evaluate function must be JSON serializable. https://github.com/GoogleChrome/puppeteer/issues/303#issuecomment-322919968

The solution is to extract the href values from the elements and return them.

const sel = '.title';
const links = await page.evaluate((sel) => {
  // Runs in the browser; strings are serializable, DOM elements are not.
  const elements = Array.from(document.querySelectorAll(sel));
  return elements.map((element) => element.href);
}, sel);
Abdullah Alsigar
  • The docs are unclear to me because their link to `Serializable` goes to the `JSON.stringify` definition, which clearly states objects as serializable (and they obviously are). Nevertheless, a simple `await page.evaluate(_ => { a: 1 })` will return `undefined` – gilad905 Sep 09 '20 at 15:06
  • Not sure if you mistyped. But if you're trying to return that object using the shorthand notation, you need to wrap the return object; `await page.evaluate(_ => ({ a: 1 }))`. Could definitely be the cause for getting undefined. – PigBoT Jan 08 '21 at 14:07
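
The difference described in the two comments above is plain JavaScript arrow-function syntax, so it can be checked without Puppeteer. In this small sketch, `withBlock` and `withParens` are illustrative names only:

```javascript
// Braces right after => start a function body; `a: 1` is then a labeled
// statement, and nothing is returned.
const withBlock = (_) => { a: 1 };

// Wrapping the braces in parentheses makes them an object literal expression.
const withParens = (_) => ({ a: 1 });

console.log(withBlock());  // undefined
console.log(withParens()); // { a: 1 }
```
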
18

Problem:

The return value for page.evaluate() must be serializable.

The Puppeteer documentation says:

If the function passed to the page.evaluate returns a non-Serializable value, then page.evaluate resolves to undefined. DevTools Protocol also supports transferring some additional values that are not serializable by JSON: -0, NaN, Infinity, -Infinity, and bigint literals.

In other words, you cannot return an element from the page DOM environment back to the Node.js environment because they are separate.
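
A rough analogy in plain Node.js may help show why each element came back as `{}`. This sketch assumes the transfer behaves like a JSON round trip, and `FakeElement` is a hypothetical stand-in for a DOM node (whose properties, such as `href`, are prototype accessors and not own data properties). It is not Puppeteer's actual serialization code.

```javascript
// Hypothetical stand-in for a DOM element: like real DOM nodes, it exposes
// `href` through a prototype getter, not an own enumerable property.
class FakeElement {
  #href;
  constructor(href) {
    this.#href = href;
  }
  get href() {
    return this.#href;
  }
}

const nodes = [
  new FakeElement('https://example.com/a'),
  new FakeElement('https://example.com/b'),
];

// JSON only copies own enumerable properties, so every node collapses to {}.
const asObjects = JSON.stringify(nodes);

// Mapping to plain strings first keeps the data intact.
const asStrings = JSON.stringify(nodes.map((node) => node.href));

console.log(asObjects); // [{},{}]
console.log(asStrings); // ["https://example.com/a","https://example.com/b"]
```
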

Solution:

You can return an ElementHandle, which is a representation of an in-page DOM element, back to the Node.js environment.

Use page.$$() to obtain an ElementHandle array:

let list = await page.$$('.title');

Otherwise, if you want to extract the href values from the elements and return them, you can use page.$$eval(). Its callback receives the whole array of matched elements, so map over it:

let list = await page.$$eval('.title', els => els.map(el => el.href));
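
If you keep the ElementHandle array from `page.$$()`, each value still has to be read inside the page. The following is a minimal sketch, assuming `page` is a Puppeteer `Page` that has already navigated; `hrefsFromHandles` is a hypothetical helper name, not part of Puppeteer.

```javascript
// Hypothetical helper: read a serializable value from each ElementHandle.
async function hrefsFromHandles(page, selector) {
  const handles = await page.$$(selector);
  const hrefs = [];
  for (const handle of handles) {
    // The callback runs in the browser; only the returned string crosses over.
    hrefs.push(await handle.evaluate((el) => el.href));
    // Release the handle once it is no longer needed.
    await handle.dispose();
  }
  return hrefs;
}
```

It could then be called as `const list = await hrefsFromHandles(page, '.title');`.
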
Grant Miller
  • `let list = await page.$$eval('.title', a => a.href);` is incorrect. It would need to be `const list = await page.$$eval('.title', a => a.map(e => e.href));` – ggorlen Jan 03 '23 at 19:14
14

I faced a similar problem and solved it like this:

await page.evaluate(() =>
  Array.from(document.querySelectorAll('.title'), e => e.href)
);
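
The second argument to `Array.from` is a mapping function, so the conversion and the extraction happen in one step. Here is a small sketch in plain JavaScript, using a hypothetical array-like object in place of the `NodeList` that `document.querySelectorAll` returns:

```javascript
// Hypothetical array-like stand-in for a NodeList (indexed entries plus length).
const fakeNodeList = {
  0: { href: 'https://example.com/a' },
  1: { href: 'https://example.com/b' },
  length: 2,
};

// One call: convert to an array and map each entry to its href.
const direct = Array.from(fakeNodeList, (e) => e.href);

// Equivalent two-step form.
const twoStep = Array.from(fakeNodeList).map((e) => e.href);

console.log(direct);
```
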
Can Ali
  • TIL `Array.from` takes a callback map function – phillyslick Jan 19 '20 at 00:33
  • Consider using `page.$$eval(".title", els => els.map(el => el.href))`. `$$eval` is provided as a convenience to avoid the commonplace pattern of having to run `document.querySelectorAll()` as the first line of a browser function. – ggorlen Jan 03 '23 at 19:13