
I'm using Chrome Puppeteer to get at some content on a Web page. This content is a list of items in a pseudo-table. I'm using XPath to get this content.

When I tested the XPath expression (in Chrome, with the XPath Helper extension), it displayed the list of text, so I know the XPath expression is fine.

However, I'm having issues trying to do this with Puppeteer. Below is the relevant code [I omitted the opening and closing puppeteer code]:

var xpath_expr_str = "//div[contains(@class,'listings')]/div[4]/p/a";
var page_url_str = 'https://my-url';

await page.goto(page_url_str);
await page.waitForXPath(xpath_expr_str);

var xpath_payload_arr = await page.$x(xpath_expr_str);
var xpath_val_arr = await page.evaluate(function(payload_arr){
    var url_list_arr = [];
    for(var i = 0; i < payload_arr.length; i++)
    {
        url_list_arr.push(payload_arr[i].textContent);
    }
    return url_list_arr;
}, xpath_payload_arr);

console.log(xpath_val_arr);

When I run this, I get the following error:

UnhandledPromiseRejectionWarning: TypeError: Converting circular structure to JSON

I can't seem to get at the list. However, if I try to get just a single item in the list, it works fine. For example, the following code works:

var xpath_val_str = await page.evaluate(function(payload_arr){
    return payload_arr.textContent;
}, xpath_payload_arr[0]);
console.log(xpath_val_str);

What's the proper way to manage XPath lists when working with Puppeteer?

ObiHill

1 Answer


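Unfortunately, you cannot pass xpath_payload_arr into page.evaluate. Each argument to page.evaluate must be either a serializable value or a handle passed directly as an argument. An array of ElementHandles is neither, so Puppeteer tries to serialize it and fails, because the handle objects contain circular references. This is also why passing the single handle xpath_payload_arr[0] works. See also: more on the "Converting circular structure to JSON" error.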
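The error message itself can be reproduced without Puppeteer. The snippet below is only a minimal sketch: fakeHandle is a hypothetical stand-in for an object that refers back to itself (the real ElementHandle internals differ), and trySerialize is an illustrative helper, not part of any library. It shows that JSON serialization fails for an array containing such an object, while a plain array of strings serializes fine.

```javascript
// Hypothetical stand-in for a complex object that references itself.
// Real ElementHandle internals are different; this only mimics the cycle.
const fakeHandle = { name: 'handle' };
fakeHandle.self = fakeHandle;

// Illustrative helper: returns the JSON string, or the error text on failure.
function trySerialize(value) {
  try {
    return JSON.stringify(value);
  } catch (err) {
    return err.name + ': ' + err.message;
  }
}

// An array that contains a circular object cannot be serialized.
const arrayResult = trySerialize([fakeHandle]);
// A plain array of strings can.
const plainResult = trySerialize(['a', 'b']);

console.log(arrayResult);
console.log(plainResult);
```

This is why returning an array of strings from page.evaluate works, whereas sending an array of handles into it does not.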

However, we can iterate over it in the Node.js context and call page.evaluate on the items one by one:

var xpath_expr_str = '//*[@id="questions"]/div/div/h3/a';
var page_url_str = 'https://stackoverflow.com/questions/tagged/puppeteer';

await page.goto(page_url_str);
await page.waitForXPath(xpath_expr_str);

var xpath_payload_arr = await page.$x(xpath_expr_str);

var url_list_arr = [];
for(var i = 0; i < xpath_payload_arr.length; i++)
{
    url_list_arr.push(await page.evaluate(el => el.textContent, xpath_payload_arr[i]));
}

console.log(url_list_arr);

This produces the expected result:

[Image: XPath evaluation result]
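The same per-item approach can also be wrapped in a small helper that issues the page.evaluate calls concurrently with Promise.all instead of awaiting them one at a time. This is a sketch rather than the only way to do it: textContentOfAll is a hypothetical name, and it assumes page is a Puppeteer Page and handles is the array returned by page.$x.

```javascript
// Hypothetical helper (not part of Puppeteer).
// Assumes `page` is a Puppeteer Page and `handles` is an array of
// ElementHandles, e.g. the result of page.$x(xpath_expr_str).
async function textContentOfAll(page, handles) {
  // Each handle is passed to page.evaluate on its own, as a direct argument,
  // so Puppeteer can resolve it to the DOM element inside the page.
  return Promise.all(
    handles.map(handle => page.evaluate(el => el.textContent, handle))
  );
}

// Possible usage inside the script above:
//   var url_list_arr = await textContentOfAll(page, xpath_payload_arr);
//   console.log(url_list_arr);
```

Promise.all keeps the results in the same order as the input handles, so the output should match the loop version.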

Vaviloff