The difference between page.$$eval (and other evaluate-style methods, with the exception of evaluateHandle) and page.$$ is that the evaluate family only works with serializable values. As you discovered, you can't return elements from these methods because they're not serializable (they have circular references and would be useless in Node anyway).
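On the other hand, page.$$ returns Puppeteer ElementHandles, which are references to DOM elements that can be manipulated from Puppeteer's API in Node rather than in the browser. This is useful for many reasons, one of which is that ElementHandle.click() issues a totally different set of operations than running the native HTMLElement.click() in the browser: Puppeteer scrolls the element into view and performs a trusted mouse click at its center, whereas the native call dispatches an untrusted click event directly on the element.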
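To make the serialization point concrete, here is a minimal plain-Node sketch. It is only an analogy: the `parent` and `child` objects are hypothetical stand-ins for DOM nodes, and `JSON.stringify` stands in for whatever Puppeteer does internally to ship values between the browser and Node. It shows why the usual pattern is to extract primitives (strings, numbers, plain objects) inside the evaluate callback rather than returning the element itself.

```javascript
// Hypothetical stand-ins for DOM nodes: parent and child reference each other,
// much like parentNode/children on real elements.
const parent = {tagName: "DIV", children: []};
const child = {tagName: "SPAN", textContent: "hello", parentNode: parent};
parent.children.push(child);

// Try to serialize a value, reporting the error name on failure.
function trySerialize(value) {
  try {
    return JSON.stringify(value);
  } catch (err) {
    return `failed: ${err.name}`;
  }
}

// Serializing the node-like object itself fails on the circular reference...
const wholeNode = trySerialize(child);

// ...but plain data pulled out of it serializes without trouble.
const justData = trySerialize({tag: child.tagName, text: child.textContent});

console.log(wholeNode); // failed: TypeError
console.log(justData);  // {"tag":"SPAN","text":"hello"}
```

In Puppeteer terms, the second case corresponds to something like returning el.textContent from a $$eval callback, while the first corresponds to trying to return el.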
From the comments:
An example of what I'm trying to get is: <div class = "class">This is the innerHTML text I want. </div>. On the page, it's text inside a clickable portion of the website. What i want to do is loop through the available options, then click on the ones that match an innerHTML I'm looking for.
Here's a simple example you should be able to extrapolate to your actual use case:
const puppeteer = require("puppeteer"); // ^19.1.0
const {setTimeout} = require("timers/promises");

const html = `
<div>
  <div class="class">This is the innerHTML text I want.</div>
  <div class="class">This is the innerHTML text I don't want.</div>
  <div class="class">This is the innerHTML text I want.</div>
</div>
<script>
document.querySelectorAll(".class").forEach(e => {
  e.addEventListener("click", () => e.textContent = "clicked");
});
</script>
`;
const target = "This is the innerHTML text I want.";

let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setContent(html);

  ///////////////////////////////////////////
  // approach 1 -- trusted Puppeteer click //
  ///////////////////////////////////////////
  const handles = await page.$$(".class");

  for (const handle of handles) {
    if (target === (await handle.evaluate(el => el.textContent))) {
      await handle.click();
    }
  }

  // show that it worked and reset
  console.log(await page.$eval("div", el => el.innerHTML));
  await page.setContent(html);

  //////////////////////////////////////////////
  // approach 2 -- untrusted native DOM click //
  //////////////////////////////////////////////
  await page.$$eval(".class", (els, target) => {
    els.forEach(el => {
      if (target === el.textContent) {
        el.click();
      }
    });
  }, target);

  // show that it worked and reset
  console.log(await page.$eval("div", el => el.innerHTML));
  await page.setContent(html);

  /////////////////////////////////////////////////////////////////
  // approach 3 -- selecting with XPath and using trusted clicks //
  /////////////////////////////////////////////////////////////////
  const xp = '//*[@class="class"][text()="This is the innerHTML text I want."]';

  for (const handle of await page.$x(xp)) {
    await handle.click();
  }

  // show that it worked and reset
  console.log(await page.$eval("div", el => el.innerHTML));
  await page.setContent(html);

  ///////////////////////////////////////////////////////////////////
  // approach 4 -- selecting with XPath and using untrusted clicks //
  ///////////////////////////////////////////////////////////////////
  await page.evaluate(xp => {
    // https://stackoverflow.com/a/68216786/6243352
    const $x = xp => {
      const snapshot = document.evaluate(
        xp, document, null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null
      );
      return [...Array(snapshot.snapshotLength)]
        .map((_, i) => snapshot.snapshotItem(i))
      ;
    };
    $x(xp).forEach(e => e.click());
  }, xp);

  // show that it worked
  console.log(await page.$eval("div", el => el.innerHTML));
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());
Output in all cases is:
<div class="class">clicked</div>
<div class="class">This is the innerHTML text I don't want.</div>
<div class="class">clicked</div>
Note that === might be too strict without calling .trim() on the textContent first. You may want an .includes() substring test instead, although the risk there is that it's too permissive. Or a regex may be the right tool. In short, use whatever makes sense for your use case rather than (necessarily) my === test.
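The trade-offs between these tests can be sketched in plain Node without a browser. The `makeMatcher` helper, its mode names, and the sample strings below are hypothetical, made up for illustration; swap in whichever predicate fits your page.

```javascript
// Hypothetical helper: build a predicate for one of the matching strategies.
function makeMatcher(target, mode = "exact") {
  switch (mode) {
    case "exact": // strict comparison, whitespace included
      return text => text === target;
    case "trimmed": // ignore leading/trailing whitespace on both sides
      return text => text.trim() === target.trim();
    case "includes": // substring test; can be too permissive
      return text => text.includes(target);
    case "regex": // target is a RegExp (avoid the g flag, which makes test() stateful)
      return text => target.test(text);
    default:
      throw new Error(`unknown mode: ${mode}`);
  }
}

// Sample textContent values, one with stray whitespace around it.
const samples = [
  "This is the innerHTML text I want.",
  "  This is the innerHTML text I want.\n",
  "This is the innerHTML text I don't want.",
];
const wanted = "This is the innerHTML text I want.";

const results = {
  exact: samples.map(makeMatcher(wanted, "exact")),
  trimmed: samples.map(makeMatcher(wanted, "trimmed")),
  includes: samples.map(makeMatcher("innerHTML text", "includes")),
  regex: samples.map(
    makeMatcher(/^\s*This is the innerHTML text I want\.\s*$/, "regex")
  ),
};
console.log(results);
```

Here the exact test misses the padded sample, the trimmed and regex tests accept the first two samples and reject the third, and the "innerHTML text" substring test accepts all three, including the one that should be skipped. With approach 1, a predicate like this would run in Node against the result of handle.evaluate(el => el.textContent); functions can't be passed as arguments into the browser, so for approach 2 the comparison has to be written inside the $$eval callback.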
With respect to the XPath approach, this answer shows a few options for dealing with whitespace and substrings.
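As a rough sketch of what whitespace-tolerant and substring XPath selectors can look like, the helpers below build expressions using the standard XPath 1.0 functions normalize-space() and contains(). Whether these are the same options the linked answer covers is an assumption, and the helper names (`xpathLiteral`, `trimmedTextXPath`, `substringXPath`) are hypothetical.

```javascript
// Quote a string as an XPath 1.0 literal. XPath 1.0 has no escape character,
// so text containing both quote types is stitched together with concat().
function xpathLiteral(s) {
  if (!s.includes('"')) return `"${s}"`;
  if (!s.includes("'")) return `'${s}'`;
  const parts = s.split('"').map(part => `"${part}"`);
  return `concat(${parts.join(`, '"', `)})`;
}

// Match elements whose text equals `text` after normalize-space() strips
// leading/trailing whitespace and collapses internal runs of whitespace.
function trimmedTextXPath(cls, text) {
  return `//*[@class=${xpathLiteral(cls)}][normalize-space()=${xpathLiteral(text)}]`;
}

// Match elements whose text contains `text` anywhere (more permissive).
function substringXPath(cls, text) {
  return `//*[@class=${xpathLiteral(cls)}][contains(., ${xpathLiteral(text)})]`;
}

const trimmedXp = trimmedTextXPath("class", "This is the innerHTML text I want.");
const substringXp = substringXPath("class", "text I don't want");
const mixedQuotes = xpathLiteral(`say "hi" to O'Neil`);

console.log(trimmedXp);
console.log(substringXp);
console.log(mixedQuotes);
```

Either resulting string could be used in place of xp in approaches 3 and 4. Note that normalize-space() also collapses whitespace inside the text, so the target string should be written with single spaces.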