52

Using Puppeteer, I would like to get all the elements on a page with a particular class name and then loop through and click each one.

Using jQuery, I can achieve this with:

var elements = $("a.showGoals").toArray();

for (let i = 0; i < elements.length; i++) {
  $(elements[i]).click();
}

How would I achieve this using Puppeteer?

Update

Tried out Chridam's answer below, but I couldn't get it to work (though the answer was helpful, so thanks due there), so I tried the following and this works:

await page.evaluate(() => {
  let elements = $('a.showGoals').toArray();
  for (let i = 0; i < elements.length; i++) {
    $(elements[i]).click();
  }
});
Grant Miller
Richlewis
  • Actually with jQuery you can call `$("a.showGoals").toArray()` – Ele Feb 07 '18 at 21:55
  • Thanks for the jQuery tip :-) I've updated it on my question...Any ideas with puppeteer ? thanks – Richlewis Feb 07 '18 at 21:57
  • Possible duplicate of this question below. Check out the reply on it. https://stackoverflow.com/questions/51782734/puppeteer-find-array-elements-in-page-and-then-click – Thinkerer Apr 21 '19 at 03:52
  • Possible duplicate of [Puppeteer find array elements in page and then click](https://stackoverflow.com/questions/51782734/puppeteer-find-array-elements-in-page-and-then-click) – recnac Apr 25 '19 at 00:45

4 Answers

54

Iterating puppeteer async methods in for loop vs. Array.map()/Array.forEach()

As all puppeteer methods are asynchronous, it does matter how we iterate over them. I've compared and rated the most commonly recommended and used options.

For this purpose, I have created a React.js example page with a lot of React buttons here (I just call it Lot Of React Buttons). On this page (1) we can set how many buttons are rendered; (2) we can turn the black buttons green by clicking on them. I consider it an identical use case to the OP's, and it is also a general case of browser automation (we expect something to happen if we do something on the page). Let's say our use case is:

Scenario outline: click all the buttons with the same selector
  Given I have <no.> black buttons on the page
  When I click on all of them
  Then I should have <no.> green buttons on the page

There is a conservative and a rather extreme scenario. Clicking no. = 132 buttons is not a huge CPU task; no. = 1320 can take a bit of time.


I. Array.map

In general, if we only want to perform async methods like elementHandle.click in an iteration and don't need a new array, using Array.map is bad practice. The map call finishes before the iteratees have completed: Array iteration methods invoke the iteratees synchronously, while the puppeteer methods inside them are asynchronous, so map only collects their pending promises.

Code example

const elHandleArray = await page.$$('button')

elHandleArray.map(async el => {
  await el.click()
})

await page.screenshot({ path: 'clicks_map.png' })
await browser.close()

Specialties

  • returns another array
  • parallel execution inside the .map method
  • fast

132 buttons scenario result: ❌

Duration: 891 ms

Watching the browser in headful mode, it looks like it works, but if we check when page.screenshot happened, we can see the clicks were still in progress. This is because the call to Array.map is not awaited: it returns an array of pending promises right away. It is only luck if the script has enough time to resolve all clicks on all elements before the browser is closed.

1320 buttons scenario result: ❌

Duration: 6868 ms

If we increase the number of elements with the same selector, we run into the following error: UnhandledPromiseRejectionWarning: Error: Node is either not visible or not an HTMLElement. The script has already reached await page.screenshot() and await browser.close(), so the async clicks are still in progress while the browser is already closed.
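
One way to make the map variant wait is to await the array of promises it returns, for example with Promise.all. The following is a minimal sketch and was not part of the benchmark; makeFakeHandle is a hypothetical stand-in for an ElementHandle so that the snippet runs without a browser (with Puppeteer the array would come from page.$$('button')). On a real page, concurrent elementHandle.click calls may still interfere with each other, so treat this as an illustration of awaiting, not as a recommendation.

```javascript
// Hypothetical stand-in for an ElementHandle: click() resolves after a short delay.
// With Puppeteer, the handles would come from `await page.$$('button')`.
function makeFakeHandle(log, id) {
  return {
    click: () =>
      new Promise(resolve => {
        setTimeout(() => {
          log.push(id);
          resolve();
        }, 10);
      }),
  };
}

// Starts every click, then waits until all of the returned promises are fulfilled.
async function clickAllConcurrently(handles) {
  await Promise.all(handles.map(el => el.click()));
}

async function demo() {
  const log = [];
  const handles = [1, 2, 3].map(id => makeFakeHandle(log, id));
  await clickAllConcurrently(handles);
  // Every click has finished by the time we get here.
  return log.length;
}
```

If any click rejects, Promise.all rejects as well, so the error can be handled with a regular try/catch around the awaited call.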


II. Array.forEach

All the iteratees will be executed, but forEach returns before they finish, which is not the desired behavior in many cases with async functions. In terms of puppeteer it is very similar to Array.map, except that Array.forEach does not return a new array.

Code example

const elHandleArray = await page.$$('button')

elHandleArray.forEach(async el => {
  await el.click()
})

await page.screenshot({ path: 'clicks_foreach.png' })
await browser.close()

Specialties

  • parallel execution inside the .forEach method
  • fast

132 buttons scenario result: ❌

Duration: 1058 ms

By watching the browser in headful mode it looks like it works, but if we check when the page.screenshot happened: we can see the clicks were still in progress.

1320 buttons scenario result: ❌

Duration: 5111 ms

If we increase the number of elements with the same selector, we run into the following error: UnhandledPromiseRejectionWarning: Error: Node is either not visible or not an HTMLElement. The script has already reached await page.screenshot() and await browser.close(), so the async clicks are still in progress while the browser is already closed.
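
The early return of forEach can be reproduced without a browser. This is a small sketch under assumptions: makeSlowHandle is a hypothetical stand-in for an ElementHandle whose click() takes about 10 ms, and the 50 ms pause exists only to let the pending clicks finish for the demonstration.

```javascript
// Hypothetical stand-in for an ElementHandle whose click() needs ~10 ms to finish.
function makeSlowHandle(state) {
  return {
    click: () =>
      new Promise(resolve => {
        setTimeout(() => {
          state.done += 1;
          resolve();
        }, 10);
      }),
  };
}

async function forEachReturnsEarly() {
  const state = { done: 0 };
  const handles = [makeSlowHandle(state), makeSlowHandle(state), makeSlowHandle(state)];

  // forEach ignores the promises returned by the async callbacks.
  handles.forEach(async el => {
    await el.click();
  });

  // forEach has already returned here, but none of the clicks has completed.
  const doneRightAfterForEach = state.done;

  // Demonstration only: give the pending clicks time to finish.
  await new Promise(resolve => setTimeout(resolve, 50));

  return { doneRightAfterForEach, doneLater: state.done };
}
```

In the Puppeteer script, page.screenshot and browser.close sit at the point where doneRightAfterForEach is read, which is why they run while the clicks are still pending.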


III. page.$$eval + forEach

The best-performing solution is a slightly modified version of bsides' answer. page.$$eval (page.$$eval(selector, pageFunction[, ...args])) runs Array.from(document.querySelectorAll(selector)) within the page and passes the result as the first argument to pageFunction. It functions as a wrapper around forEach, hence it can be awaited perfectly.

Code example

await page.$$eval('button', elHandles => elHandles.forEach(el => el.click()))

await page.screenshot({ path: 'clicks_eval_foreach.png' })
await browser.close()

Specialties

  • no side-effects of using async puppeteer method inside a .forEach method
  • parallel execution inside the .forEach method
  • extremely fast

132 buttons scenario result: ✅

Duration: 711 ms

By watching the browser in headful mode we see the effect is immediate, also the screenshot is taken only after every element has been clicked, every promise has been resolved.

1320 buttons scenario result: ✅

Duration: 3445 ms

Works just like in case of 132 buttons, extremely fast.
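
Since page.$$eval resolves to whatever pageFunction returns, the same call can also report how many elements were clicked. A minimal sketch, assuming a hypothetical helper name clickAllInPage; it only relies on page.$$eval, and the callback receives DOM elements inside the page (not ElementHandles), so el.click() here is the browser's own HTMLElement click.

```javascript
// Hypothetical helper: clicks every element matching `selector` inside the page
// and resolves to the number of elements that were clicked.
// `page` is expected to be a Puppeteer Page; only page.$$eval is used.
async function clickAllInPage(page, selector) {
  return page.$$eval(selector, elements => {
    // This callback runs in the browser: `elements` are DOM elements.
    elements.forEach(el => el.click());
    return elements.length;
  });
}
```

Usage would look like `const count = await clickAllInPage(page, 'button')`, which makes it easy to compare the count against the expected number of buttons before taking the screenshot.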


IV. for...of loop

The simplest option; not that fast, and executed in sequence. The script won't go on to page.screenshot until the loop has finished.

Code example

const elHandleArray = await page.$$('button')

for (const el of elHandleArray) {
  await el.click()
}

await page.screenshot({ path: 'clicks_for_of.png' })
await browser.close()

Specialties

  • async behavior works as expected at first sight
  • execution in sequence inside the loop
  • slow

132 buttons scenario result: ✅

Duration: 2957 ms

By watching the browser in headful mode we can see the page clicks are happening in strict order, also the screenshot is taken only after every element has been clicked.

1320 buttons scenario result: ✅

Duration: 25396 ms

Works just like in case of 132 buttons (but it takes more time).
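
Because each click is awaited individually in a for...of loop, a failing click (for example the "Node is either not visible or not an HTMLElement" error) can be caught per element. The sketch below is an assumption layered on top of the loop above, not something that was benchmarked; clickInSequence is a hypothetical helper name and it only expects objects with an async click() method, such as the ElementHandles returned by page.$$.

```javascript
// Hypothetical helper: clicks the handles one after another.
// A failing click is recorded instead of aborting the whole loop.
async function clickInSequence(handles) {
  const failures = [];
  for (const [index, el] of handles.entries()) {
    try {
      await el.click();
    } catch (error) {
      failures.push({ index, message: error.message });
    }
  }
  return failures;
}
```

An empty result means every click resolved; otherwise the returned entries show which positions in the array failed and why.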


Summary

  • Avoid using Array.map if you only want to perform async events and you aren't using the returned array; use forEach or for...of instead. ❌
  • Array.forEach is an option, but you need to wrap it so the next async method only starts after all promises are resolved inside the forEach. ❌
  • Combine Array.forEach with $$eval for best performance if the order of async events doesn't matter inside the iteration. ✅
  • Use a for/for...of loop if speed is not vital and if the order of the async events does matter inside the iteration. ✅

Sources / Recommended materials

andromeda
theDavidBarton
  • I am quite puzzled. I am trying to automate an SAP app. Using page.click("selector") works in the sense that the right action is executed for the first element of the list. If I use the approach here, the expected action (items getting selected) is not executed. When I run document.getElementById(".xy").click() in the page's console (in plain Chrome), I get the same effect: the element is not selected, while actually clicking on it works. So I do not know why, but page.click() simulates a click "better". – Daniel Feb 08 '23 at 10:00
  • I suggest creating a dedicated question about this problem @Daniel, as it really depends on the page you try to automate. maybe you need to scroll the element into view to be able to click it, which `page.click` does automatically while plain JavaScript requires some additional steps to create optimal conditions to perform `node.click`. an informative discussion on the different click methods in puppeteer: https://github.com/puppeteer/puppeteer/issues/1805 – theDavidBarton Feb 08 '23 at 20:46
27

Use page.evaluate to execute JS:

const puppeteer = require('puppeteer');

puppeteer.launch().then(async browser => {
    const page = await browser.newPage();
    await page.evaluate(() => {
        let elements = document.getElementsByClassName('showGoals');
        for (let element of elements)
            element.click();
    });
    // browser.close();
});
Sir Sudo
chridam
17

To get all elements, you should use the page.$$ method, which is the Puppeteer counterpart of [...document.querySelectorAll(selector)] (spread inside an array) from the regular browser API.

Then you could loop through it (map, for, whatever you like) and evaluate each link:

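const getThemAll = await page.$$('a.showGoals')
getThemAll.forEach(async link => {
  // pass the handle as an argument: the page function cannot see variables from the Node.js scope
  await page.evaluate(el => el.click(), link)
})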

Since you also want to do actions with the things you got, I'd recommend using page.$$eval which will do the same as above and run an evaluation function afterwards with each of the elements in the array in one line. For example:

await page.$$eval('a.showGoals', links => links.forEach(link => link.click()))

To explain the line above better: $$eval collects the links matching the selector and calls the callback function with them as its argument; the callback then goes through every link via the forEach method and calls click on each one.

Check the official documentation too, they have good examples there.

Rafael 'BSIDES' Pereira
  • The first solution isn't correct and won't work because the outer arrow function should be an _async_ function so the _await_ is allowed inside. Also the `link.click()` should be awaited too. The second (clickThemAll) solution also lacks the `await` before the `link.click()`. I would say fix these in the answer, but the main problem is that it suggests a bad practice: **such cases shouldn't be done by an `array.map`!** (I) A _map_ returns another array with the same length, in case of clicks we don't need it. (II) Inside the map we can't perform async events in order... – theDavidBarton Jun 21 '20 at 12:33
  • ... That can cause many issues if the sequence is important and you want `await` to work as expected. Either use a `for...of` or a regular `for` loop (even a `forEach` can cause the same issues). – theDavidBarton Jun 21 '20 at 12:37
  • Fixed using [].forEach instead of [].map. I agree map is unnecessary as we don't need to return anything. Also, why are we worried about the order of events? It's not in OP's question. – Rafael 'BSIDES' Pereira Jun 22 '20 at 00:38
  • Sorry, I made the wrong wording about the order of events. I meant: the order of the `forEach` and the next async events order does matter. And it matters in all cases when we deal with puppeteer. The polyfill of forEach looks like this: `arr.forEach(callback(currentValue [, index [, array]])[, thisArg])` : the callback is called but it does not wait to be done before going to the next element of the array. If you run the code above you will see it throws: `UnhandledPromiseRejectionWarning: Error: Node is either not visible or not an HTMLElement`. While it does work in a for or for...of loop. – theDavidBarton Jun 22 '20 at 09:36
  • I will explain it in detail and compare the various options in a separate answer soon. – theDavidBarton Jun 22 '20 at 09:37
  • I see! Good to know, thank you! Also, I think your answer would be super helpful not only here but as a puppeteer example, as I constantly find different examples like my post everywhere. – Rafael 'BSIDES' Pereira Jun 23 '20 at 13:39
  • hi @bsides, I've just posted the mentioned comparison. Not that strong in technical explanations, but I think I made a thorough work with testing and benchmarking them out. I've referenced some reading material at the bottom of the answer from better authors than me . A slightly modified version of your `page.$$eval` + `Array.forEach` solution turned out to be the best performing one, so nice job! ;) I just use it without declaring it as a `const` (as it is not a reusable returned value). One syntax thing: you missed the closing parenthesis of the $$eval. – theDavidBarton Jun 27 '20 at 16:01
  • Oh, your post is a big one! I'm going to read it this week. Already took care of your comments too, thank you! – Rafael 'BSIDES' Pereira Jun 29 '20 at 01:27
  • `await page.$$eval('a.cls-context-menu-link', links => links.forEach(link => link.click()))` - the only one working snippet ... – Yordan Georgiev Jul 12 '20 at 08:23
5

page.$$() / elementHandle.click()

You can use page.$$() to create an ElementHandle array based on the given selector, and then you can use elementHandle.click() to click each element:

const elements = await page.$$('a.showGoals');

for (const element of elements) {
  await element.click();
}

Note: Remember to await the click in an async function. Otherwise, you will receive the following error:

SyntaxError: await is only valid in async function

Grant Miller