Questions tagged [puppeteer-cluster]

puppeteer-cluster manages a pool of headless browsers via puppeteer. This is useful to crawl multiple pages in parallel or to keep a pool of open browsers.

puppeteer-cluster creates a pool of puppeteer workers by spawning multiple browsers, contexts or pages via puppeteer. The library keeps track of queued jobs and handles thrown errors. In addition, it can retry jobs and introduce delays when crawling a domain.
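
To make the queueing and retry behaviour described above more concrete, here is a minimal sketch in plain JavaScript of a worker pool with a concurrency limit and a retry limit. It is a toy model under stated assumptions, not puppeteer-cluster's implementation or API: the function `runPool` and the option names `maxConcurrency` and `retryLimit` are chosen here for illustration only, and no browser is launched.

```javascript
// Toy model of a job pool. This is NOT puppeteer-cluster's implementation;
// runPool, maxConcurrency and retryLimit are illustrative names.
async function runPool(jobs, task, { maxConcurrency = 2, retryLimit = 1 } = {}) {
  // Wrap each job so the number of failed attempts can be tracked.
  const queue = jobs.map((data) => ({ data, tries: 0 }));
  const results = [];
  const errors = [];

  // Each worker pulls jobs from the shared queue until it is empty.
  async function worker() {
    while (queue.length > 0) {
      const job = queue.shift();
      try {
        results.push(await task(job.data));
      } catch (err) {
        job.tries += 1;
        if (job.tries <= retryLimit) {
          queue.push(job); // re-queue the failed job for another attempt
        } else {
          errors.push({ data: job.data, message: err.message }); // give up
        }
      }
    }
  }

  // Start at most maxConcurrency workers and wait for all of them to finish.
  const workerCount = Math.min(maxConcurrency, jobs.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return { results, errors };
}
```

In this sketch a thrown error never stops the pool: the failed job is re-queued until it has used up its retries and is then recorded in `errors`. Results are collected in completion order, which need not match the order of the input.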

73 questions

votes

0 answers

How to improve puppeteer performance using launch args (using chromium in headless mode)?

Hi, I am using puppeteer to crawl webpages (~1 million records). To manage long crawls I am using the puppeteer-cluster node module. What are the flags that are already enabled when launching chromium using puppeteer? list of args What are some…

asked Jun 23 '21 at 07:25

Rajat

votes

0 answers

Open Chrome without it taking focus (Puppeteer)

I'm using Puppeteer to launch multiple browsers. Every few minutes, it'll reopen the browsers. This works fine, except it's constantly opening browsers and focusing the tabs, bothering me while I'm trying to use the computer. Due to what I'm trying…

google-chrome puppeteer puppeteer-cluster

asked Sep 02 '21 at 03:55

Lawlzer

votes

2 answers

How do I combine puppeteer plugins with puppeteer clusters?

I have a list of urls that need to be scraped from a website that uses React, so I am using Puppeteer. I do not want to be blocked by anti-bot servers, so I have added puppeteer-extra-plugin-stealth. I want to prevent ads…

javascript node.js puppeteer puppeteer-cluster

asked Dec 24 '20 at 18:45

Daggie Blanqx - Douglas Mwangi

votes

0 answers

puppeteer: Protocol error (Runtime.callFunctionOn): Target closed

I came across a website that puppeteer can't handle. When taking a screenshot, Protocol error (Runtime.callFunctionOn): Target closed or Protocol error (Emulation.setDeviceMetricsOverride): Target closed is triggered. Before taking a screenshot, I…

puppeteer puppeteer-cluster

asked Jul 10 '22 at 13:24

sanjihan

votes

0 answers

Puppeteer does not use cache when connected to proxy

I have a task that opens a browser and then visits the same page over and over again. After testing network usage I've noticed that without proxies it caches files just fine, but as soon as I connect to a proxy it stops caching. I am using…

caching puppeteer puppeteer-cluster

asked Nov 24 '21 at 16:59

Farruh Sydykov

votes

2 answers

Is Puppeteer-Cluster Stealthy enough to pass bot tests?

I wanted to know if anyone using Puppeteer-Cluster could elaborate on how Cluster.launch({settings}) protects against sharing of cookies and web data between pages in different contexts. Do the browser contexts here actually block cookies and…

node.js automated-tests puppeteer end-to-end puppeteer-cluster

asked Jan 09 '20 at 21:08

Peyter

votes

3 answers

Puppeteer: how to wait only first response (HTML)

I'm using puppeteer-cluster to crawl web pages. If I open many pages at a time for a single website (8-10 pages), the connection slows down and many timeout errors come up, like this: TimeoutError: Navigation Timeout Exceeded: 30000ms exceeded I…

node.js puppeteer puppeteer-cluster

asked Sep 13 '19 at 08:12

user3817605

votes

1 answer

puppeteer-cluster: queue instead of execute

I'm experimenting with Puppeteer Cluster and I just don't understand how to use queuing properly. Can it only be used for calls where you don't wait for a response? I'm using Artillery to fire a bunch of requests simultaneously, but they all fail…

javascript node.js puppeteer puppeteer-cluster

asked Aug 05 '19 at 14:51

G_V

votes

2 answers

Unable to Run Multiple Node Child Processes without Choking on DigitalOcean

I've been struggling to run multiple instances of Puppeteer on DigitalOcean for quite some time with little luck. I'm able to run ~5 concurrently using tools like puppeteer-cluster, but for some reason the whole thing just chokes with little helpful…

node.js parallel-processing digital-ocean puppeteer puppeteer-cluster

asked Jul 30 '19 at 21:44

Alex MacArthur

votes

0 answers

how to unite cheerio with puppeteer so he can click on elements

I tried using cheerio to find the element; if the element is found, it has to be clicked, but I don't know how to combine this with puppeteer. The button I want to click is in the 3rd picture. await page.waitForTimeout(10000) const contentHTML…

javascript puppeteer cheerio puppeteer-cluster

asked Feb 19 '23 at 09:56

Alma Muhamad Apriana

votes

1 answer

How To Passing Multiple Data in Puppeteer-Cluster

Just one question. How can I do this? I have this data: the url http://example.com and 2 strings, for example firstName and lastName. The url stays the same in every browser, but firstName and lastName change in every browser…

node.js puppeteer puppeteer-cluster

asked Mar 04 '22 at 07:24

Getol99

votes

1 answer

I can't use a rotating IP proxy in my puppeteer cluster script

I am trying to run this code with multiple IP addresses, but I think I put the proxy code in the wrong place. Can someone help? The proxy dashboard shows that the code uses the proxy, but when the browser opens, the IP address doesn't change and is still…

javascript node.js bots puppeteer puppeteer-cluster

asked Dec 28 '21 at 14:42

Larbi Ait Soussi

votes

1 answer

Navigation failed because browser has disconnected

I ran into the following problem. Here's the error message: Error: Navigation failed because browser has disconnected! at /Users/me/myproject/node_modules/puppeteer/lib/cjs/puppeteer/common/LifecycleWatcher.js:51:147 at…

puppeteer puppeteer-cluster

asked Aug 29 '20 at 08:17

David McNamee

votes

1 answer

How do I reset a for loop inside an async function?

So I found a website that has very cool images and I'd like to scrape some of its data. The website hasn't been updated for about 5 years, and I tried to contact its owner for some kind of API but didn't get any response back. Anyway, the website…

javascript node.js puppeteer puppeteer-cluster

asked Nov 20 '19 at 15:30

doingmybest

votes

3 answers

How to handle multiple tabs in puppeteer-cluster[CONCURRENCY_BROWSER]?

I'm attempting to scrape 3 urls with the conditions below: Each url needs to run in a separate browser. The url may contain 2 or more links to click. Open the links in a new tab of the respective browser (in parallel), switch to it and scrape the…

javascript node.js puppeteer puppeteer-cluster

asked Jul 26 '19 at 14:54

Ajai Ganesh
