Puppeteer: How do I download a file using chrome headless browser api?

Question

Using Puppeteer, how do I get the headless Chrome browser to download a file (or make additional HTTP requests and save the response)?

An API in Puppeteer is on its way (https://github.com/GoogleChrome/puppeteer/issues/299), however Headless Chrome needs to support downloads first. There's an open bug: https://bugs.chromium.org/p/chromium/issues/detail?id=696481. — ebidel, Aug 22 '17 at 00:31
Chrome headless support is almost there: https://chromium-review.googlesource.com/c/590913 — Andrey Lushnikov, Aug 22 '17 at 06:49

score 0 · Answer 1 · answered Apr 26 '21 at 13:51

Here's a tool I wrote to download all the resources that a page uses upon load.

This node command-line utility uses a headless browser (Puppeteer) to render a webpage and download all resources it may need. These resources including the original HTML are all saved locally...

https://github.com/stav/wgrep

score -1 · Answer 2 · edited Nov 02 '17 at 09:32


You could make a simple request through the window; it should work. See the request package on npm.

As soon as the promise resolves with your response, you could write a save function to store the response.

It seems that Puppeteer has this implementation. See here: How to make a request with puppeteer.

Have a look over this:

Emitted when a page issues a request. The request object is read-only. In order to intercept and mutate requests, see page.setRequestInterceptionEnabled.
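The quoted passage describes the request event, which is read-only. To actually save what comes back, one option is to listen for the matching response event instead. The following is only a minimal sketch under that assumption: `saveResponses` and the numbered file naming are hypothetical, and `page` is assumed to be a Puppeteer Page (anything exposing `on('response', handler)` with `url()` and `buffer()` on the response behaves the same way).

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of Puppeteer): writes the body of every
// response the page receives into `dir` and returns the list of saved paths.
function saveResponses(page, dir) {
  fs.mkdirSync(dir, { recursive: true });
  const saved = [];
  let counter = 0;

  page.on('response', async (response) => {
    try {
      // buffer() rejects when no body is available (for example on redirects).
      const body = await response.buffer();
      // Derive a file name from the URL path; fall back for paths ending in "/".
      const name = path.basename(new URL(response.url()).pathname) || 'index.html';
      const file = path.join(dir, `${counter++}-${name}`);
      fs.writeFileSync(file, body);
      saved.push(file);
    } catch (err) {
      // Skip responses whose body cannot be read.
    }
  });

  return saved;
}

// Usage sketch (assumes `npm install puppeteer`; the URL is a placeholder):
// const puppeteer = require('puppeteer');
// (async () => {
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();
//   saveResponses(page, './downloads');
//   await page.goto('https://example.com');
//   await browser.close();
// })();
```

This only captures requests the page makes on its own; it does not trigger new ones.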

I hope this helps.

Link for setting headers

edited Nov 02 '17 at 09:32 by titusfx

answered Aug 17 '17 at 04:27 by sdet.ro


I would love to be able to just use node to automate downloading files. The problem with just making a regular http request in node is that it will not send the proper session cookies that are being managed by the headless browser. Using a full headless browser makes the task easier due to the extra functionality that it provides. – aherriot Aug 17 '17 at 12:36
What you need is headers. I just edited my response to show an example of how to add headers. This way, regardless of your browser, the sessions are stored, and the client will get a notification that a browser (defined in the header) is trying to perform an action, in your case downloading the files. – sdet.ro Aug 17 '17 at 12:47
That mechanism will let you look at requests being made (and alter them). I want to trigger additional requests that would not otherwise be made and save the response to disk. – aherriot Aug 17 '17 at 12:58
Just add another function as a parameter, or validation. :) – sdet.ro Aug 18 '17 at 05:38
Sorry, I don't understand. Can you please give a specific example? How do I make an HTTP request inside the context of the headless Chrome browser environment or its wrapper, Puppeteer? – aherriot Aug 18 '17 at 19:18
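For the specific example asked for in these comments, one possible approach is to run `fetch` inside the page with `page.evaluate`, so the browser attaches its own session cookies, and then pass the bytes back to Node as base64. This is a sketch under assumptions rather than a confirmed answer from the thread: `downloadInPage` is a hypothetical helper, `page` is assumed to be a Puppeteer Page already navigated to the site, and the request is still subject to the page's same-origin and CORS rules.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of Puppeteer): fetches `url` from inside the
// page context and writes the response body to `destination` on disk.
async function downloadInPage(page, url, destination) {
  // The callback runs in the browser, not in Node, so it can only use
  // browser APIs and the arguments passed in after it.
  const base64 = await page.evaluate(async (target) => {
    const response = await fetch(target, { credentials: 'include' });
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }
    const bytes = new Uint8Array(await response.arrayBuffer());
    // Binary data cannot cross the evaluate boundary directly,
    // so encode it as a base64 string first.
    let binary = '';
    for (const byte of bytes) {
      binary += String.fromCharCode(byte);
    }
    return btoa(binary);
  }, url);

  fs.writeFileSync(destination, Buffer.from(base64, 'base64'));
  return destination;
}

// Usage sketch (assumes `npm install puppeteer`; URLs are placeholders):
// const puppeteer = require('puppeteer');
// (async () => {
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();
//   await page.goto('https://example.com/login'); // establish the session
//   await downloadInPage(page, 'https://example.com/report.pdf', 'report.pdf');
//   await browser.close();
// })();
```

Encoding through a string keeps the whole file in memory, so this suits small to medium files better than very large ones.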
