To control how many requests are in flight at once, you can use any of these three options:
mapConcurrent() and pMap(): These let you iterate an array, sending requests to a host, but manage things so that you only ever have N requests in flight at the same time, where you decide the value of N.
rateLimitMap(): Lets you manage how many requests per second are sent.
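To illustrate the "N requests in flight" idea, here is a minimal sketch of a concurrency-limited map. It is a hypothetical implementation written for this answer, not the code behind the helpers named above; the parameter names and the worker-pool approach are assumptions.

```javascript
// Hypothetical helper, not the implementation referenced above: runs fn over
// items with at most maxConcurrent calls in flight, and resolves to the
// results in the same order as the input array.
async function mapConcurrent(items, maxConcurrent, fn) {
  const results = new Array(items.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker() {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await fn(items[i], i);
    }
  }

  // Start N workers (at least one, never more than there are items).
  const workerCount = Math.max(1, Math.min(maxConcurrent, items.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

You would call it with something like `await mapConcurrent(urls, 10, url => fetch(url))`. In this sketch, the first rejection rejects the overall promise, but the other workers keep running until they finish their current item, so add your own error handling inside fn if you want every item attempted.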
Can this be solved using a custom https agent with node-fetch and setting the maxSockets to something like 10?
I'm not aware of any solution using a custom https agent.
How do I check if the file exists on the server and, if it does, download it to my machine with the same file name and extension?
You can't directly access a remote http server's file system. So, all you can do is make an http request for a specific resource (a url) and examine the http response to see if it returned data or returned some sort of http error such as a 404.
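As a rough sketch of that check, the function below requests the URL and interprets the status code. The helper name, the use of a HEAD request, and the injectable `fetchImpl` parameter are all assumptions made for illustration; it defaults to the global `fetch` available in Node 18+, and some servers do not answer HEAD requests properly, in which case you would use a normal GET instead.

```javascript
// Hypothetical helper: resolves true if the URL answers with a 2xx status,
// false if it answers 404, and throws on any other status.
// fetchImpl defaults to the global fetch (Node 18+, an assumption here);
// node-fetch's fetch function can be passed in instead.
async function resourceExists(url, fetchImpl = fetch) {
  // HEAD asks for the headers only, so no body is downloaded for the check.
  const response = await fetchImpl(url, { method: 'HEAD' });
  if (response.ok) return true;
  if (response.status === 404) return false;
  throw new Error(`unexpected response ${response.status} for ${url}`);
}
```

Treating only 404 as "does not exist" is a design choice: other errors (403, 500, and so on) tell you nothing definite about the resource, so the sketch surfaces them rather than guessing.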
As for filenames and extensions, that depends entirely upon whether you already know what to request and the server supports that being part of the URL, or whether the server returns that information to you in an http header. If you're requesting a specific filename and extension, then you can just create a file with that name and extension and save the http response data to that file on your local drive.
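A sketch of how you might choose the local file name under those two cases is below. The helper name is hypothetical, and it assumes the header in question is Content-Disposition with a plain `filename=` parameter (the extended `filename*=` form is ignored here); when no such header is present it falls back to the last path segment of the URL.

```javascript
// Hypothetical helper: picks a local file name for a download.
// Prefers a plain filename=... parameter from the Content-Disposition header
// value, otherwise falls back to the last path segment of the URL.
function pickFilename(url, contentDisposition) {
  const match = /filename=\s*"?([^";]+)"?/i.exec(contentDisposition || '');
  const candidate = match
    ? match[1].trim()
    : decodeURIComponent(new URL(url).pathname.split('/').pop());
  // Drop any directory part so the name cannot point outside the target folder.
  const base = candidate.split(/[\\/]/).pop();
  // 'download' is an arbitrary fallback when no usable name is found.
  return base || 'download';
}
```

With a fetch-style response you could call it as `pickFilename(url, response.headers.get('content-disposition'))` and pass the result to `createWriteStream()`.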
As for coding examples, the doc for node-fetch shows an example of downloading data to a file using streams here: https://www.npmjs.com/package/node-fetch#streams.
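// Top-level await below requires an ES module (e.g. a .mjs file)
import {createWriteStream} from 'fs';
import {pipeline} from 'stream';
import {promisify} from 'util';
import fetch from 'node-fetch';
// Promise-returning version of stream.pipeline
const streamPipeline = promisify(pipeline);
const url = 'https://github.githubassets.com/images/modules/logos_page/Octocat.png';
const response = await fetch(url);
// A non-2xx status (such as a 404) means there is nothing to save
if (!response.ok) throw new Error(`unexpected response ${response.statusText}`);
// Stream the response body straight to a local file
await streamPipeline(response.body, createWriteStream('./octocat.png'));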
Personally, I wouldn't use node-fetch, as its design center is to mimic the browser implementation of fetch, which is not as friendly an API design as similar libraries built explicitly for nodejs. I use got(), and there are several other good libraries available. You can pick your favorite.
Here's a code example using the got() library:
import {promisify} from 'node:util';
import stream from 'node:stream';
import fs from 'node:fs';
import got from 'got';
const pipeline = promisify(stream.pipeline);
await pipeline(
got.stream('https://sindresorhus.com'),
fs.createWriteStream('index.html')
);