72

I have the URL of a possibly large (100+ MB) file. How do I save it to a local directory using fetch?

I looked around but there don't seem to be a lot of resources/tutorials on how to do this.

Thank you!

VLAZ
Gloomy

8 Answers

106

Updated solution for Node 18:

const fs = require("fs");
const {mkdir,writeFile} = require("fs/promises");
const { Readable } = require('stream');
const { finished } = require('stream/promises');
const path = require("path");
const downloadFile = async (url, fileName) => {
  const res = await fetch(url);
  if (!fs.existsSync("downloads")) await mkdir("downloads"); // optional if the downloads directory already exists
  const destination = path.resolve("./downloads", fileName);
  const fileStream = fs.createWriteStream(destination, { flags: 'wx' }); // 'wx' fails if the file already exists
  await finished(Readable.fromWeb(res.body).pipe(fileStream));
};

downloadFile("<url_to_fetch>", "<filename>")

Old answer (works up to Node 16):

Using the Fetch API, you can write a function that downloads from a URL like this:

You will need `node-fetch@2`; install it with `npm i node-fetch@2`.

const fetch = require("node-fetch");
const fs = require("fs");
const downloadFile = async (url, path) => {
  const res = await fetch(url);
  const fileStream = fs.createWriteStream(path);
  await new Promise((resolve, reject) => {
    res.body.pipe(fileStream);
    res.body.on("error", reject);
    fileStream.on("finish", resolve);
  });
};
code_wrangler
  • You could even make it a little shorter by writing `res.body.on('error', reject);` and `fileStream.on('finish', resolve);`. – Ricki-BumbleDev Jun 14 '20 at 10:21
  • This gives an error: `res.body.pipe is not a function` (NodeJS v18). – tinkerr Apr 24 '22 at 06:53
  • The function which calls downloadFile does not wait for it to resolve the promise. I'm calling this function like this-> await downloadFile(URL, path). Would you mind correcting me? – Swapnil Jun 21 '22 at 03:19
  • @tinkerr try importing and using 'node-fetch' instead of the normal fetch – Alex Totolici Jun 24 '22 at 06:59
  • just style preferences but especially for short example code I much prefer the explicit `async function downloadFile` style over `const somevar = ` – Purefan Aug 30 '22 at 12:58
  • Shouldn't an `async` function always return a `Promise`, rather than `await`ing one? – serverpunk Nov 28 '22 at 21:59
  • @serverpunk `downloadFile` will still return an empty promise because of the `async` keyword, but it won't return until awaiting the inner anonymous promise – sloreti Dec 03 '22 at 19:46
  • @sloreti This might be a minor detail but it seems like `downloadFile` should return the `new Promise` and the outer code should call `await downloadFile()`, if I'm understanding the expected behavior of `async` functions correctly. – serverpunk Dec 04 '22 at 00:01
  • @serverpunk That entirely depends on what one wants the behavior of `downloadFile` to be! What I think you're describing would be effectively the same as the current answer – sloreti Dec 05 '22 at 01:27
  • @Purefan https://en.wiktionary.org/wiki/bikeshedding – Ahmed Fasih Dec 06 '22 at 18:50
  • @AhmedFasih I thought that was implied by me saying "just style preferences" but Im happy that you were able to still identify what I meant, good for you! :) – Purefan Dec 19 '22 at 17:14
  • This does not work on node v18. I think https://stackoverflow.com/a/74722818 is a better solution in 2023. – Bob Jan 25 '23 at 01:55
  • Node 16 answer works on Node 18, Node 18 answer leads to 0 byte file. – Greggory Wiley Jun 27 '23 at 19:45
28

Older answers here involve node-fetch, but since Node.js v18.x this can be done with no extra dependencies.

The body of a fetch response is a web stream. It can be converted to a Node Readable stream using Readable.fromWeb, which can then be piped into a write stream created by fs.createWriteStream. If desired, the resulting stream can be awaited as a Promise using the promise version of stream.finished.

const fs = require('fs');
const { Readable } = require('stream');
const { finished } = require('stream/promises');

// Note: top-level await requires an ES module (or the Node REPL); in CommonJS, wrap this in an async function.
const stream = fs.createWriteStream('output.txt');
const { body } = await fetch('https://example.com');
await finished(Readable.fromWeb(body).pipe(stream));
schankam
antonok
  • That can also be nicely compacted in one line `const download = async (url, path) => Readable.fromWeb((await fetch(url)).body).pipe(fs.createWriteStream(path))` – Jamby Dec 29 '22 at 08:42
  • Does this download the entire file (`await fetch(...)`) before starting the write stream? – 1252748 Feb 02 '23 at 00:50
  • @1252748 `await fetch(...)` finishes after the response headers are fully received, but before the response body is received. The body will be streamed into the file while it is arriving. The second `await` can be omitted to perform other tasks while the body stream is still in progress. – antonok Feb 02 '23 at 22:09
  • `Argument of type 'ReadableStream' is not assignable to parameter of type 'ReadableStream'. Type 'ReadableStream' is missing the following properties from type 'ReadableStream': values, [Symbol.asyncIterator]ts(2345)` – RonH Mar 08 '23 at 13:41
  • @RonH unfortunately it looks like there are 2 _different_ `ReadableStream` definitions, as per https://stackoverflow.com/questions/63630114/argument-of-type-readablestreamany-is-not-assignable-to-parameter-of-type-r. You should be able to cast `body` to the correct `ReadableStream` from `'stream/web'`; i.e. `import { ReadableStream } from 'stream/web';` and `body as ReadableStream`. – antonok Mar 08 '23 at 22:11
  • Ends in 0 byte file for me. – Greggory Wiley Jun 27 '23 at 19:37
  • Could probably be rewritten with proper `import`, as Node supports it easily. – Hugo Gresse Aug 29 '23 at 14:48
25

If you want to avoid explicitly making a Promise like in the other very fine answer, and are OK with buffering the entire 100+ MB file in memory, then you could do something simpler:

const fetch = require('node-fetch');
const {writeFile} = require('fs');
const {promisify} = require('util');
const writeFilePromise = promisify(writeFile);

function downloadFile(url, outputPath) {
  return fetch(url)
      .then(x => x.arrayBuffer())
      .then(x => writeFilePromise(outputPath, Buffer.from(x)));
}

But the other answer will be more memory-efficient since it's piping the received data stream directly into a file without accumulating all of it in a Buffer.

Ahmed Fasih
  • I have tried this code but got an error: `[Error: EISDIR: illegal operation on a directory, open 'D:\Work\repo\'] { errno: -4068, code: 'EISDIR', syscall: 'open', path: 'D:\\Work\\repo\\' }` – Scott Jones May 23 '22 at 09:08
  • @ScottJones `EISDIR` means "Error: IS Directory": you're giving Node a directory when it expects a file. Just use `d:\work\repo\file.txt` for example – Ahmed Fasih May 23 '22 at 16:45
10
const {createWriteStream} = require('fs');
const {pipeline} = require('stream/promises');
const fetch = require('node-fetch');

const downloadFile = async (url, path) => pipeline(
    (await fetch(url)).body,
    createWriteStream(path)
);
Ihor Sakailiuk
  • I get error `TypeError: Cannot read property 'on' of undefined at destroyer (internal/streams/pipeline.js:23:10)` – Codler Oct 17 '20 at 07:35
3
import { existsSync } from "fs";
import { mkdir, writeFile } from "fs/promises";
import { join } from "path";

export const download = async (url: string, ...folders: string[]) => {
    const fileName = url.split("/").pop();

    const path = join("./downloads", ...folders);

    if (!existsSync(path)) await mkdir(path, { recursive: true }); // create nested subfolders as needed

    const filePath = join(path, fileName);

    const response = await fetch(url);

    const blob = await response.blob();

    // const bos = Buffer.from(await blob.arrayBuffer())
    const bos = blob.stream();

    await writeFile(filePath, bos);

    return { path, fileName, filePath };
};

// call like that ↓
await download("file-url", "subfolder-1", "subfolder-2", ...)
  • Your answer could be improved by adding more information on what the code does and how it helps the OP. – Tyler2P Aug 09 '22 at 08:38
  • this will store the whole 100MB file in memory before writing it, which might work but you probably want to avoid that if possible – Andy Jun 06 '23 at 17:00
1

I was looking for a similar use case: fetching a bunch of API endpoints and saving the JSON responses to static files. So I came up with my own solution; hope it helps.

const fetch = require('node-fetch'),
    fs = require('fs'),
    VERSIOINS_FILE_PATH = './static/data/versions.json',
    endpoints = [
        {
            name: 'example1',
            type: 'exampleType1',
            url: 'https://example.com/api/url/1',
            filePath: './static/data/exampleResult1.json',
            updateFrequency: 7 // days
        },
        {
            name: 'example2',
            type: 'exampleType1',
            url: 'https://example.com/api/url/2',
            filePath: './static/data/exampleResult2.json',
            updateFrequency: 7
        },
        {
            name: 'example3',
            type: 'exampleType2',
            url: 'https://example.com/api/url/3',
            filePath: './static/data/exampleResult3.json',
            updateFrequency: 30
        },
        {
            name: 'example4',
            type: 'exampleType2',
            url: 'https://example.com/api/url/4',
            filePath: './static/data/exampleResult4.json',
            updateFrequency: 30
        },
    ],
    checkOrCreateFolder = () => {
        var dir = './static/data/';
        if (!fs.existsSync(dir)) {
            fs.mkdirSync(dir);
        }
    },
    syncStaticData = () => {
        checkOrCreateFolder();
        let fetchList = [],
            versions = [];
        endpoints.forEach(endpoint => {
            if (requiresUpdate(endpoint)) {
                console.log(`Updating ${endpoint.name} data... : `, endpoint.filePath);
                fetchList.push(endpoint)
            } else {
                console.log(`Using cached ${endpoint.name} data... : `, endpoint.filePath);
                let endpointVersion = JSON.parse(fs.readFileSync(endpoint.filePath, 'utf8')).lastUpdate;
                versions.push({
                    name: endpoint.name + "Data",
                    version: endpointVersion
                });
            }
        })
        if (fetchList.length > 0) {
            Promise.all(fetchList.map(endpoint => fetch(endpoint.url, { "method": "GET" })))
                .then(responses => Promise.all(responses.map(response => response.json())))
                .then(results => {
                    results.forEach((endpointData, index) => {
                        let endpoint = fetchList[index]
                        let processedData = processData(endpoint.type, endpointData.data)
                        let fileData = {
                            data: processedData,
                            lastUpdate: Date.now() // unix timestamp
                        }
                        versions.push({
                            name: endpoint.name + "Data",
                            version: fileData.lastUpdate
                        })
                        fs.writeFileSync(endpoint.filePath, JSON.stringify(fileData));
                        console.log('updated data: ', endpoint.filePath);
                    })
                })
                .catch(err => console.log(err));
        }
        fs.writeFileSync(VERSIOINS_FILE_PATH, JSON.stringify(versions));
        console.log('updated versions: ', VERSIOINS_FILE_PATH);
    },
    recursiveRemoveKey = (object, keyname) => {
        object.forEach((item) => {
            if (item.items) { //items is the nesting key, if it exists, recurse , change as required
                recursiveRemoveKey(item.items, keyname)
            }
            delete item[keyname];
        })
    },
    processData = (type, data) => {
        //any thing you want to do with the data before it is written to the file
        let processedData = type === 'vehicle' ? processType1Data(data) : processType2Data(data);
        return processedData;
    },
    processType1Data = data => {
        let fetchedData = [...data]
        recursiveRemoveKey(fetchedData, 'count')
        return fetchedData
    },
    processType2Data = data => {
        let fetchedData = [...data]
        recursiveRemoveKey(fetchedData, 'keywords')
        return fetchedData
    },
    requiresUpdate = endpoint => {
        if (fs.existsSync(endpoint.filePath)) {
            let fileData = JSON.parse(fs.readFileSync(endpoint.filePath));
            let lastUpdate = fileData.lastUpdate;
            let now = new Date();
            let diff = now - lastUpdate;
            let diffDays = Math.ceil(diff / (1000 * 60 * 60 * 24));
            if (diffDays >= endpoint.updateFrequency) {
                return true;
            } else {
                return false;
            }
        }
        return true
    };

syncStaticData();

link to github gist

Hossein
1

If you don't need to deal with 301/302 responses (when things have been moved), you can actually just do it in one line with the Node.js native libraries http and/or https.

You can run this example one-liner in the Node shell. It just uses the https module to download a GNU-zipped tarball of some source code to the directory where you started the Node shell. (You start a Node shell by typing node at the command line for your OS where Node.js has been installed.)

require('https').get("https://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));

If you don't need/want HTTPS use this instead:

require('http').get("http://codeload.github.com/angstyloop/js-utils/tar.gz/refs/heads/develop", it => it.pipe(require('fs').createWriteStream("develop.tar.gz")));

angstyloop
0

This got the job done for me on Node 18, and presumably 16. Its only dependencies are fs and node-fetch (it probably works with other fetch libraries too).

const fs = require('fs');
const fetch = require("node-fetch");
async function downloadImage(imageUrl){
    // imageUrl: https://example.com/uploads/image.jpg
    const fileName = imageUrl.split('/').pop(); // image.jpg
    const res = await fetch(imageUrl);
    const fileStream = fs.createWriteStream(`./folder/${fileName}`);
    await new Promise((resolve, reject) => {
        res.body.pipe(fileStream);
        res.body.on("error", reject);
        fileStream.on("finish", resolve);
    });
}

The previous top answer by @code_wrangler is split into Node 16 and Node 18 solutions (this is like the Node 16 one), but on Node 18 the Node 18 solution created a 0-byte file for me and cost me some time.

Greggory Wiley