
It's that time again when I'm clueless and come humbly to ask for help!

I am trying to download 4,500 images at once, averaging about 1 MB each. All the image files get created and the download starts, but after about 2 GB has been downloaded (roughly half), some images are complete, some are partial, and some are empty. Task Manager confirms that the download stops suddenly.

[Screenshot: network graph]

What could possibly be the issue? No matter how long I wait, nothing happens. If I at least got an error, I could try something else...

Please advise if possible, thank you!

//get all json files from a folder
const fs = require("fs");
const path = require("path");
const axios = require("axios");

let urlsArray = [];
const collection = "rebels";
const folder = collection + "_json";

const getFiles = (folder) => {
  const directoryPath = path.join(__dirname, folder);
  return fs.readdirSync(directoryPath);
};

const files = getFiles(folder);

//inside the folder there are json files with metadata
//for each json file parse it and get the image url

files.forEach((file) => {
  const filePath = path.join(__dirname, folder, file);
  const fileContent = fs.readFileSync(filePath, "utf8");
  const parsedJson = JSON.parse(fileContent);
  const imageFromMetadata = parsedJson.image;
  const url = imageFromMetadata.replace("ipfs://", "https://ipfs.io/ipfs/");
  let nr = file.replace(".json", "");
  urlsArray.push({ url, nr });
});

//foreach url create a promise to download with axios

const downloadImage = (url, nr) => {
  const writer = fs.createWriteStream(
    process.cwd() + `/${collection}_images2/${nr}.png`
  );

  return axios({
    url,
    method: "GET",
    responseType: "stream",
  }).then((response) => {
    return new Promise((resolve, reject) => {
      response.data.pipe(writer);
      writer.on("finish", resolve);
      writer.on("error", reject);
    });
  });
};

const promiseAll = async () => {
  const promises = urlsArray.map((data) => {
    console.log(`trying to download image nr ${data.nr} from ${data.url}`);
    return downloadImage(data.url, data.nr);
  });

  await Promise.allSettled(promises);
};

promiseAll();

//download all
Phil
ThomasDEV
  • Do you have to use Node / Axios for this? Seems like something you could knock out with a shell script and curl – Phil Dec 22 '22 at 00:33
  • How many files are in the folder you're calling this on? I ask because you're doing all of these in parallel and could be running out of resources or something timing out trying to do this many in parallel. If you're bandwidth limited, it may not be helping you to do more than two in parallel. – jfriend00 Dec 22 '22 at 01:02
  • You also aren't logging any errors that might have occurred in `Promise.allSettled(promises)`. Because you're using `Promise.allSettled()`, no error would automatically be displayed to you. Instead, you have to iterate through the results and see if there are errors in the results. – jfriend00 Dec 22 '22 at 01:06

1 Answer

Since Promise.allSettled() never rejects, nothing in your code will report on any rejected promises that it sees. So, I'd suggest you iterate its results and see if you have any rejected promises there.

You can do that like this:

const results = await Promise.allSettled(promises);
console.log(`results.length = ${results.length}`);
for (const r of results) {
    if (r.status === "rejected") {
        console.log(r.reason);
    }
}
console.log("all done");

This confirms that execution got past Promise.allSettled(promises), shows how many results came back, and logs the reason for every rejected promise.
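
The check above only helps if every download promise actually settles. One possible reason nothing is ever logged: `readable.pipe(writable)` does not forward errors from the source stream, so if a response stream fails partway through, the writer never emits "finish" or "error" and the promise stays pending forever. Here is a minimal sketch of one way to close that gap, assuming Node 15 or later for `stream/promises`. The helper name `saveStream` is hypothetical and not part of the question's code:

```javascript
const fs = require("fs");
const { pipeline } = require("stream/promises");

// Hypothetical helper: copies any readable stream into a file at destPath.
// pipeline() rejects if either the source or the file stream fails,
// and it destroys both streams on failure, so the promise always settles.
async function saveStream(readable, destPath) {
  await pipeline(readable, fs.createWriteStream(destPath));
  return destPath;
}
```

In the question's `downloadImage`, this could replace the manual `pipe`/`finish` wiring by calling `saveStream(response.data, targetPath)` inside the `.then()` callback. A failed or aborted response would then show up as a rejected entry in the `Promise.allSettled()` results.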

jfriend00
  • Thanks for your help. I abandoned this idea: I tried to download around 5k images from IPFS, but it's just impossible/too slow. Sadly, I am having more success with downloading them one by one and constantly retrying... BUT your syntax is wonderful and helped me improve a bit. Thanks, all the best, and happy holidays! – ThomasDEV Dec 22 '22 at 19:24
  • @ThomasDEV - Yeah, downloading 5k images at once is too much of a strain on either your local system or the host server or both. You are probably better off downloading a handful at a time (like 5-10). You can look at `mapConcurrent()` [here](https://stackoverflow.com/questions/46654265/promise-all-consumes-all-my-ram/46654592#46654592) which lets you download N at a time where you configure what you want N to be. – jfriend00 Dec 22 '22 at 20:20
  • Who knows what went wrong. After around 2 GB the network just dropped instantly, like in the screenshot above. My PC's RAM/CPU was fine (32 GB RAM), so that was not the issue. IPFS is just horrible to work with, really... it's so hard to download a whole folder, even their app couldn't do it – ThomasDEV Dec 23 '22 at 02:00
  • @ThomasDEV - When you say "network dropped" what exactly do you mean? Do you mean that all your connections to your host were closed and you couldn't make any more? If so, that's probably the target host rate limiting you and shutting you down from further requests because you're doing too many at once or too many too fast. – jfriend00 Dec 23 '22 at 03:01
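
The "handful at a time" idea from the comments can be sketched as a small worker pool. This is an illustrative version only, not the `mapConcurrent()` implementation from the linked answer. The name `mapWithLimit` and the shape of its results (mirroring `Promise.allSettled()`) are assumptions made for this example:

```javascript
// Hypothetical helper: runs fn over items with at most `limit` calls in flight.
// Resolves to an array of allSettled-style result objects, in input order,
// so one failed item does not stop the remaining ones.
async function mapWithLimit(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0; // index of the next item that has not been started yet

  // Each worker keeps pulling the next unstarted item until none are left.
  async function worker() {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { status: "fulfilled", value: await fn(items[i], i) };
      } catch (reason) {
        results[i] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With the question's code, a call such as `mapWithLimit(urlsArray, 5, ({ url, nr }) => downloadImage(url, nr))` would keep roughly five requests open at once instead of several thousand. The returned array can be inspected for `status === "rejected"` in the same way as the `Promise.allSettled()` results.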