
I have a tiny server I'm trying to fetch a bunch of files from; there are ~100 to fetch. I'm fetching the files in JS with a forEach loop. Unfortunately, firing all ~100 requests at once knocks the server over.

    const fs = require("fs");
    const http = require("http");

    links.forEach((link) => {
        const name = link.split("/")[5];
        const file = fs.createWriteStream(name);
        http.get(link, (res) => {
            res.pipe(file);
            file.on("finish", () => {
                file.close();
            });
        });
    });

I'm trying to write synchronous, blocking JavaScript: I want each file write to complete before the next fetch begins. I've been experimenting with generators, while loops, and fs.writeFileSync. I've also been using setTimeout to emulate a slow network call (just so I don't have to reset the server after knocking it over each time).

It seems like I'm missing some concept. Here's my very naive experiment that I thought should take 3 seconds, but it only takes 1. After thinking about it, it's clear all the timeouts are happening at once.

    function writeSlow(path) {
        setTimeout(function () {
            console.log("write now");
            fs.writeFileSync(path, "lorem");
        }, 1000);
    }
    writeSlow("junk/one");
    writeSlow("junk/two");
    writeSlow("junk/three");

I did some reading and was convinced using Promises was the way to go, but this doesn't appear to work either:

    function sleep(ms) {
        return new Promise((resolve) => setTimeout(resolve, ms));
    }

    async function run() {
        arr.forEach(async (str) => {
            await sleep(1000);
            fs.writeFileSync("junk/" + str + ".txt", "lorem");
        });
    }

My expectation with the experimental code, or what I'm trying to get to, is being able to watch the filesystem and see a new file appear every second.

(edit) The actual end result I'm looking for is for the next http request to only fire once the last one completes.

icicleking
  • To my understanding, `setTimeout` delays the execution, but it will delay all the calls, so they will all happen after 1 second. – A.L Jun 25 '23 at 13:03
  • The HTTP requests are asynchronous; that `.get()` call will not invoke the callback function until the network response is received. The best you can do is use `writeFileSync()` in the callback. – Pointy Jun 25 '23 at 13:10
  • @A.L That's a good explanation; the link you point to creates increasingly longer setTimeouts. While that fits the expectation I wrote for the experimentation, what I'm really looking for is for the next fetch to occur after the last one completes. I'll update the question – icicleking Jun 25 '23 at 13:16
  • *"I'm trying to write synchronous blocking JavaScript"*: that's the wrong target. Instead, embrace asynchronous coding patterns. – trincot Jun 25 '23 at 13:17
  • @trincot as you can read in the post, asynchronous requests knock over the tiny server I'm requesting. – icicleking Jun 25 '23 at 13:25
  • @subdeveloper This looks very promising. I'll update if I get it to work – icicleking Jun 25 '23 at 13:26
  • Using `async` and `await` makes it somewhat easier to "schedule" async stuff the way you want: either "launch everything" or one-by-one. – Pointy Jun 25 '23 at 13:31

2 Answers


You could write an asynchronous loop:

    function loop(links, i) {
        if (i >= links.length) return; // all done
        const link = links[i];

        const name = link.split("/")[5];
        const file = fs.createWriteStream(name);
        http.get(link, (res) => {
            res.pipe(file);
            file.on("finish", () => {
                file.close();
                loop(links, i + 1); // continue with the next link
            });
        });
    }
    // Start the asynchronous loop:
    loop(links, 0);

Alternatively, you could switch to promise-based libraries, or else promisify the callback-based functions you have, like so (not tested):

    // Promisify your callback-based functions
    const asyncWrite = (name, res) => new Promise((resolve) => {
        const file = fs.createWriteStream(name);
        res.pipe(file);
        file.on("finish", () => {
            file.close();
            resolve();
        });
    });

    const asyncHttpGet = (link) => new Promise((resolve) =>
        http.get(link, resolve)
    );

    // ... and use them:
    async function writeAllLinks(links) {
        for (const link of links) {
            await asyncWrite(link.split("/")[5], await asyncHttpGet(link));
        }
    }
    // Start the asynchronous loop:
    writeAllLinks(links).then( /* .... */ );

You would probably want to add error handling...
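For instance, here is an equally untested sketch of the same helpers with rejection wired up: `error` events on the write stream and on the request reject the promise, so one `catch` at the call site covers every sequential download.

    // Same promisified helpers, now rejecting on failure
    const asyncWrite = (name, res) => new Promise((resolve, reject) => {
        const file = fs.createWriteStream(name);
        res.pipe(file);
        file.on("finish", () => {
            file.close();
            resolve();
        });
        file.on("error", reject); // e.g. bad path, disk full
    });

    const asyncHttpGet = (link) => new Promise((resolve, reject) =>
        http.get(link, resolve).on("error", reject) // e.g. DNS failure, connection refused
    );

    // One catch now covers the whole sequential run:
    writeAllLinks(links).catch((err) => console.error("Download failed:", err));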

trincot

forEach doesn't behave the way you might think it does with async/await. Take a look at this post on SO for more information: Using async/await with a forEach loop

Instead, in your case (as shown above), you can use a standard for loop:

    function sleep(ms, fileName) {
        return new Promise((resolve) => setTimeout(() => resolve(fileName), ms));
    }

    const files = ["file1", "file2", "file3"];

    async function run() {
        for (let i = 0; i < files.length; i++) {
            const fileName = await sleep(1000, files[i]);
            console.log(fileName, new Date().toLocaleTimeString());
        }
    }

    run();
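Applied back to the question's actual task, the same pattern might look something like this (an untested sketch, assuming Node's built-in `fs` and `http` modules and the `link.split("/")[5]` naming from the question):

    const fs = require("fs");
    const http = require("http");

    // Resolve only once the write stream has finished, so the next
    // request doesn't fire until the previous file is completely written.
    function download(link) {
        return new Promise((resolve, reject) => {
            const name = link.split("/")[5];
            const file = fs.createWriteStream(name);
            http.get(link, (res) => {
                res.pipe(file);
                file.on("finish", () => {
                    file.close();
                    resolve(name);
                });
            }).on("error", reject);
        });
    }

    async function run(links) {
        for (const link of links) {
            console.log("saved", await download(link)); // strictly one at a time
        }
    }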
Brandon