1

I'm using NodeJS to walk over a list of files and generate an MD5 hash for each one. Here's how I would normally do this synchronously:

// Assume files is already populated with an array of file objects
for(file in files) {
   var currentFile = files[file];
   currentFile.md5 = md5(currentFile.path);
}

The problem here is that the MD5 function is asynchronous: it takes a callback function that runs once the MD5 hash has been generated for the file. Thus, all of my currentFile.md5 properties are just going to be set to undefined.

Once I have gotten all of the MD5 hashes for all of the files I'll need to move onto another function to deal with that information.

How gnarly is the code going to get in order for me to do this asynchronously? What's the cleanest way to accomplish what I want to do? Are there common different approaches that I should be aware of?

Kirk Ouimet

4 Answers

2

To call an async function multiple times in sequence, wrap the call in a function and have that function call itself recursively from the callback, like this.

I'm assuming your md5 function takes a callback with two parameters, err and result.

var keys = Object.keys(files); // collect all keys of files in an array

function fn() {
    var currentFile = files[keys.shift()];
    md5(currentFile, function (err, result) {
        // Use result, store somewhere

        // check if there are more files
        if (keys.length) {
            fn();
        } else {
            // done
        }
    });
}

fn(); // start with the first file
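One way the "store somewhere" and "done" placeholders could be filled in is sketched below. This is only an illustration under a few assumptions: files is an array of objects with a path property, the hash function has the signature hashFn(path, callback(err, result)), and the names hashSequentially, hashFn, and done are made up for this example.

```javascript
// Hashes files one at a time, then calls done(err, files) exactly once.
// hashFn is assumed to look like hashFn(path, callback(err, result)).
function hashSequentially(files, hashFn, done) {
  var index = 0;

  function next() {
    // No files left: hand everything to the final callback.
    if (index >= files.length) return done(null, files);

    var currentFile = files[index];
    index += 1;

    hashFn(currentFile.path, function (err, result) {
      if (err) return done(err);  // stop at the first failure
      currentFile.md5 = result;   // "store somewhere": on the file object
      next();                     // move on only after this hash has arrived
    });
  }

  next();
}
```

It would be called as hashSequentially(files, md5, function (err, files) { ... }), so whatever needs all the hashes goes inside that final callback.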
Salman
  • Nice solution, but any node.js answer with a "store somewhere" comment and a "done" comment doesn't help OP, who asked "Need Help _Thinking_ How to Program Asynchronously". It's the whole "store somewhere" and "done" things that are his problems! With a few modifications you could make your good answer into a great one. It's the same with a lot of other node demo code which calls console.log(data) instead of doing something _real_ with the data. It's really annoying. – user949300 Jan 23 '14 at 17:00
  • Well, the OP asked about asynchronous nature of node.js, and the answer I gave specifically emphasizes on the same. Mentioning repetitive things in the answers is something that you should avoid, there are dozens of answers which show how to write to file or store in db. and if he has any concern regarding the code, he is free to ask further. Remember, SO is just for helping people go into right direction. PS: If that's what you think, you shouldn't have given the pure "Algorithmic" answer. – Salman Jan 23 '14 at 17:09
1

One great approach is to use the async library (search for "async" on npm).

If you want to roll your own:

  1. Count the files, put that in a var
  2. Every time fs opens a file and calls your intermediate callback, compute and store the MD5
  3. Also, decrement that counter.
  4. When counter === 0, call a "final" callback, passing back all the MD5s.
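The four steps above can be sketched roughly as follows. The md5 function here is a hypothetical stand-in built on Node's crypto and fs modules, since the asker's actual md5 is not shown; hashAll, hashFn, and done are likewise names invented for this example, and files is assumed to be an array of objects with a path property.

```javascript
const crypto = require('crypto');
const fs = require('fs');

// Hypothetical stand-in for the asker's md5(path, callback) function.
function md5(path, callback) {
  fs.readFile(path, function (err, data) {
    if (err) return callback(err);
    callback(null, crypto.createHash('md5').update(data).digest('hex'));
  });
}

// Starts every hash at once and calls done(err, files) exactly once.
function hashAll(files, hashFn, done) {
  var pending = files.length;  // step 1: count the files
  var failed = false;
  if (pending === 0) return done(null, files);

  files.forEach(function (file) {
    hashFn(file.path, function (err, hash) {
      if (failed) return;      // an earlier error was already reported
      if (err) { failed = true; return done(err); }
      file.md5 = hash;         // step 2: store the MD5
      pending -= 1;            // step 3: decrement the counter
      if (pending === 0) done(null, files);  // step 4: final callback
    });
  });
}
```

A call would look like hashAll(files, md5, function (err, files) { ... }). Unlike a sequential loop, all reads are started up front, so the callbacks may arrive in any order; the counter is what tells you when the last one has finished.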
user949300
1

To answer your questions (theoretically): in the JavaScript world, there are (at the moment) two different ways to deal with asynchronous code:

  • Using callbacks. This is the most basic approach and the one most people start with in JavaScript. There are plenty of libraries, such as async and step, that make callbacks less painful to work with. For your particular problem, assuming that md5 really is asynchronous, you can use https://github.com/caolan/async#parallel to achieve it.

  • Using promises. There are also plenty of promise-compliant libraries, such as q and when. With a promise you get a nicer way to organize your code flow (IMO). For the problem above, you can use when.all to gather the results of the md5 calls. However, you first need to turn md5 into a promise-returning function.
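As a rough sketch of the promise approach, the example below uses the Promise built into current JavaScript engines in place of q or when, with Promise.all playing the role that when.all plays above. The names promisify and hashAllWithPromises are invented for this example, and md5 is assumed to have the signature md5(path, callback(err, result)).

```javascript
// Wraps a node-style function fn(arg, callback) so it returns a promise.
function promisify(fn) {
  return function (arg) {
    return new Promise(function (resolve, reject) {
      fn(arg, function (err, result) {
        if (err) reject(err);
        else resolve(result);
      });
    });
  };
}

// Starts all hashes in parallel. The returned promise resolves with the
// files once every md5 property is set, or rejects on the first error.
function hashAllWithPromises(files, md5) {
  var md5Promise = promisify(md5);
  return Promise.all(files.map(function (file) {
    return md5Promise(file.path).then(function (hash) {
      file.md5 = hash;
      return file;
    });
  }));
}
```

The follow-up work then goes into a .then() handler, for example hashAllWithPromises(files, md5).then(function (files) { ... }), with a .catch() to handle a failed hash.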

Tan Nguyen
1

To avoid "callback hell" you should introduce the world of promises to your Node toolset. I suggest q: https://npmjs.org/package/q

Here is a post on SO that can help and give you an idea of the syntax: how to use q.js promises to work with multiple asynchronous operations.

Essentially, you would run all your async functions with deferred promises. The chained .then() method fires when all promises are resolved, and the function passed to then() can process your MD5'd data.

I hope this helps.

Community
Draculater