
I am running into what seems to be a classic problem for Node.js beginners: sequencing async requests.

I have an unknown number of URLs generated by the user and stored in an array on my Node.js server. The server must iterate through these URLs, making a request to each one in turn. It must do so in order, waiting for each response before moving on to the next URL (at which point a new request is made). The final result should be the in-order collection of all the responses (which happen to be JSON), stored together as a single JSON object that can be sent back to the client when ready.

I think I should use the async library for Node.js, and I am already using needle to make the requests.

var URLs = ["http://a", "http://s", "http://d"];
async.eachSeries(URLs, function (URL, callback) { ..... });

I'm not clear how to use async to ensure that the Needle request has finished, and that the response is stored, before moving on to the next URL request. Below is an example of my Needle request.

 needle.get(URL, options, function(error, response, body){ ... });

Either a partial or complete solution to the whole problem is welcome.

jtromans

2 Answers


With promises you could do that with:

var Promise = require("bluebird");
// promisify needle.get with needle as the receiver; its multi-result
// callback (error, response, body) resolves to an array [response, body]
var get = Promise.promisify(needle.get, needle);

var URLs = ["http://a", "http://s", "http://d"];
var current = Promise.fulfilled();
Promise.map(URLs, function (URL) {
    // chain each request onto the previous one so they run in series,
    // while Promise.map keeps the per-URL promises in input order
    current = current.then(function () {
        return get(URL);
    });
    return current;
}).map(function (responseAndBody) {
    return JSON.parse(responseAndBody[1]);
}).then(function (results) {
    console.log(results);
}).catch(function (e) {
    console.error(e);
});

As a bonus, your server won't crash when a site returns invalid JSON or responds with an error message or empty body. Writing this by hand, you would need manual try/catch for both cases, but promises handle both kinds of errors in the `.catch()`. Since the URLs are supplied by the user, they could easily DoS your server if you forget those try/catches in non-promise code.
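For readers on newer Node versions: the same in-order behaviour can be sketched without bluebird, using native promises and `async`/`await`. This is a minimal sketch, not the answer's original approach; `getJSON` is a hypothetical stand-in for a promisified `needle.get` plus `JSON.parse`:

```javascript
// Resolve one async task per item, strictly in sequence,
// and return the results in the same order as the input.
async function mapSeries(items, task) {
  const results = [];
  for (const item of items) {
    // Each await finishes before the next request starts.
    results.push(await task(item));
  }
  return results;
}

// Usage sketch: getJSON is a hypothetical wrapper that fetches a URL
// and parses the JSON body; swap in your real request function.
// mapSeries(URLs, getJSON).then(function (results) { ... });
```

An error thrown by any request rejects the returned promise, so a single `.catch()` (or a try/catch around the await) covers the whole chain, just as in the bluebird version.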

Esailija
  • Thank you for offering this solution. I will give this a go. I'm marking Plato's response as 'correct' because it sticks to the libraries I mentioned in the question. Whether they are the best tool(s) for the job is surely up for debate. If I could mark both as correct, I would. – jtromans Oct 16 '13 at 07:36
  • @jtromans sure. Btw, if you cannot get the above to work as intended, please let me know. – Esailija Oct 16 '13 at 08:30
  • @Esailija any reason you have set `var current = Promise.fulfilled()` and used it in `Promise.map`? We could just `return get(URL)` in the first map function? – abhilash Mar 30 '16 at 06:57

Here are two examples: one that saves the results one by one with `async.eachSeries`, and one that collects all results with `async.mapSeries` and then saves them all at once.

var URLs = ["http://a", "http://s", "http://d"];

function iterator1(URL, done) {
  var options = {};
  needle.get(URL, options, function (error, response, body) {
    if (error) { return done(error); }
    processAndSaveInDB(body, function (err) {
      if (err) { return done(err); }
      done(null);
    });
  });
}

async.eachSeries(URLs, iterator1, function (err) {
  // final callback for async.eachSeries
  if (err) {
    console.log(err);
  } else {
    console.log('All Needle requests successful and saved');
  }
});

// Here is a similar technique using async.mapSeries; it may be more suitable
function iterator2(URL, done) {
  var options = {};
  needle.get(URL, options, function (error, response, body) {
    if (error) { return done(error); }
    done(null, body);
  });
}

async.mapSeries(URLs, iterator2, function (err, results) {
  // final callback for async.mapSeries
  // (note: there is no `done` in this scope; just log any save error)
  if (err) {
    console.log(err);
  } else {
    console.log('All Needle requests successful');
    // results is a 1-to-1, in-order mapping of URLs to needle bodies
    processAndSaveAllInDB(results, function (err) {
      if (err) { return console.log(err); }
      console.log('All Needle requests saved');
    });
  }
});

I'm not clear how to use async to ensure that the Needle request has finished, and that the response is stored, before moving on to the next URL request.

The series variants of the async functions take care of this; you just make sure not to call the `done` callback of your iterator function until you are ready to proceed. In practice, this means placing the call to `done` in your innermost callback (e.g. your Needle callback).
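To tie this back to the original goal of one combined JSON payload, the in-order `results` array from `async.mapSeries` still has to be merged before it is sent to the client. A small sketch; `combineResults` is a hypothetical helper, and keying the object by URL is just one possible shape:

```javascript
// Merge the in-order bodies from async.mapSeries into a single
// object keyed by URL. JSON.parse throws on invalid bodies, so
// callers should wrap this in try/catch before responding.
function combineResults(URLs, bodies) {
  var combined = {};
  bodies.forEach(function (body, i) {
    combined[URLs[i]] = JSON.parse(body);
  });
  return combined;
}
```

Inside the mapSeries final callback you would then send `combineResults(URLs, results)` back to the client (e.g. via an Express-style `res.json`) once there is no error.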

Plato
  • Upon rereading your question, the `mapSeries` solution is probably more useful; you would replace my dummy `processAndSaveAllInDB` with `concatenateJSONResults` (which probably can be synchronous), and then send your response – Plato Oct 15 '13 at 16:08
  • Tried both, and will go for the mapSeries option. Thank you very much. I have some other paradigms when I'm using reciprocal functions to achieve similar types of results, so this method is going to come in very useful moving forward. – jtromans Oct 16 '13 at 07:34