
I've been playing with jsdom, which is a module for node.js. The code at the bottom is from their documentation page. My problem is how to return something from asynchronous functions.

I know that this is a question that is asked a lot, probably by me as well. I also know that callbacks are a good friend when it comes to this kind of problem. My goal here is to find a workaround that acts like a cookie or a session variable in PHP, so that I can transfer that little bit of data to the outer scope, outside the asynchronous function. The data should then be accessible from the outer scope once it has been set.

First thing I want to know is:

  1. Is there already a way to store data somewhere, like a cookie or session, that exists in the outer scope and is accessible once I've done what I had to do?
  2. If I were to write the data to a file at point B in the code and read it at point C, wouldn't I have to write some sort of timeout function to wait a few seconds before reading the file? In my experience with asynchronous functions in Node.js, I have sometimes had to wait a few seconds for the write to finish before trying to read. Would that be the case here too? If so, wouldn't the same wait be needed even if I saved the data in memory?
  3. If I were to write a C++ plugin for this purpose that acted as a separate data bay, where we could save(data) to memory at point B and retrieve(data) from memory at point C, would this work?

Honestly, I do not like writing temporary files to work around asynchronous functions. I have been looking for a simple yet effective way to pass data around, but I need some guidance from experienced programmers like you to steer clear of the unnecessary approaches to this problem.

If you could toss around some ideas for me, stating what might work and what might not work, I'd appreciate it.

Here's the example code:

// Print all of the news items on hackernews
var jsdom = require("jsdom");
// var result; 
// A) Outer Scope: Since the function is async in done, storing the result here and echoing in point C is pointless.
jsdom.env({
  html: "http://news.ycombinator.com/",
  scripts: ["http://code.jquery.com/jquery.js"],
  done: function (errors, window) {
    var $ = window.$;
    console.log("HN Links");
    $("td.title:not(:last) a").each(function() {
      console.log(" -", $(this).text());
    });
    // B) let's say I want to return something I've scavenged here.
    // result = $("a");
  }
});
// C) 
// console.log(result)
Logan
  • Synchronous world: the waiter takes your order, goes to the kitchen, waits for your order, brings it to you, and goes to take another order. Asynchronous world: the waiter takes your order, goes to the kitchen, goes to take other orders, and brings your order to you when it's ready. – Florian Margaine May 26 '13 at 08:08

2 Answers


You need to clear your head of the synchronous habit of assuming that code lower in the file happens later in time. In node, that is not necessarily true. Here's the deal. In node, you place orders like at a restaurant, and you don't do it like this:

1. Order a salad
2. Wait 11 minutes
3. Eat the salad

You do it like this:

1. Order a salad
2. Wait for the server to serve you the salad
3. Eat the salad

The first example is a race condition and a terrible bug in your program: either the salad sits around waiting to be eaten for no reason, or you try to eat a salad that isn't there yet.

Don't think "I want to return something here", think "this data is ready". So you can have:

function eatSalad() {...}
placeOrder("salad", eatSalad);

Where eatSalad is the callback for the placeOrder routine, which does the necessary I/O to get the salad. Notice how even though eatSalad is earlier in the file, it happens later chronologically. You don't return things, you invoke callbacks with data you have prepared.
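
To make the timing concrete, here is a minimal runnable sketch of the same idea. placeOrder and eatSalad are the names from the text above; the body of placeOrder is an assumption that simulates the kitchen's I/O with setTimeout, and the 50 ms delay is arbitrary.

```javascript
// Records the order in which things happen, so the timing is visible.
var events = [];

// Hypothetical stand-in for real I/O: the "kitchen" takes some time,
// simulated here with setTimeout (the 50 ms delay is arbitrary).
function placeOrder(item, callback) {
  events.push("ordered " + item);
  setTimeout(function () {
    // Node convention: first argument is the error, second is the data.
    callback(null, item);
  }, 50);
}

// Runs only once the data is ready, no matter where it sits in the file.
function eatSalad(error, salad) {
  if (error) {
    events.push("order failed");
    return;
  }
  events.push("eating " + salad);
  console.log(events.join(" -> "));
}

placeOrder("salad", eatSalad);
// This line runs before eatSalad, even though it appears after the call.
events.push("chatting while waiting");
```

Running this prints `ordered salad -> chatting while waiting -> eating salad`: the line after the placeOrder call executes first, and eatSalad runs only when the simulated I/O completes. No timeout guess is needed on the caller's side.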

Here's your snippet reworked to hand the result to a callback.

// Print all of the news items on hackernews
var jsdom = require("jsdom");
// var result; 
// A) Outer Scope: Since the function is async in done, storing the result here and echoing in point C is pointless.
jsdom.env({
  html: "http://news.ycombinator.com/",
  scripts: ["http://code.jquery.com/jquery.js"],
  done: function (errors, window) {
    var $ = window.$;
    console.log("HN Links");
    $("td.title:not(:last) a").each(function() {
      console.log(" -", $(this).text());
    });
    // B) let's say I want to return something I've scavenged here.
    // result = $("a");
    resultIsReady($("a"));
  }
});

function resultIsReady(element) {
    console.log(element);
}

EDIT TO ADD: To answer your question from the comments, node code is generally built up not from functions that return things, but from functions that invoke callback functions with their "return value". The return keyword is only used to actually return a value from in-memory code that does not do any I/O. So finding the mean of an in-memory array can just return it, but finding the mean of a database result set must invoke a callback function. The basic paradigm is to build up your programs from functions like this (pseudocode DB library):

function getUser(email, callback) {
    db.users.where({email: email}, function (error, user) {
        if (error) {
            //This return statement is just for early termination
            //of the function. The value returned is discarded
            return callback(error);
        }
        callback(null, user);
    });
}

So that's how you do things. Typically, functions like this make a very limited number of I/O calls (1 or 2 is common; beyond that you start falling into nesting hell and need to refactor).
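
As a rough illustration of how small callback-based functions compose into a bigger one, here is a runnable sketch. The pseudocode DB library is replaced by an in-memory stand-in: fakeUsers, findByEmail, getGreeting, and the sample record are all invented for this example and are not part of any real library.

```javascript
// In-memory stand-in for the pseudocode DB library above (an assumption,
// not a real library). setImmediate defers the callback so it arrives
// asynchronously, the way a real I/O result would.
var fakeUsers = {
  "ada@example.com": { name: "Ada", email: "ada@example.com" }
};

function findByEmail(email, callback) {
  setImmediate(function () {
    var user = fakeUsers[email];
    if (!user) {
      return callback(new Error("no user with email " + email));
    }
    callback(null, user);
  });
}

// Same shape as getUser in the text: pass the error or the data along.
function getUser(email, callback) {
  findByEmail(email, function (error, user) {
    if (error) {
      return callback(error);
    }
    callback(null, user);
  });
}

// A higher-level function built from getUser, still callback-based.
function getGreeting(email, callback) {
  getUser(email, function (error, user) {
    if (error) {
      return callback(error);
    }
    callback(null, "Hello, " + user.name);
  });
}

getGreeting("ada@example.com", function (error, greeting) {
  if (error) {
    return console.error(error.message);
  }
  console.log(greeting);
});
```

Each function does one small piece of work and hands its result to the next through a callback, so the "bigger" function coordinates the smaller ones without any of them returning data directly.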

I personally write lots of functions like that and then use the async library to describe the higher-order sequence of things that needs to happen. There are lots of other popular flow control libraries as well, and some people like the promises pattern. However, at the moment some of the core node community members seem to be advocating callbacks as the one true way and promises seem to be falling out of favor.
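
For comparison, this is roughly what the promises pattern looks like when wrapped around a callback-style function. It is only a sketch: fetchTitle and fetchTitlePromise are hypothetical names, the I/O is simulated with setTimeout, and it assumes a Node version that provides the standard Promise constructor, which postdates this answer.

```javascript
// Hypothetical callback-style function; setTimeout simulates the I/O.
function fetchTitle(url, callback) {
  setTimeout(function () {
    if (!url) {
      return callback(new Error("url is required"));
    }
    callback(null, "Title of " + url);
  }, 10);
}

// Wraps the callback API in a promise: errors reject, data resolves.
function fetchTitlePromise(url) {
  return new Promise(function (resolve, reject) {
    fetchTitle(url, function (error, title) {
      if (error) {
        return reject(error);
      }
      resolve(title);
    });
  });
}

fetchTitlePromise("http://news.ycombinator.com/")
  .then(function (title) {
    console.log(title);
  })
  .catch(function (error) {
    console.error(error.message);
  });
```

Either way the rule is the same: the result is consumed inside a function that runs when the data is ready, not on the line after the call.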

Peter Lyons
  • Thanks for the great answer! It has shown me why people end up with a tangle of functions nested within functions... I guess I was thinking more like PHP or C#, where I could *centralize* more. What I mean is writing lots of different small functions which return something, and which are then used by a function *higher in the food chain* that controls the smaller ones and returns a collection of results. What is the name of this approach? And should I stay away from that form of thinking in nodejs? Any advice? – Logan May 26 '13 at 14:49
  • Thanks again for the update. Your response is very dense and informative. It has given me the basic understanding of code structure which I should plan for in Node.js. However, your final statement has made me curious. What is the general opposing view towards callbacks? Why does it seem to be falling out of favor? Any sources you can provide would help me with my research. – Logan May 26 '13 at 22:43
  • That's a very topical cultural thing. I think for the most part it is because the core node API as designed by Ryan Dahl is based on callbacks and does not make use of promises. Promises became very popular when jQuery started using them, so they are definitely pretty widely used in node as well, but I think some of the node thought leadership has decided they are more of a fad or personal choice and create unnecessary inconsistency with the core node API, so callbacks now seem to be preferred. This is just my take on the news though. – Peter Lyons May 27 '13 at 06:41

Avoid synchronous code anywhere it might block execution, such as database operations, file I/O, network data retrieval, or long calculations. In your example, invoke a callback when your computation finishes so that execution continues from there. You could also take a look at the async library (https://npmjs.org/package/async), which is the de facto standard for hairier call sequences:

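    // Assumes `res` is an HTTP response object already in scope (e.g. inside
    // an Express route handler); it is not defined in this snippet.
    function sendData(result) {
        res.json(result);
    }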

    var jsdom = require("jsdom");
    // var result;
    // A) Outer Scope: Since the function is async in done, storing the result here 
    // and echoing in point C is pointless.
    jsdom.env({
        html: "http://news.ycombinator.com/",
        scripts: ["http://code.jquery.com/jquery.js"],
        done: function (errors, window) {
            var $ = window.$;
            console.log("HN Links");
            $("td.title:not(:last) a").each(function () {
                console.log(" -", $(this).text());
            });
            // B) let's say I want to return something I've scavenged here.
            var result = $("a");
            sendData(result);
        }
    });
slobodan.blazeski
  • Thank you so much, your solution really helped me on this one: http://stackoverflow.com/questions/21581255/return-variable-outside-of-a-function-scope-in-javascript – Aerodynamika Feb 05 '14 at 17:32