
Here is the situation: I am new to Node.js. I have a 40MB file containing multilevel JSON like [{},{},{}], that is, an array of about 7000 objects. Each object has properties, and one of those properties is also an array of objects.

I wrote a function to read the content of the file and iterate over it. I got what I wanted in terms of content, but not usability. I thought I had written an async function that would allow Node to serve other web requests while iterating the array, but that is not the case. I would be very thankful if anyone could point out what I've done wrong and how to rewrite it so that the iteration is non-blocking. Here's the function that handles the situation:

    function getContents(callback) {

        fs.readFile(file, 'utf8', function (err, data) {

            if (err) {
                console.log('Error: ' + err);
                return;
            }

            js = JSON.parse(data);
            callback();
            return;
        });
    }

    getContents(iterateGlobalArr);

    var count = 0;
    function iterateGlobalArr() {

        if (count < js.length) {

            innerArr = js[count].nestedProp;

            //iterate nutrients
            innerArr.forEach(function(e, index) {

                //some simple if condition here

            });

            var schema = {
                //.....get props from forEach iteration
            }

            Model.create(schema, function(err, post) {

                if(err) {
                    console.log('\ncreation error\n', err);
                    return;
                }

                if (!post) {
                    console.log('\nfailed to create post for schema:\n' + schema);
                    return;
                }
            });

            count++;
            process.nextTick(iterateGlobalArr);
        }
        else {
            console.log("\nIteration finished");
            next();
        }
    }

To be clear about how I tested this: I open two tabs, one loading this iteration, which takes some time, and a second one with another Node route, which does not load until the iteration is over. So I have essentially written blocking code, but I am not sure how to refactor it. I suspect that because everything happens in the callback, I never release the event loop to handle another request.

sunlover
  • [this](http://stackoverflow.com/questions/25191520/async-in-node-js) might help you. – Olimpiu POP Sep 12 '14 at 14:15
  • possible duplicate of [How to write asynchronous functions for Node.js](http://stackoverflow.com/questions/6898779/how-to-write-asynchronous-functions-for-node-js) – fmsf Sep 12 '14 at 14:44

2 Answers

Your code is almost correct. What you are doing is inadvertently adding ALL the items to the very next tick... which still blocks.

The important piece of code is here:

    Model.create(schema, function(err, post) {

        if(err) {
            console.log('\ncreation error\n', err);
            return;
        }

        if (!post) {
            console.log('\nfailed to create post for schema:\n' + schema);
            return;
        }
    });

    // add EVERYTHING to the very same next tick!
    count++;
    process.nextTick(iterateGlobalArr);

Let's say you are in tick A of the event loop when getContents() runs and count is 0. You enter iterateGlobalArr and call Model.create. Because Model.create is async, it returns immediately, and process.nextTick() then queues the processing of item 1 for the next tick, let's say B. When that runs, iterateGlobalArr does the same thing and queues item 2, which still ends up in B because Node drains the whole nextTick queue before it goes back to handling I/O. Then item 3, and so on.
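The effect can be reproduced without a database. The following is only a minimal sketch: `order` and `step` are made-up names, and the zero-delay timer stands in for another request waiting on the event loop. It assumes nothing beyond `process.nextTick` and `setTimeout`.

```javascript
// Illustrative names only: `order` and `step` are not part of the question's code.
var order = [];

// Stands in for "another web request" waiting for its turn on the event loop.
setTimeout(function () {
    order.push('timer');
}, 0);

function step(i) {
    if (i >= 5) {
        return;
    }
    order.push('item ' + i);
    // Re-scheduling with nextTick keeps the nextTick queue non-empty,
    // so Node keeps draining it before it returns to timers or I/O.
    process.nextTick(function () {
        step(i + 1);
    });
}

step(0);
```

All five items are recorded before the timer callback gets a chance to run, which is the same starvation the second browser tab runs into.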

What you need to do is move the count increment and process.nextTick() into the callback of Model.create(). This makes sure the current item has been processed before nextTick is invoked, so the next item is only queued AFTER the model item has been created, which gives your app time to handle other things in between. The fixed version of iterateGlobalArr is here:

    function iterateGlobalArr() {

        if (count < js.length) {

            // look at the nested array of the current item
            innerArr = js[count].nestedProp;

            //iterate nutrients
            innerArr.forEach(function(e, index) {

                //some simple if condition here

            });

            var schema = {
                //.....get props from forEach iteration
            }

            Model.create(schema, function(err, post) {

                // schedule our next item to be processed immediately.
                count++;
                process.nextTick(iterateGlobalArr);

                // then move on to handling this result.
                if(err) {
                    console.log('\ncreation error\n', err);
                    return;
                }

                if (!post) {
                    console.log('\nfailed to create post for schema:\n' + schema);
                    return;
                }
            });
        }
        else {
            console.log("\nIteration finished");
            next();
        }
    }

Note also that I would strongly suggest that you pass in your js and counter with each call to iterateGlobalArr, as it will make your iterateGlobalArr a lot easier to debug, among other things, but that's another story.
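As a rough sketch of that suggestion, the array and the index can travel as arguments instead of living in globals. Everything named here is hypothetical: `fakeCreate` is a stand-in for Model.create that simply completes on a later turn of the event loop, and `iterateArr`, `log` and the sample items are made up for illustration.

```javascript
// Hypothetical stand-in for Model.create: completes asynchronously and
// reports the "saved" document through a Node-style callback.
function fakeCreate(doc, callback) {
    setImmediate(function () {
        callback(null, doc);
    });
}

var log = [];

// The array and the index are passed in rather than read from globals.
function iterateArr(items, index, done) {
    if (index >= items.length) {
        return done();
    }

    fakeCreate({ name: items[index].name }, function (err, post) {
        if (err) {
            console.log('creation error', err);
        } else {
            log.push('created ' + post.name);
        }
        // Queue the next item only once this one has completed.
        process.nextTick(function () {
            iterateArr(items, index + 1, done);
        });
    });
}

iterateArr([{ name: 'a' }, { name: 'b' }, { name: 'c' }], 0, function () {
    log.push('finished');
});

// Stands in for another request arriving while the iteration is running.
setImmediate(function () {
    log.push('other request');
});
```

With this stub, 'other request' is logged between 'created a' and 'created b', so other work gets a turn between items. Because the function no longer depends on outer state, it can also be called with a small test array.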

Cheers!

JayKuri

Node is single-threaded so async will only help you if you are relying on another system/subsystem to do the work (a shell script, external database, web service etc). If you have to do the work in Node you are going to block while you do it.
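A small sketch of that blocking behaviour, with made-up names (`busyWork`, `timerDelayMs`) and an arbitrary 50 ms of CPU work: the zero-delay timer cannot fire until the synchronous loop has finished.

```javascript
var start = Date.now();
var timerDelayMs = null;

// Stands in for another request waiting for its turn on the event loop.
setTimeout(function () {
    timerDelayMs = Date.now() - start;
    console.log('timer ran after ' + timerDelayMs + ' ms');
}, 0);

// Synchronous CPU work: nothing else can run until this loop exits.
function busyWork(durationMs) {
    var spins = 0;
    var begin = Date.now();
    while (Date.now() - begin < durationMs) {
        spins++;
    }
    return spins;
}

busyWork(50);
```

The timer was requested with a delay of 0, yet it only runs once busyWork has returned.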

It is possible to create one Node process per core. Only one of those processes would then block, leaving the rest to service your requests. When this answer was written (2014), the feature was still listed as experimental: http://nodejs.org/api/cluster.html. From the documentation:

> A single instance of Node runs in a single thread. To take advantage of multi-core systems the user will sometimes want to launch a cluster of Node processes to handle the load.
>
> The cluster module allows you to easily create child processes that all share server ports.
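A minimal sketch of that setup, assuming the built-in cluster, http and os modules. The function name `startCluster` and port 8000 are made up for illustration, and the call is left commented out so that nothing is forked by accident.

```javascript
var cluster = require('cluster');
var http = require('http');
var os = require('os');

// Hypothetical helper: forks one worker per CPU core, and each worker
// runs its own HTTP server on the shared port.
function startCluster(port) {
    // Newer Node releases expose `isPrimary`; older ones only have `isMaster`.
    var isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;

    if (isPrimary) {
        os.cpus().forEach(function () {
            cluster.fork();
        });
        cluster.on('exit', function (worker) {
            console.log('worker ' + worker.process.pid + ' exited');
        });
    } else {
        http.createServer(function (req, res) {
            res.writeHead(200);
            res.end('handled by worker ' + process.pid + '\n');
        }).listen(port);
    }
}

// Usage (port chosen arbitrarily):
// startCluster(8000);
```

A long-running iteration would then tie up only the worker that received that request, while the other workers keep accepting connections.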

pherris