The following piece of code creates a text file and then reads it, overwrites it, and reads it again. Except for the creation of the file, the three I/O operations are performed using the asynchronous Node.js functions readFile and writeFile.

I don't understand why the first read is returning no error but no data either. The output of this code is:

  • Starting...
  • Done.
  • first read returned EMPTY data!
  • write finished OK
  • second read returned data: updated text

Even if the operations ran in an arbitrary order (because they are async), I would NOT have expected to get an "empty data" result.

Any ideas why I am getting empty data (and no error) when reading the file?

Is there anything that I can do to make sure the file content is read?

var fs = require('fs');
var fileName = __dirname + '/test.txt';

// Create the test file (this is sync on purpose)
fs.writeFileSync(fileName, 'initial test text', 'utf8');


console.log("Starting...");

// Read async
fs.readFile(fileName, 'utf8', function(err, data) {
    var msg = "";
    if(err)
        console.log("first read returned error: ", err);
    else {
        if (data === null) 
            console.log("first read returned NULL data!");
        else if (data === "") 
            console.log("first read returned EMPTY data!");
        else
            console.log("first read returned data: ", data);
    }
});


// Write async
fs.writeFile(fileName, 'updated text', 'utf8', function(err) {
    var msg = "";
    if(err)
        console.log("write finished with error: ", err);
    else
        console.log("write finished OK");
});


// Read async
fs.readFile(fileName, 'utf8', function(err, data) {
    var msg = "";
    if(err)
        console.log("second read returned error: ", err);
    else
        if (data === null) 
            console.log("second read returned NULL data!");
        else if (data === "") 
            console.log("second read returned EMPTY data!");
        else
            console.log("second read returned data: ", data);
});


console.log("Done.");
Hector Correa

2 Answers


Your code is asking for race conditions. Your first sync write finishes before anything else runs, but your first read, second write, and second read are then all put onto the event loop at essentially the same time, and nothing forces one to finish before the next starts.

What could have happened here? First read gets read permission from the filesystem, second write gets write permission from the filesystem and immediately zeroes the file for future updating, then the first read reads the now empty file. Then the second write starts writing data and the second read doesn't get read permission until it's done.
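The truncation step described above can be reproduced deterministically with the synchronous API. This is only a sketch of the mechanism, not a copy of what fs.writeFile does internally, and the scratch file name is made up for the demo: opening a file with the 'w' flag empties it immediately, so any read that lands between the open and the actual write sees an empty file and no error.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file in the OS temp directory (name is an assumption).
const demoFile = path.join(os.tmpdir(), 'truncate-demo-' + process.pid + '.txt');

fs.writeFileSync(demoFile, 'initial test text', 'utf8');

// Opening with the 'w' flag truncates the file right away,
// before a single byte of the new content has been written.
const fd = fs.openSync(demoFile, 'w');
const seenBeforeWrite = fs.readFileSync(demoFile, 'utf8');

// Only now does the new content arrive.
fs.writeSync(fd, 'updated text');
fs.closeSync(fd);
const seenAfterWrite = fs.readFileSync(demoFile, 'utf8');

console.log(JSON.stringify(seenBeforeWrite)); // ""
console.log(JSON.stringify(seenAfterWrite));  // "updated text"

fs.unlinkSync(demoFile);
```

In the question's code the same window exists, except that the read and the write are both in flight at once, so whether the read falls inside the window depends on timing.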

If you want to avoid this, you need to start each operation only from the callback of the previous one:

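var fs = require('fs');
var filename = __dirname + '/test.txt';

fs.writeFileSync(filename, 'initial', 'utf8');
// Each operation starts only after the previous one has called back.
fs.readFile(filename, 'utf8', function(err, data) {
    if (err) throw err;
    console.log(data); // 'initial'
    fs.writeFile(filename, 'text', 'utf8', function(err) {
        if (err) throw err;
        fs.readFile(filename, 'utf8', function(err, data) {
            if (err) throw err;
            console.log(data); // 'text'
        });
    });
});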

If that "pyramid" insults your programming sensibilities (why wouldn't it?) use the async library's series function:

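var fs = require('fs');
var async = require('async'); // third-party: npm install async
var filename = __dirname + '/test.txt';

fs.writeFileSync(filename, 'initial', 'utf8');
// async.series runs each function only after the previous one calls back.
async.series([
    function(callback) {
        fs.readFile(filename, 'utf8', callback);
    },
    function(callback) {
        fs.writeFile(filename, 'text', 'utf8', callback);
    },
    function(callback) {
        fs.readFile(filename, 'utf8', callback);
    }
], function(err, results) {
    if(err) return console.log(err);
    // writeFile passes no value to its callback, so its slot is undefined.
    console.log(results); // Should be: ['initial', undefined, 'text']
});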

EDIT: More compact, but also more "magical" to people not familiar with the async library and modern JavaScript features:

fs.writeFileSync(filename, 'initial', 'utf8');
async.series([
    fs.readFile.bind(this, filename, 'utf8'),
    fs.writeFile.bind(this, filename, 'text', 'utf8'),
    fs.readFile.bind(this, filename, 'utf8'),
], function(err, results) {
    if(err) console.log(err);
    console.log(results); // Should be: ['initial', undefined, 'text']
});

EDIT2: Serves me right for making that edit without looking up the definition of bind. The first parameter needs to be the this object (or whatever you want to use as this).
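On Node versions that ship the promise-based fs.promises API, the same strict ordering can also be written without callbacks or the async library. This is a minimal sketch under that assumption; the helper name readWriteRead and the scratch file in the temp directory are invented for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: each await finishes before the next operation starts,
// so the read can never overlap the truncating write.
async function readWriteRead(file) {
    const first = await fs.promises.readFile(file, 'utf8');
    await fs.promises.writeFile(file, 'text', 'utf8');
    const second = await fs.promises.readFile(file, 'utf8');
    return [first, second];
}

// Hypothetical scratch file (name is an assumption).
const scratch = path.join(os.tmpdir(), 'sequence-demo-' + process.pid + '.txt');
fs.writeFileSync(scratch, 'initial', 'utf8');

const sequenceDone = readWriteRead(scratch).then(function (results) {
    console.log(results); // [ 'initial', 'text' ]
    fs.unlinkSync(scratch);
    return results;
});
```

The ordering guarantee only covers operations inside this one chain. Another process, or another unawaited call in the same program, can still write the file in between.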

ArtOfCode
  • But I thought I could not get race conditions in node.js (???) Since JavaScript is single threaded how could the write execute in the middle of the read? Are readFile and writeFile not atomic? – Hector Correa Apr 29 '12 at 01:24
  • JavaScript is single threaded *within the JavaScript Event Loop*. Anything that crosses the Event Loop is *asking* native code in *another thread* to do something, and you have *no guarantees* that you'll have synchronous behavior. You also have no guarantee that your callbacks will be called in the order they're specified -- they're called only when the underlying code is done, so it could very well be 2, 1, 3 for the callback response order. (I don't think that's the case here, though; we're dealing with side effects of I/O operations.) –  Apr 29 '12 at 01:29
  • @HectorCorrea you are unfortunately mistaken if you think you can't get race conditions in a framework designed for evented IO. node.js runs on a single thread, but hands off writes to the operating system and receives events to continue processing. This means after your first `readFile` is handed to the OS, `writeFile` executes and `readFile` later receives the `data` event (from the OS at an indeterminate time later). – Joseph Yaduvanshi Apr 29 '12 at 01:29
  • @DavidEllis thanks for pointing out that async library. It looks pretty awesome. – Joseph Yaduvanshi Apr 29 '12 at 01:30
  • @JimSchubert, it really needs to be on page two of any Node.js tutorial (just after introducing Continuation Passing Style). It makes your code much more understandable, and ``async.auto`` is awesome for really complex interactions over the network. –  Apr 29 '12 at 01:34
  • @DavidEllis I was aware that the order could be arbitrary but I was not expecting readFile and writeFile not to be atomic. Even if they would execute in order 2, 1, 3 as you indicated, I would have expected the step 2 (the call to writeFile) to execute entirely before yielding back to step 1 (the call to readFile.) It didn't dawn on me that the I/O is happening in another thread. – Hector Correa Apr 29 '12 at 01:37
  • So is there no way of knowing if the data that I got in readFile is complete? Yikes! – Hector Correa Apr 29 '12 at 01:38
  • @HectorCorrea, maybe if you used ``fs.open`` and [``fs.fsync``](http://nodejs.org/api/fs.html#fs_fs_fsync_fd_callback) to force the status to be fully written? –  Apr 29 '12 at 01:50
  • Thanks, I'll give that a shot. I definitely learned something today :) – Hector Correa Apr 29 '12 at 01:59
  • @DavidEllis You were absolutely right in your initial assessment. As it turns out, internally fs.writeFile() makes a call to fs.open(fileName,'w') and then loops to re-write the contents of the file (https://github.com/joyent/node/blob/master/lib/fs.js). According to the documentation, the call to open(fileName, 'w') truncates the file: http://nodejs.org/api/fs.html#fs_fs_open_path_flags_mode_callback. – Hector Correa Apr 29 '12 at 03:58
  • I'm glad my senior design project has had some practical use for me. (Wrote a FUSE application to convert a networking protocol we designed into a mountable file system.) –  Apr 29 '12 at 04:22

I had a similar problem. I was writing text to a file and had a change handler telling me when the file had changed, at which point I tried to read it asynchronously to process the new content further.

Most of the time that worked, but in some cases the callback for the async read returned an empty string. Perhaps the change event fired before the file was fully written, so when I tried to read it I got an empty string. One could have hoped that the async read would recognize that the file was in the process of being written and wait until the write operation completed. It seems that in Node.js writing does not lock the file against being read, so you get unexpected results if you try to read while a write is going on.

I was able to get around this problem by detecting whether the result of the async read was an empty string and, if so, doing an additional sync read on the same file. That seems to produce the correct content. Yes, a sync read is slower, but I do it only if it seems that the async read failed to produce the expected content.
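A rough sketch of that workaround might look like the following. The function name readWithFallback and the scratch file are made up for illustration, and the approach has clear limits: it only narrows the window rather than closing it, and it cannot tell a half-written file from one that is legitimately empty.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: async read first, one sync re-read if it came back empty.
function readWithFallback(file, callback) {
    fs.readFile(file, 'utf8', function (err, data) {
        if (err) return callback(err);
        if (data === '') {
            // The async read may have hit a file truncated by an in-flight write.
            try {
                data = fs.readFileSync(file, 'utf8');
            } catch (syncErr) {
                return callback(syncErr);
            }
        }
        callback(null, data);
    });
}

// Usage with a hypothetical scratch file (name is an assumption).
const scratchFile = path.join(os.tmpdir(), 'fallback-demo-' + process.pid + '.txt');
fs.writeFileSync(scratchFile, 'some content', 'utf8');
readWithFallback(scratchFile, function (err, data) {
    if (err) throw err;
    console.log(data); // some content
    fs.unlinkSync(scratchFile);
});
```

If the writer is under your control, sequencing the read after the write's callback is more reliable than retrying on an empty result.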

Panu Logic