
I see a lot of synchronous functions in the file system library, such as fs.readFileSync(filename, [options]).

How (and why) are these functions implemented if Node has async/non-blocking I/O and no sleep method? And can I use the same mechanism to implement other synchronous functions?

guy mograbi
  • Good question! Maybe one would need to reinvent coroutines for JavaScript and expose the event loop to allow writing synchronous non-blocking code in Node.js. – Alex Oct 03 '16 at 16:53

1 Answer

fs.readFileSync() is really just a wrapper for the fs.readSync() function. So the question is how fs.readSync() is implemented compared to fs.read(). If you look at the implementations of these two functions, they both take advantage of the bindings module, which in this case is initialized to

var binding = process.binding('fs').  

and the calls are

binding.read(fd, buffer, offset, length, position, wrapper);//async
var r = binding.read(fd, buffer, offset, length, position);//sync

Respectively. Once we're in the "binding" module, we are out in V8, node_#####.cc land. The implementation of binding('fs') can be found in the Node repository, in node_file.cc. The Node engine offers overloads for the C++ calls: one that takes a callback and one that does not. The node_file.cc code takes advantage of the req_wrap class, which pairs a libuv request with a V8 object. In node_file.cc we see this:

#define ASYNC_CALL(func, callback, ...)                           \
  FSReqWrap* req_wrap = new FSReqWrap(#func);                     \
  int r = uv_fs_##func(uv_default_loop(), &req_wrap->req_,        \
      __VA_ARGS__, After);                                        \
  req_wrap->object_->Set(oncomplete_sym, callback);               \
  req_wrap->Dispatched();                                         \
  if (r < 0) {                                                    \
    uv_fs_t* req = &req_wrap->req_;                               \
    req->result = r;                                              \
    req->path = NULL;                                             \
    req->errorno = uv_last_error(uv_default_loop()).code;         \
    After(req);                                                   \
  }                                                               \
  return scope.Close(req_wrap->object_);

#define SYNC_CALL(func, path, ...)                                \
  fs_req_wrap req_wrap;                                           \
  int result = uv_fs_##func(uv_default_loop(), &req_wrap.req, __VA_ARGS__, NULL); \
  if (result < 0) {                                               \
    int code = uv_last_error(uv_default_loop()).code;             \
    return ThrowException(UVException(code, #func, "", path));    \
  }

Notice that SYNC_CALL uses a different req_wrap. Here is the code for the relevant req_wrap constructor for the ASYNC method, found in req_wrap.h:

ReqWrap() {
    v8::HandleScope scope;
    object_ = v8::Persistent<v8::Object>::New(v8::Object::New());

    v8::Local<v8::Value> domain = v8::Context::GetCurrent()
                                  ->Global()
                                  ->Get(process_symbol)
                                  ->ToObject()
                                  ->Get(domain_symbol);

    if (!domain->IsUndefined()) {
      // fprintf(stderr, "setting domain on ReqWrap\n");
      object_->Set(domain_symbol, domain);
    }

    ngx_queue_insert_tail(&req_wrap_queue, &req_wrap_queue_);
  }

Notice that this constructor opens a V8 handle scope and creates the persistent object (object_) that ASYNC_CALL later attaches the callback to; it also adds the request to a queue of outstanding requests. The asynchronous part itself is decided by the last argument to uv_fs_##func: ASYNC_CALL passes the After callback, so libuv runs the operation in the background and calls After on the event loop when it finishes, while SYNC_CALL passes NULL, so libuv performs the operation immediately and blocks until it is done.

In short, without building/modifying your own version of node, you cannot implement your own asynchronous/synchronous versions of calls in the same way that node does. That being said, asynchronous really only applies to I/O operations. Perhaps a description of why you think you need things to be more synchronous would be in order. In general, if you believe node doesn't support something you want to do, you just aren't embracing the callbacks mechanism to its full potential.
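To see the difference from the JavaScript side, here is a minimal sketch (not Node's actual source). It assumes a scratch file in the OS temp directory, and readWholeFileSync is a hypothetical helper that mimics what fs.readFileSync does on top of fs.readSync. The order array only records what ran when.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only for this demonstration.
const file = path.join(os.tmpdir(), `sync-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'hello from disk');

// Rough equivalent of fs.readFileSync, built on the lower-level fs.readSync.
function readWholeFileSync(filename) {
  const fd = fs.openSync(filename, 'r');
  try {
    const size = fs.fstatSync(fd).size;
    const buffer = Buffer.alloc(size);
    let offset = 0;
    while (offset < size) {
      // Blocks the whole process until the bytes are available.
      const bytesRead = fs.readSync(fd, buffer, offset, size - offset, null);
      if (bytesRead === 0) break; // file ended earlier than expected
      offset += bytesRead;
    }
    return buffer.toString('utf8', 0, offset);
  } finally {
    fs.closeSync(fd);
  }
}

const order = [];

// Async variant: the call returns immediately; the callback runs later,
// once the current synchronous code has finished.
fs.readFile(file, 'utf8', (err, data) => {
  if (err) throw err;
  order.push('async callback');
  fs.unlinkSync(file); // clean up the scratch file
});
order.push('after fs.readFile call');

// Sync variant: nothing else runs until the data has been read.
const text = readWholeFileSync(file);
order.push('sync read done');
```

When the top-level code finishes, order holds only the two synchronous entries; the fs.readFile callback is appended afterwards, once the event loop gets a turn.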

That being said, you could consider using Node's events module to implement your own event handlers if you need async behavior. And you can consider native extensions if there are things you desperately need to do synchronously; however, I highly recommend against this. Consider how you can work within the asynchronous event loop to get done what you need. Embrace this style of thinking, or switch to another language.
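As a rough illustration of the events approach, something like the following could work. Checksum and its 'done' event are made-up names for this example. Note that emit() itself is synchronous, so the sketch uses setImmediate to defer the work; this only postpones it to a later turn of the event loop, it does not run it in parallel.

```javascript
const { EventEmitter } = require('events');

// Hypothetical worker that reports its result through an event.
class Checksum extends EventEmitter {
  start(numbers) {
    // Defer the work so that start() returns before 'done' fires.
    setImmediate(() => {
      const sum = numbers.reduce((a, b) => a + b, 0);
      this.emit('done', sum);
    });
    return this;
  }
}

const log = [];
const job = new Checksum();

// Register the handler, then kick off the deferred work.
job.on('done', (sum) => log.push(`done: ${sum}`));
job.start([1, 2, 3]);
log.push('start() returned');
```

The handler runs only after the calling code has finished, so 'start() returned' is logged first. The summing still happens on the main thread, so a long computation would block the event loop while it runs.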

Forcing a language to handling things a way it doesn't want to handle them is an excellent way to write bad code.

MobA11y
  • Very thorough answer. `forcing a language to handling things a way it doesn't want to handle them` - is an interesting phrasing. Isn't that exactly what the nodeJS people are doing with `fs.readFileSync`? Why does it exist at all? Maybe there is room for synchronous calls in node after all? – guy mograbi Jun 28 '13 at 09:40
  • Stack Overflow... you never know when you're handing someone a tool they aren't ready to use. That last sentence (why pick on one opinionated sentence in a pool of facts???) is a disclaimer... hey, I don't support the idea of trying to modify Node! But if you must, this is how you do it. You may be right, there might be room for more sync calls and to let developers decide which to use, but right now, the developers of Node have decided that this is not the case, which is exactly my point. – MobA11y Jun 28 '13 at 17:39
  • As per your questions, synchronous FS methods exist for two reasons. A: gathering configuration information as part of an initialization step is a relatively common need, and simpler synchronously. B: a series of synchronous disk requests is, under certain circumstances, actually more performant than flooding the disk with a bunch of async requests it isn't prepared to handle. The same is true with server IO (DDoS attacks); it is just an infinitely (like 10000x) harder circumstance to create. However, it is very easy for a single process to overload a disk drive and slow itself down exponentially. – MobA11y Jun 28 '13 at 17:40
  • http://stackoverflow.com/a/15903421/1068746 - just seems to me that if there's an FS justification there should be a mysql one. Never mind. I will let it go for now. accepted your question. Very professional, thanks. – guy mograbi Jun 29 '13 at 11:06
  • The problem in ALWAYS developing asynchronous code is that the debugger (at least in IntelliJ) is not 'asynchronously aware'. Ideally the debugger would be able to 'step in' to the callback function. Currently this is a manual process of repeatedly setting breakpoints. Few people would want to debug synchronous (stack-based) systems if the debugger couldn't follow the stack between methods. So: is there an 'asynchronously-aware' debugger for node.js? There should be! – Tony Eastwood Nov 06 '13 at 14:08
  • Referring to your sentence `asynchronous really only applies to I/O operations`: aren't number crunching (brute forcing, zipping, ...) and rendering also applications for asynchrony? These are mostly computation based compared to I/O operations, but still consume more time than they should for each event-tick/-loop. – kernel Jun 24 '14 at 13:49
  • "Asynchronous really only applies to I/O operations" means that these are the only things the Node Developers have made Async. NOT that they're the only things that are useful to have async. Also, spawning a separate process for number crunching in node to force something to be ASYNC is trivial. Forcing something to be truly synchronous, is not. – MobA11y Jun 24 '14 at 13:52
  • Also, comparing the value of NOT having to wait for network connection for I/O vs NOT waiting for RAM in complex math computation... not really valid. Unless you're doing experiments with recursive Fibonacci or the end of the world puzzle or something, in which case... for the Love of God, use a different language. – MobA11y Jun 24 '14 at 13:58
  • Couldn't we write a native C++ module that delivers this generally? – Mark Essel Oct 18 '16 at 15:59
  • "... and you could consider native extensions..." so yes. – MobA11y Oct 18 '16 at 16:03