16

I understand that a closure is defined as:

[A] stack-frame which is not deallocated when the function returns. (as if a 'stack-frame' were malloc'ed instead of being on the stack!)

But I do not understand how this answer fits in the context of JavaScript's storage mechanism. How does the interpreter keep track of these values? Is the browser's storage mechanism segmented in a way similar to the Heap and Stack?

An answer to the question "How do JavaScript closures work?" explains that:

[A] function reference also has a secret reference to the closure

What is the underlying mechanism behind this mysterious "secret reference"?

EDIT Many have said that this is implementation dependent, so for the sake of simplicity, please provide an explanation in the context of a particular implementation.

ajspencer
  • 3
    I don't think there's anything mysterious. ECMA-262 describes certain behaviour, implementations are free to implement that behaviour any way they wish. When a function is created (i.e. when entering an execution context) a scope chain is created (essentially a stack of execution contexts). This is when closures are created. If a function is returned, it keeps its scope chain and context. That is when closures become interesting and useful. – RobG Jul 30 '15 at 22:35
  • 2
    It's implementation dependent and therefore irrelevant for the average JavaScript programmer. Nevertheless, one popular way of implementing closures is [lambda lifting](https://en.wikipedia.org/wiki/Lambda_lifting). – Aadit M Shah Jul 30 '15 at 22:38
  • https://mitpress.mit.edu/sicp/ This book uses Scheme as the demonstration language, but the design concepts are similar. – Barmar Jul 30 '15 at 23:07
  • 1
    Very interesting question, do NOT close please, or redirect to the proper site :-| –  Jul 31 '15 at 06:50
  • Since many people have expressed interest in the question and I edited the details of the question to make it more specific, could people please vote to reopen it? – ajspencer Aug 01 '15 at 17:06
  • http://mrale.ph/blog/2012/09/23/grokking-v8-closures-for-fun.html – Bergi Aug 01 '15 at 21:54
  • The most detailed explanation I've written on this topic can be found here: http://stackoverflow.com/questions/26061856/javascript-cant-access-private-properties/26063201#26063201. Admittedly, this question is where that answer belongs so I'm very, very tempted to close this question as duplicate of that in order to point it to that answer. But it's a different question, so I'll leave it to others to decide if it's appropriate. – slebetman Aug 03 '15 at 02:44
  • @slebetman thanks for linking that answer! It was very helpful, but I agree the question is different, while the answer may be the same. Someone wanting to learn about closures in this way would likely not stumble upon that question. – ajspencer Aug 03 '15 at 03:14

3 Answers

14

This is a section of slebetman's answer to the question "javascript can't access private properties" that answers your question very well.

The Stack:

A scope is related to the stack frame (in Computer Science it's called the "activation record", but most developers familiar with C or assembly know it better as a stack frame). A scope is to a stack frame what a class is to an object. By that I mean that where an object is an instance of a class, a stack frame is an instance of a scope.

Let's use a made-up language as an example. In this language, like in JavaScript, functions define scope. Let's take a look at some example code:

var global_var

function b {
    var bb
}

function a {
    var aa
    b();
}

When we read the code above, we say that the variable aa is in scope in function a and the variable bb is in scope in function b. Note that we don't call these private variables, because the opposite of private variables is public variables, and both terms refer to properties bound to objects. Instead we call aa and bb local variables. The opposite of local variables is global variables (not public variables).

Now, let's see what happens when we call a:

When a() is called, a new stack frame is created and space for its local variables is allocated on the stack:

The stack:
 ┌────────┐
 │ var aa │ <── a's stack frame
 ╞════════╡
 ┆        ┆ <── caller's stack frame

When a() calls b(), another stack frame is created and space for b's local variables is allocated on the stack:

The stack:
 ┌────────┐
 │ var bb │ <── b's stack frame
 ╞════════╡
 │ var aa │
 ╞════════╡
 ┆        ┆

In most programming languages, JavaScript included, a function only has access to its own stack frame. Thus a() cannot access local variables in b(), and neither can any other function or code in global scope access variables in a(). The only exceptions are variables in global scope. From an implementation point of view, this is achieved by allocating global variables in an area of memory that does not belong to the stack. This is generally called the heap. So, to complete the picture, the memory at this point looks like this:

The stack:     The heap:
 ┌────────┐   ┌────────────┐
 │ var bb │   │ global_var │
 ╞════════╡   │            │
 │ var aa │   └────────────┘
 ╞════════╡
 ┆        ┆

(As a side note, you can also allocate variables on the heap inside functions using malloc() or new.)

Now b() completes and returns, and its stack frame is removed from the stack:

The stack:     The heap:
 ┌────────┐   ┌────────────┐
 │ var aa │   │ global_var │
 ╞════════╡   │            │
 ┆        ┆   └────────────┘

When a() completes, the same happens to its stack frame. This is how local variables get allocated and freed automatically: by pushing and popping frames on the stack.
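As a rough model of the push/pop behaviour in the diagrams above, here is a small simulation. It is an illustration only: real engines do not expose their call stack as an array, and the names stack, trace and snapshot are made up for this sketch.

```javascript
// Toy model: each call pushes a frame object, each return pops it.
const stack = [];
const trace = [];

// Record which frames are on the stack, bottom first.
function snapshot(label) {
  trace.push(label + ': ' + stack.map(frame => frame.name).join(' > '));
}

function b() {
  stack.push({ name: 'b', locals: { bb: undefined } });
  snapshot('in b');
  stack.pop(); // b returns, its frame is gone
}

function a() {
  stack.push({ name: 'a', locals: { aa: undefined } });
  snapshot('in a');
  b();
  snapshot('back in a');
  stack.pop(); // a returns, its frame is gone
}

a();
snapshot('after a');
console.log(trace.join('\n'));
// in a: a
// in b: a > b
// back in a: a
// after a: 
```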

Closures:

A closure is a more advanced stack frame. But whereas normal stack frames get deleted once a function returns, a language with closures will merely unlink the stack frame (or just the objects it contains) from the stack while keeping a reference to the stack frame for as long as it's required.

Now let's look at some example code in a language with closures:

function b {
    var bb
    return function {
        var cc
    }
}

function a {
    var aa
    return b()
}

Now let's see what happens if we do this:

var c = a()

First, function a() is called, which in turn calls b(). Stack frames are created and pushed onto the stack:

The stack:
 ┌────────┐
 │ var bb │
 ╞════════╡
 │ var aa │
 ╞════════╡
 │ var c  │
 ┆        ┆

Function b() returns, so its stack frame is popped off the stack. But function b() returns an anonymous function which captures bb in a closure. So we pop the stack frame off the stack but don't delete it from memory (until all references to it have been garbage collected):

The stack:             somewhere in RAM:
 ┌────────┐           ┌╶╶╶╶╶╶╶╶╶┐
 │ var aa │           ┆ var bb  ┆
 ╞════════╡           └╶╶╶╶╶╶╶╶╶┘
 │ var c  │
 ┆        ┆

a() now returns the function to c. So the stack frame of the call to b() gets linked to the variable c. Note that it's the stack frame that gets linked, not the scope. It's a bit like creating objects from a class: it's the objects that get assigned to variables, not the class:

The stack:             somewhere in RAM:
 ┌────────┐           ┌╶╶╶╶╶╶╶╶╶┐
 │ var c╶╶├╶╶╶╶╶╶╶╶╶╶╶┆ var bb  ┆
 ╞════════╡           └╶╶╶╶╶╶╶╶╶┘
 ┆        ┆

Also note that since we haven't actually called the function c(), the variable cc is not yet allocated anywhere in memory. It's currently only a scope, not yet a stack frame until we call c().

Now what happens when we call c()? A stack frame for c() is created as normal. But this time there is a difference:

The stack:
 ┌────────┬──────────┐
 │ var cc    var bb  │  <──── attached closure
 ╞════════╤──────────┘
 │ var c  │
 ┆        ┆

The stack frame of b() is attached to the stack frame of c(). So from the point of view of function c(), its stack also contains all the variables that were created when function b() was called. (Note again: not the variables in function b(), but the variables created when function b() was called. In other words, not the scope of b() but the stack frame created when calling b(). The implication is that there is only one possible function b() but many calls to b(), creating many stack frames.)
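To make the "one scope, many stack frames" point concrete, here is the made-up example rewritten as real JavaScript. The arithmetic on bb and cc is added purely so the effect is observable; it is not part of the original example.

```javascript
function b() {
  var bb = 0;            // lives in the frame created by this particular call to b()
  return function () {   // the returned function keeps that frame alive
    var cc = 1;          // only allocated when the inner function is actually called
    bb += cc;
    return bb;
  };
}

function a() {
  var aa;
  return b();
}

var c1 = a(); // first call to b(): first retained frame
var c2 = a(); // second call to b(): a separate retained frame

console.log(c1()); // 1
console.log(c1()); // 2
console.log(c2()); // 1, because c2 has its own bb
```

Each call to a() runs b() once, so c1 and c2 are linked to different frames even though they come from the same scope in the source text.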

But the rules of local and global variables still apply. All variables in b() become local variables to c() and nothing else. The function that called c() has no access to them.

What this means is that when you redefine c in the caller's scope like this:

var c = function {/* new function */}

this happens:

                     somewhere in RAM:
                           ┌╶╶╶╶╶╶╶╶╶┐
                           ┆ var bb  ┆
                           └╶╶╶╶╶╶╶╶╶┘
The stack:
 ┌────────┐           ┌╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶┐
 │ var c╶╶├╶╶╶╶╶╶╶╶╶╶╶┆ /* new function */ ┆
 ╞════════╡           └╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶╶┘
 ┆        ┆

As you can see, it's impossible to regain access to the stack frame from the call to b() since the scope that c belongs to doesn't have access to it.

ajspencer
5

I've written an article on this topic: How do JavaScript closures work under the hood: the illustrated explanation.

To understand the subject, we need to know how scope objects (or LexicalEnvironments) are allocated, used, and deleted. This understanding is key to seeing the big picture and knowing how closures work under the hood.

I'm not going to re-type the whole article here, but as a short example, consider this script:

"use strict";

var foo = 1;
var bar = 2;

function myFunc() {
  //-- define local-to-function variables
  var a = 1;
  var b = 2;
  var foo = 3;
}

//-- and then, call it:
myFunc();

When executing the top-level code, we have the following arrangement of scope objects:

[Image: scope objects while the top-level code executes]

Notice that myFunc references both:

  • A function object (which contains the code and any other publicly available properties)
  • The scope object that was active when the function was defined.

And when myFunc() is called, we have the following scope chain:

[Image: scope chain while myFunc() executes]

When a function is called, a new scope object is created and used to augment the scope chain referenced by myFunc. This lets us achieve a very powerful effect when we define an inner function and then call it outside of the outer function.

See the aforementioned article; it explains things in detail.
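As a small sketch of that effect, the script above can be extended with an inner function that is returned and then called after myFunc has finished. The function name inner and the reassignment of bar are additions for illustration and do not come from the article.

```javascript
var foo = 1;
var bar = 2;

function myFunc() {
  var a = 1;
  var b = 2;
  var foo = 3; // shadows the global foo within myFunc's scope object

  // inner is created while myFunc's scope object is active,
  // so it keeps a reference to that scope object.
  function inner() {
    return [a, b, foo, bar];
  }
  return inner;
}

var inner = myFunc(); // myFunc has returned, but its scope object survives
bar = 20;             // globals are found at the end of the chain, at call time

console.log(inner()); // [ 1, 2, 3, 20 ]
```

The lookup for foo stops at myFunc's scope object, while the lookup for bar walks on to the global scope object and sees the updated value.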

Dmitry Frank
  • Is the reference from `myFunc() scope` object to the `Global` object a prototype link? If not, what is that link called? Can you please clarify. Also, what are the names to the other linkages or references mentioned here? Like there's a self reference from `myFunc = ` reference pointer to the global object. What is it called? Thanks – Aryak Sengupta Apr 18 '18 at 19:17
2

Here is an example of how you can transform code that needs closures into code that doesn't. The essential points to pay attention to are: how function declarations are transformed, how function calls are transformed, and how accesses to local variables that have been moved to the heap are transformed.

Input:

var f = function (x) {
  x = x + 10
  var g = function () {
    return ++x
  }
  return g
}

var h = f(3)
console.log(h()) // 14
console.log(h()) // 15

Output:

// Header that goes at the top of the program:

// A list of environments, starting with the one
// corresponding to the innermost scope.
function Envs(car, cdr) {
  this.car = car
  this.cdr = cdr
}

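Envs.prototype.get = function (k) {
    var e = this
    while (e) {
        // Use has() so that falsy values such as 0 or '' are still found.
        if (e.car.has(k)) return e.car.get(k)
        e = e.cdr
    }
    // returns undefined if lookup fails
}

Envs.prototype.set = function (k, v) {
    var e = this
    while (e) {
        // Update the binding in the nearest scope that declares k.
        if (e.car.has(k)) {
            e.car.set(k, v)
            return this
        }
        e = e.cdr
    }
    // No enclosing scope declares k.
    throw new ReferenceError(k + ' is not defined')
}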

// Initialize the global scope.
var envs = new Envs(new Map(), null)

// We have to use this special function to call our closures.
function call(f, ...args) {
    return f.func(f.envs, ...args)
}

// End of header.

var f = {
    func: function (envs, x) {
        envs = new Envs(new Map().set('x',x), envs)

        envs.set('x', envs.get('x') + 10)
        var g = {
            func: function (envs) {
                envs = new Envs(new Map(), envs)
                return envs.set('x', envs.get('x') + 1).get('x')
            },
            envs: envs
        }
        return g
    },
    envs: envs
}

var h = call(f, 3)
console.log(call(h)) // 14
console.log(call(h)) // 15

Let's break down how the three key transformations go. For the function declaration case, assume for concreteness that we have a function of two arguments x and y and one local variable z, and x and z can escape the stack frame and so need to be moved to the heap. Because of hoisting we may assume that z is declared at the beginning of the function.

Input:

var f = function f(x, y) {
    var z = 7
    ...
}

Output:

var f = {
    func: function f(envs, x, y) {
        envs = new Envs(new Map().set('x',x).set('z',7), envs)
        ...
    },
    envs: envs
}

That's the tricky part. The rest of the transformation just consists of using call to call the function and replacing accesses to the variables moved to the heap with lookups in envs.

A couple of caveats.

  1. How did we know that x and z needed to be moved to the heap but not y? Answer: the simplest (but possibly not optimal) thing is to just move anything to the heap that is referenced in an enclosed function body.

  2. The implementation I have given leaks a ton of memory and requires function calls to access local variables moved to the heap instead of inlining that. A real implementation wouldn't do these things.
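As a hedged sketch of what a leaner translation could look like, the first example can be rewritten so that each closure carries a plain environment record with a fixed shape, and the code for g sits at the top level so it clearly closes over nothing. The names gCode, fCode and callClosure are hypothetical, and this is not a claim about how any particular engine lays out its data.

```javascript
// Code for g lives at the top level and reaches x only
// through the environment record it is handed.
function gCode(env) {
  env.x = env.x + 1; // direct property access instead of a chain walk
  return env.x;
}

function fCode(env, x) {
  // Only x is referenced by an enclosed function, so only x
  // is moved into a heap-allocated record.
  var fEnv = { x: x + 10 };
  return { code: gCode, env: fEnv };
}

// Closures are (code, env) pairs, so they need a helper to be called.
function callClosure(closure, ...args) {
  return closure.code(closure.env, ...args);
}

var f = { code: fCode, env: null }; // f captures nothing
var h = callClosure(f, 3);
console.log(callClosure(h)); // 14
console.log(callClosure(h)); // 15
```

The trade-off is the same one discussed in the comments: hoisting the code makes it obvious that no native closure is involved, but the output looks less like the input.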

Finally, user3856986 posted an answer that makes some different assumptions than mine, so let's compare it.

The main difference is that I assumed that local variables would be kept on a traditional stack, while user3856986's answer only makes sense if the stack will be implemented as some kind of structure on the heap (but he or she is not very explicit about this requirement). A heap implementation like this can work, though it will put more load on the allocator and GC since you have to allocate and collect stack frames on the heap. With modern GC technology, this can be more efficient than you might think, but I believe that the commonly used VMs do use traditional stacks.

Also, something left vague in user3856986's answer is how the closure gets a reference to the relevant stack frame. In my code, this happens when the envs property is set on the closure while that stack frame is executing.

Finally, user3856986 writes, "All variables in b() become local variables to c() and nothing else. The function that called c() has no access to them." This is a little misleading. Given a reference to the closure c, the only thing that stops one from getting access to the closed variables from the call to b is the type system. One could certainly access these variables from assembly (otherwise, how could c access them?). On the other hand, as for the true local variables of c, it doesn't even make sense to ask if you can get access to them until some particular invocation of c has been specified (and if we consider some particular call, by the time control gets back to the caller, the information stored in them might already have been destroyed).
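The point about access can be seen with an explicit closure representation, where nothing hides the captured frame from the caller. This is a made-up miniature (makeC and the value 42 are assumptions for illustration), not the code from either answer.

```javascript
function makeC() {
  var env = { bb: 42 }; // stands in for the frame from the call to b()
  return {
    code: function (env) { return env.bb; },
    env: env
  };
}

var c = makeC();
console.log(c.code(c.env)); // 42, the closure reading its captured variable
console.log(c.env.bb);      // 42, the caller reading the same variable directly

c.env.bb = 7;               // the caller can even change it
console.log(c.code(c.env)); // 7
```

With native JavaScript closures the language offers no equivalent of c.env, which is the only reason the caller cannot do this.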

gmr
  • It would be nifty if your `g.func` would be not be defined in a closure position (to make clear that it doesn't close over anything, but receives an `env`) – Bergi Aug 02 '15 at 19:12
  • What does `Envs` stand for? – Bergi Aug 02 '15 at 19:15
  • The comment *`// If none exists, make a new one in the innermost scope.`* seems wrong, shouldn't it say "outermost"? Btw, your methods might be easier to understand if you'd write them recursively – Bergi Aug 02 '15 at 19:16
  • Good eye, yes, there was a mistake in the set function. I have fixed it now so that it really does set the variable in the innermost scope. Also, I got rid of the newScope function, which might make the code a bit clearer. I think recursion would have about the same clarity as using loops, so I'll leave that as is. "Envs" stands for "environments". What do you mean by "closure position"? – gmr Aug 02 '15 at 19:53
  • I don't think that was a mistake, as undeclared variables *do* create a binding in the global (outermost) scope - just make it throw like in strict mode :-). Ah, right, it's `Envs` not `Env` because you've made it a list of environment*s*. By "closure position" I meant that it's still placed inside `f`, while you could make a global `_g_code` value of it (closer to an implementation than the js code) – Bergi Aug 02 '15 at 20:24
  • OK, I made it throw instead of making a binding in the innermost scope, which, as you point out, is not what JavaScript does either in or out of strict mode. And yes, the tradeoff with using a global is that it could make it clearer that the code doesn't use closures, but it also makes the transformation less direct. – gmr Aug 03 '15 at 02:37