
Consider the following example:

function makeFunction() {
  let x = 3;
  let s = "giant string, 100 MB in size";

  return () => { console.log(x); };
}

// Are both x and s held in memory here
// or only x, because only x was referred to by the closure returned
// from makeFunction?
let made = makeFunction();

// Suppose there are no further usages of makeFunction after this point

// Let's assume there's a thorough GC run here

// Is s from makeFunction still around here, even though made doesn't use it?
made();

So if I close over just one variable from a parent lexical environment, is only that variable kept around, or is every sibling variable in its lexical environment kept around as well?

Also, what if makeFunction were itself nested inside an outer function? Would that outer lexical environment be retained even though neither makeFunction nor makeFunction's return value referred to anything in it?

I'm asking for performance reasons - do closures keep a bunch of stuff around or only what they directly refer to? This impacts memory usage and also resource usage (e.g. open connections, handles, etc.).

This would be mostly in a Node.js context, but could also apply in the browser.

Joel
  • Certainly if you use `eval` or some fancy `arguments` invocation, then the entire lexical scope has to be retained, so the closures cannot by default keep only references to the stuff that is referred to explicitly. Also if you access any non-constant globals, those globals might be `eval` (or change to `eval`) later. However, it is possible that individual engines might optimize away the references in certain cases (if no funny stuff is used). I would also be interested in knowing the answer to this. – Fengyang Wang Dec 11 '19 at 01:35
  • This depends on the browser and implementation. Afaik most modern browsers try to not keep it around. But there is no guarantee. – Lux Dec 11 '19 at 01:36
  • What performance reasons do you have for asking? After the function is executed, all unreferenced memory internal to that scope should be released. Garbage collection is an internal feature of JavaScript; you don't really have to think about it when developing in this language. You can read about memory management here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Memory_Management – Jacob Penney Dec 11 '19 at 01:57
  • Try using `console.dir(made);` and drill into the `[[Closure]]` property to see what it's retaining. – Barmar Dec 11 '19 at 03:17
  • Have a look at https://stackoverflow.com/questions/28388530/why-does-chrome-debugger-think-closed-local-variable-is-undefined?noredirect=1&lq=1 – Bergi Dec 11 '19 at 10:52

1 Answer


V8 developer here. This is a bit complicated ;-)

The short answer is: closures only keep around what they need.

So in your example, after makeFunction has run, the string referred to by s will be eligible for garbage collection. Due to how garbage collection works, it's impossible to predict exactly when it will be freed; the best one can say is "at the next garbage collection cycle". Whether makeFunction runs again doesn't matter; if it does run again, a new string will be allocated (assuming it was dynamically computed; if it's a literal in the source then it's cached). Whether made has already run or will run again doesn't matter either; what matters is that you have a variable referring to it, so you could run it (again). Engines generally can't predict which functions will or won't be executed in the future.

The longer answer is that there are some footnotes. For one thing, as comments already pointed out, if your closure uses eval, then everything has to be kept around, because whatever source snippet is eval'ed could refer to any variable. (What one comment mentioned about global variables that could be referring to eval is not true though; there is a semantic difference for "global eval", a.k.a. "indirect eval": it cannot see local variables. That is usually considered an advantage for both performance and debuggability, but even better is to not use eval at all.)
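To make the direct versus indirect eval distinction concrete, here is a minimal sketch. The function and variable names are hypothetical and chosen only for illustration; the point is that a call written literally as eval(...) can read locals, while a call through another reference to eval cannot.

```javascript
// Sketch only: all names below are hypothetical.
function directEvalSeesLocals() {
  let secret = "local value";
  // Direct eval: evaluated in the current lexical scope, so `secret` is
  // visible. This is why the engine has to keep every local around.
  return eval("secret");
}

function indirectEvalSeesOnlyGlobals() {
  let secret = "local value";
  // Indirect ("global") eval: calling eval through another reference
  // evaluates the snippet in the global scope, where the local `secret`
  // does not exist.
  const globalEval = eval;
  return globalEval("typeof secret");
}

const directResult = directEvalSeesLocals();
const indirectResult = indirectEvalSeesOnlyGlobals();
console.log(directResult);
console.log(indirectResult);
```

The second function uses typeof so that the missing variable shows up as a string instead of a ReferenceError.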

The other footnote is that somewhat unfortunately, the tracking is not as fine-grained as it could be: each closure will keep around what any closure needs. We have tried fixing this, but as it turns out finer-grained tracking causes more memory consumption (for metadata) and CPU consumption (for doing the work) and is therefore usually not worth it for real code (although it can have massive impact on artificial tests stressing precisely this scenario). To give an example:

function makeFunction() {
  let x = 3;
  let s = "giant string, 100 MB in size";
  let short_lived = function() { console.log(s.length); }
  // short_lived();  // Call this or don't, doesn't matter.
  return function long_lived() { console.log(x); };
}

let long_lived = makeFunction();

With this modified example, even though long_lived only uses x, short_lived does use s (even if it's never called!), and there is only one bucket for "local variables from makeFunction that are needed by some closure", so that bucket keeps both x and s alive. But as I said earlier: real code rarely runs into this issue, so this is usually not something you have to worry about.
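If this shared bucket ever does matter in practice, one possible restructuring is to create the closure that needs s inside a separate function, so that s belongs to a different function's scope. This is a sketch under the assumption that each function scope gets its own bucket, as described above; makeShortLived is a hypothetical helper, and the functions return values instead of logging so that the result is easy to check.

```javascript
// Hypothetical helper: `s` is a local of makeShortLived, not of makeFunction.
function makeShortLived() {
  let s = "giant string, 100 MB in size";
  return function short_lived() { return s.length; };
}

function makeFunction() {
  let x = 3;
  // short_lived still closes over `s`, but through makeShortLived's scope.
  let short_lived = makeShortLived();
  // short_lived();  // Call this or don't, doesn't matter.
  // The only local of makeFunction captured by a closure is now `x`.
  return function long_lived() { return x; };
}

let long_lived = makeFunction();
console.log(long_lived());
```

Once makeFunction has returned, nothing refers to short_lived any more, so the string it closes over should become eligible for collection while long_lived stays usable. When that collection actually happens remains unpredictable.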

Side note, regarding this part of the question:

"and also resource usage (e.g. open connections, handles, etc.)"

As a very general statement (i.e., in any language or runtime environment, regardless of closures or whatnot), it's usually advisable not to rely on garbage collection for resource management. I recommend freeing your resources manually and explicitly as soon as it is appropriate to free them.
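A minimal sketch of what explicit cleanup can look like, using try/finally. The openResource and useResource functions are hypothetical stand-ins for a real connection or file handle API, not part of any library.

```javascript
// Hypothetical resource standing in for a connection, file handle, etc.
function openResource(name) {
  return {
    name,
    closed: false,
    close() { this.closed = true; },
  };
}

// Runs `work` with a freshly opened resource and always closes it afterwards.
function useResource(name, work) {
  const resource = openResource(name);
  try {
    return work(resource);
  } finally {
    // Executes whether work() returns normally or throws, so the release
    // does not depend on when (or whether) the garbage collector runs.
    resource.close();
  }
}

let seen = null;
const result = useResource("demo", (r) => {
  seen = r;
  return r.name.length;
});
console.log(result, seen.closed);
```

The resource is released at a known point in the program, independent of which closures still happen to reference it.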

jmrk
  • "*Engines generally can't predict which functions will or won't be executed in the future*" - except for functions not referenced from anywhere, those are known not to be executed again :-) – Bergi Dec 11 '19 at 10:56
  • @Bergi: sure, functions (just like any other objects) with zero references can be found by the garbage collector and freed; however that doesn't help with the OP's question because the decision whether a given variable needs to be context-allocated is made at parse time, and the parser by definition has an extremely shallow understanding of program semantics -- in particular, it doesn't know which things are referenced elsewhere and which things aren't. – jmrk Dec 11 '19 at 14:24