32

I was curious about how the Node.js pattern of nested functions interacts with V8's garbage collector. Here's a simple example:

readfile("blah", function(str) {
   var val = getvaluefromstr(str);
   function restofprogram(val2) { ... }
   restofprogram(val);
})

If restofprogram is long-running, doesn't that mean that str will never get garbage collected? My understanding is that with Node you end up with nested functions a lot. Would str get garbage collected if restofprogram were declared outside, so that str could not be in its scope? Is this a recommended practice?

EDIT I didn't intend to make the problem complicated. That was just carelessness, so I've modified it.

nakhodkin
Vishnu
  • I believe V8's garbage collection is pretty smart. And if you null it as an extra measure, it will get GCed? – Alfred Mar 16 '11 at 15:43
  • I too hope that nulling it should collect it. However, the symbol will probably still occupy space in the symbol table. – dhruvbird Mar 16 '11 at 16:08
  • Related questions with good answers: [How are closures and scopes represented at run time in JavaScript](http://stackoverflow.com/questions/5368048/how-are-closures-and-scopes-represented-at-run-time-in-javascript) (with better code example), [About closure, LexicalEnvironment and GC](http://stackoverflow.com/questions/8665781/about-closure-lexicalenvironment-and-gc) (with nice scope inspector screenshots) – Bergi Sep 19 '13 at 00:31

3 Answers

73

Simple answer: if the value of str is not referenced from anywhere else (and str itself is not referenced from restofprogram), it will become unreachable as soon as the function (str) { ... } returns.

Details: the V8 compiler distinguishes real local variables from so-called context variables, which are captured by a closure or shadowed by a with-statement or an eval invocation.

Local variables live on the stack and disappear as soon as function execution completes.

Context variables live in a heap-allocated context structure. They disappear when the context structure dies. The important thing to note here is that context variables from the same scope live in the same structure. Let me illustrate this with some example code:

function outer () {
  var x; // real local variable
  var y; // context variable, referenced by inner1
  var z; // context variable, referenced by inner2

  function inner1 () {
    // references context 
    use(y);
  }

  function inner2 () {
    // references context 
    use(z);
  }

  function inner3 () { /* I am empty but I still capture context implicitly */ } 

  return [inner1, inner2, inner3];
}

In this example variable x will disappear as soon as outer returns, but variables y and z will disappear only when inner1, inner2 and inner3 have all died. This happens because y and z are allocated in the same context structure and all three closures implicitly reference this context structure (even inner3, which does not use it explicitly).

The situation gets even more complicated when you start using a with-statement, a try/catch-statement (which on V8 contains an implicit with-statement inside the catch clause) or global eval.
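V8's internal context structure cannot be inspected from script, so the following is only a language-level sketch of the same idea: closures created by one call share a single set of bindings, and each new call gets a fresh set. The names makeCounter, increment and read are made up for illustration.

```javascript
// Hypothetical helper: both returned closures capture `count` from the same call.
function makeCounter() {
  var count = 0; // captured by both closures below, so it must outlive this call

  function increment() {
    count += 1;
  }

  function read() {
    return count;
  }

  return { increment: increment, read: read };
}

var a = makeCounter();
var b = makeCounter();

a.increment();
a.increment();

// `a.read` sees the updates made through `a.increment` because both closures
// share one environment. `b` came from a separate call and has its own.
var result = { a: a.read(), b: b.read() }; // { a: 2, b: 0 }
```

Going by the description above, that shared environment stays alive for as long as any closure from the same call is still reachable.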

function complication () {
  var x; // context variable

  function inner () { /* I am empty but I still capture context implicitly */ }

  try { } catch (e) { /* contains implicit with-statement */ }

  return inner;
}

In this example x will disappear only when inner dies, because:

  • try/catch contains an implicit with-statement in the catch clause
  • V8 assumes that any with-statement shadows all the locals

This forces x to become a context variable, and inner captures the context, so x exists until inner dies.

In general, if you want to be sure that a given variable does not retain some object for longer than really needed, you can easily destroy this link by assigning null to that variable.
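A minimal sketch of that nulling advice, assuming a made-up makeHandler function and a large array named big that is only needed briefly. It shows the variable staying in scope while the link to the array is cut; it does not observe the collector itself.

```javascript
// Hypothetical example: `big` stands in for any large object needed only briefly.
function makeHandler() {
  var big = new Array(100000).fill(1); // referenced by `handler`, so it is a captured variable
  var total = big.length;              // keep only the small piece needed later

  big = null; // cut the link; if nothing else references the array, it is now unreachable

  return function handler() {
    // `big` is still in scope here, but it no longer points at the array.
    return { total: total, bigIsNull: big === null };
  };
}

var handler = makeHandler();
var info = handler(); // { total: 100000, bigIsNull: true }
```

The closure can live as long as it likes after this; only the small total value is kept alive with it.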

Vyacheslav Egorov
  • Could anyone comment on whether the part about `try/catch` containing an implicit `with`-statement is still relevant in 2023? The current version of the spec tells that [the `catchEnv` should be a new declarative environment record](https://tc39.es/ecma262/#sec-runtime-semantics-catchclauseevaluation), it tells nothing about [object environments as in `with`](https://tc39.es/ecma262/#prod-WithStatement). Was it just some strange temporary workaround in earlier implementations of `v8`? – Andrey Tyukin Jul 13 '23 at 15:46
5

Actually your example is somewhat tricky. Was it on purpose? You seem to be masking the outer val variable with the val argument of the lexically scoped restofprogram(), instead of actually using it. But anyway, you're asking about str, so let me ignore the trickiness of val in your example just for the sake of simplicity.

My guess would be that the str variable won't get collected before the restofprogram() function finishes, even if it doesn't use it. If restofprogram() doesn't use str and doesn't use eval() or new Function(), then it could be safely collected, but I doubt it would be. This would be a tricky optimization for V8, probably not worth the trouble. If there were no eval and new Function() in the language, it would be much easier.

Now, it doesn't have to mean that it would never get collected because any event handler in a single-threaded event loop should finish almost instantly. Otherwise your whole process would be blocked and you'd have bigger problems than one useless variable in memory.

Now I wonder if you meant something other than what you actually wrote in your example. The whole program in Node is just like in the browser: it just registers event callbacks that are fired asynchronously later, after the main program body has already finished. Also, none of the handlers are blocking, so no function actually takes any noticeable time to finish. I'm not sure if I understood what you actually meant in your question, but I hope that what I've written will be helpful in understanding how it all works.

Update:

After reading more in the comments about what your program looks like, I can say more.

If your program is something like:

readfile("blah", function (str) {
  var val = getvaluefromstr(str);
  // do something with val
  Server.start(function (request) {
    // do something
  });
});

Then you can also write it like this:

readfile("blah", function (str) {
  var val = getvaluefromstr(str);
  // do something with val
  Server.start(serverCallback);
});
function serverCallback(request) {
  // do something
}

This makes str go out of scope after Server.start() is called, so it will eventually get collected. It also makes your indentation more manageable, which is not to be underestimated for more complex programs.

As for val, you might make it a global variable in this case, which would greatly simplify your code. Of course you don't have to, you can wrestle with closures, but in this case making val global, or making it live in an outer scope common to both the readfile callback and the serverCallback function, seems like the most straightforward solution.

Remember that anywhere you can use an anonymous function you can also use a named function, and with named functions you can choose which scope they live in.
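To make the suggestion concrete, here is a runnable sketch with val in an outer scope and serverCallback declared outside the readfile callback. The readfile, Server and getvaluefromstr bodies below are stand-in stubs invented for this example (the question never shows them), and the registered array only simulates a server holding on to its callback.

```javascript
// Stubs standing in for the question's readfile and Server.start (assumptions, not real APIs).
function readfile(name, callback) {
  callback("value=42;" + new Array(1000).join("padding"));
}

var registered = [];
var Server = {
  start: function (callback) {
    registered.push(callback); // keeps the callback alive, as a long-running server would
  }
};

// Hypothetical parser: pulls the number out of "value=<n>;...".
function getvaluefromstr(str) {
  return Number(str.slice(str.indexOf("=") + 1, str.indexOf(";")));
}

// Shared state lives in the outer scope, next to the named callback.
var val;

// Declared outside the readfile callback, so `str` is not in its scope chain.
function serverCallback(request) {
  return request + ":" + val;
}

readfile("blah", function (str) {
  val = getvaluefromstr(str);
  Server.start(serverCallback);
});

var reply = registered[0]("ping"); // "ping:42"
```

The long-lived serverCallback can still reach val, but nothing it closes over refers to str.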

rsp
  • yes, but if restofprogram is something like Server.start(function(request) {do something}), even though restofprogram exits instantly, the function passed to Server.start will live forever, and has str in scope. – Vishnu Mar 16 '11 at 16:07
  • Actually, the event handler could create an anonymous function which is added as an event listener to some other event and it could do this every time it is called, thus ensuring that all the scope variables (for all calls of this handler) are never collected. – dhruvbird Mar 16 '11 at 16:13
  • @dhruvbird: True. For those cases I recommend using named functions for which you can choose the scope in which they live. – rsp Mar 16 '11 at 16:43
  • @Vishnu: See the update to my answer for some ideas on how to make the cases like this more manageable. – rsp Mar 16 '11 at 16:44
  • thank you, that was the intent of my question. So unintended memory leaks are possible and using named functions when possible should alleviate the problem. – Vishnu Mar 17 '11 at 06:29
1

My guess is that str will NOT be garbage collected because it can be used by restofprogram(). And yes, str should get GCed if restofprogram was declared outside, unless you do something like this:

function restofprogram(val, str) { ... }

readfile("blah", function(str) {
  var val = getvaluefromstr(str);
  restofprogram(val, str);
});

Or if getvaluefromstr is declared as something like this:

function getvaluefromstr(str) {
  return {
    orig: str, 
    some_funky_stuff: 23
  };
}

Follow-up question: does V8 do just plain old GC, or does it do a combination of GC and reference counting (like Python)?

ljk321
dhruvbird
  • Technically if the v8 GC is smart enough, it should determine if `str` is actually used (or could conceivably be used with an `eval` statement) in the body of `restofprogram`. Whether it does this or not is a question that should be asked to someone who is knowledgeable of the details of v8. – MooGoo Mar 16 '11 at 14:42
  • V8 uses a generational garbage collector. – rsp Mar 16 '11 at 14:49
  • @MooGoo I doubt any GC would be smart enough to detect "str" being used in an eval (since the string to be eval'ed could be obtained from user input) – dhruvbird Mar 16 '11 at 16:05
  • @dhruvbird if there was an `eval` or `new Function` statement in the function body, then `str` could conceivably be used, and would thus not be GC'd. If not, and there existed no direct references to `str` in the function body, then it could be GC'd. Pretty simple actually, but whether it is an efficient use of processor time is another question... – MooGoo Mar 16 '11 at 19:56
  • @MooGoo The rules for eval are quite complicated, so I don't recall them exactly, but I guess it would be possible for an outside entity to pass a handle to eval inside the function either via scope or parameters and then you could eval within the function. (Am not sure about the scoping rules of eval called with some other alias, so don't quote me on this). – dhruvbird Mar 16 '11 at 20:18
  • @MooGoo On the face of it, I think you are right, but I doubt this runtime would include optimizations of such length. – dhruvbird Mar 16 '11 at 20:26
  • This page https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/eval suggests that calling eval with another alias is an error, though it works on node.js. Furthermore, calling it with another name makes it execute in global context. I have however been able to come up with an example that would make optimization for V8 quite hard. Posting it as another comment. – dhruvbird Mar 16 '11 at 20:31
  • j=44; e = { foo: some_custom_function }; function foo() { var j = 10; eval = e.foo; eval("j=20"); }; foo(); Now, if some_custom_function is actually the global eval, then the global j would remain unchanged. However, if it is some funny function that prints "hello" and has no side effect, then node would unnecessarily keep the scope variables alive. – dhruvbird Mar 16 '11 at 20:32
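The difference between calling eval directly and calling it through another name, which the comments above circle around, can be checked with a small sketch. The function and variable names here are invented for illustration, and the example uses an alias variable rather than reassigning eval itself, since assigning to eval is a syntax error in strict mode.

```javascript
globalThis.j = 44; // a real global, so the indirect eval below can find it

function directEval() {
  var j = 10;
  eval("j = 20"); // direct call: runs in this function's scope
  return j;
}

function indirectEval() {
  var j = 10;
  var alias = eval; // calling eval through another name makes it an indirect eval
  alias("j = 20");  // runs in the global scope, so the local `j` is untouched
  return j;
}

var localAfterDirect = directEval();     // 20
var globalAfterDirect = globalThis.j;    // still 44
var localAfterIndirect = indirectEval(); // 10
var globalAfterIndirect = globalThis.j;  // now 20
```

Only the direct form can reach a function's local variables, which is why an engine has to be conservative about any scope that contains a direct eval call.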