13

A couple of years ago I started writing an interpreter for a little Domain Specific Language which included programmer-defined functions.

At first I implemented variable scope using a simple stack of symbol-tables. But now I want to move to proper lexical scoping (with the option of closures). Can anyone explain the data-structure and algorithm behind lexical scope?

Kevin Reid
  • 37,492
  • 13
  • 80
  • 108
interstar
  • 26,048
  • 36
  • 112
  • 180

5 Answers5

12

To get correct lexical scoping and closures in an interpreter, all you need to do is follow these rules:

  • In your interpreter, variables are always looked up in an environment table passed in by the caller or kept as a variable, not some global env-stack. The signature of your eval operation is like eval(expression, env) => value.
  • When interpreted code calls a function, the environment is NOT passed to that function. The signature of your function application operation is like apply(function, arguments) => value.
  • When an interpreted function is called, the environment its body is evaluated in is the environment in which the function definition was made, and has nothing whatsoever to do with the caller. So if you have a local function, then it is a closure, that is, a data structure containing fields {function definition, env-at-definition-time}.

To expand on that last bit in Python-ish syntax:

x = 1
return lambda y: x + y

should be executed as if it were

x = 1
return make_closure(<AST for "lambda y: x + y">, {"x": x})

where the second dict argument may be just the current-env rather than a data structure constructed at that time. (On the other hand, retaining the entire env rather than just the closed-over variables can mean programs have surprising memory leaks because closures are holding onto things the don't need. This is worth fixing in any 'practical' language implementation but not when you are just experimenting with language semantics.)

Kevin Reid
  • 37,492
  • 13
  • 80
  • 108
7

There are many different ways to implement lexical scoping. Here are some of my favorites:

  • If you don't need super-fast performance, use a purely functional data structure to implement your symbol tables, and represent a nested function by a pair containing a pointer to the code and a pointer to the symbol table.

  • If you need native-code speeds, my favorite technique is described in Making a Fast Curry by Simon Marlow and Simon Peyton Jones.

  • If you need native-code speeds, but curried functions are not that important, consider closure-passing style.

Norman Ramsey
  • 198,648
  • 61
  • 360
  • 533
  • The paper about currying appears to be inappropriately informal sometimes. For instance, see the 2nd sentence of the last paragraph of the 2nd section: __"The call is to a *known function* — one whose definition the compiler can “see” "__. By "see", they mean that the function appears in the lexical scope where it is used. Otherwise, I guess it's a good read if you're like me trying to figure out how to make an interpreter. – ljleb May 06 '20 at 06:21
3

Read The implementation of Lua 5.0 for instance.

lhf
  • 70,581
  • 9
  • 108
  • 149
1

There is no single right way to do this. The important thing is to clearly state the semantics that you are looking to provide, and then the data structures and algorithms will follow.

bmargulies
  • 97,814
  • 39
  • 186
  • 310
  • 2
    Sure. I can always try to derive the whole thing myself. :-) But for many well understood programming tasks, there are usually existing solutions that are already known and widely taught and adopted, no? – interstar Mar 05 '10 at 03:32
  • 1
    The book referenced in the comment to your question, or the famous book with the dragon on the cover, will take care of that. – bmargulies Mar 05 '10 at 12:31
1

Stroustrup implemented this in the first C++ compiler simply with one symbol table per scope, and a chaining rule that followed scopes outwards until a definition is found. How this works exactly depends on your precise semantics. Make sure you nail those down first.

Knuth in The Art of Computer Programming, Vol 1, gives an algorithm for a Cobol symbol table whereby scoping is done via links.

user207421
  • 305,947
  • 44
  • 307
  • 483