How to efficiently implement closures in LLVM IR?

Question

I started adding closures (lambdas) to my language that uses LLVM as the backend. I have implemented them for simple cases where they can be always inlined i.e. code for the closure definition itself doesn't need to be generated, as it is inlined where used.

But how to generate the code for a closure in case the closure isn't always inlined (for example, it is passed to another function that isn't inlined). Preferably, the call sites shouldn't care whether they are passed regular functions or closures and would call them as normal functions.

I could generate a function with a synthetic name, but it would have to take the referencing environment as an extra argument and that function couldn't just be passed to another function that doesn't know about the needed extra argument.

I have thought of one possible solution using LLVM's trampoline intrinsics, which "excise" a single parameter from a function, returning a pointer to a trampoline function that takes one less parameter. In this case, if the function generated for the closure took the referencing environment as the first parameter, I could excise it and get back a function that takes exactly as many parameters as the closure actually declares. Does this sound doable? Efficient? Are there any better solutions?

Code example:

def applyFunctionTo(value: Int, f: (Int) -> Int) = f(value)

def main() = {
  val m := 4;
  val n := 5;
  val lambda := { (x: Int) => x + m + n };
  applyFunctionTo(3, lambda)
}

Now, lets imagine that this wouldn't get inlined to def main() = 3 + 4 + 5, and that applyFunctionTo would possibly be compiled separately, and we can't change the call site there. With trampolining, I imagine the generated code would be something like this (expressed in pseudocode, * means pointer):

def main$lambda$1(env: {m: Int, n: Int}*, x: Int) = x + env.m + env.n
def main() = {
  m = 4
  n = 5
  env* = allocate-space-for {Int, Int}
  env = {m, n}
  tramp* = create-trampoline-for(main$lambda$1*, env*)
  return applyFunctionTo(3, tramp*)
  // release memory for env and trampoline if the lambda didn't escape
}

Does this seem right?

There is no difference between implementing closures and implementing objects with virtual methods. — SK-logic, Jan 03 '12 at 11:50
It's possible that you are right, however, the language will not have virtual methods (yet). At least it will have closures and a lot of other stuff before that. I might add some features in a dumb order because I'm just doing it for learning purposes, mostly. I only hope that something useful comes from it eventually. — Erkki Lindpere, Jan 03 '12 at 13:18
I meant there is no reason to invent anything new for closures: you can just do the same thing as, say, a C++ compiler is already doing. Chances are that it is already the most efficient thing to do. — SK-logic, Jan 03 '12 at 13:27
Good point, but a C++ compiler is probably complicated enough that it might not be the best one to learn from. For example I don't have inheritance or even any kind of polymorphism yet, so my functions and methods are all just functions in LLVM and all function/method calls are pretty simple. — Erkki Lindpere, Jan 03 '12 at 16:02
ok, then emitting a structure with the captured environment and a single function pointer (as the first element) should be sufficient for your lambda lifting implementation. This way you can bitcast your closure to a function pointer and pass it as the first argument to a call. — SK-logic, Jan 03 '12 at 16:08
Sorry, I don't get your meaning exactly. I added a code example above. There would be no lambda lifting, because I can't modify the call site (but maybe I misunderstand what you meant or miscommunicated in the original question). If the function pointer is in the environment, how does that help to get the environment from the function? Also maybe I wasn't clear before that I'd strongly prefer that call sites didn't change depending on whether a lambda or regular function is passed to them. And regular functions should be preferred. — Erkki Lindpere, Jan 04 '12 at 00:06
Ah, nevermind. I think I got what you meant now. It's similar to what the other answer suggested as an alternative. I think I'll give the trampolines a try, though. — Erkki Lindpere, Jan 04 '12 at 00:18

score 8 · Accepted Answer · answered Jan 03 '12 at 10:40

8

Sounds doable and efficient.

The alternative way, that does not need trampolines, is to define closure type as a pair of function pointer and pointer to environment ie stack pointer. In C calling convention the extra arguments are ignored so if you provide environment as last argument you can even use (function_ptr, null) as callback for regular function.

answered Jan 03 '12 at 10:40

zch

14,931
2
41
49

I'll try the trampolines for now, but it seems that the alternative of passing functions as a pair of pointers might actually be better in case the majority of functions passed around in programs are either closures that actually have free variables or methods of objects (where 'this' can be considered to be a free variable). I'm not sure how exactly the language will turn out, but I might consider switching to that representation later. – Erkki Lindpere Jan 04 '12 at 12:36

score 1 · Answer 2 · answered Jan 09 '12 at 16:27

A dumb idea would be that for each closure you generate a thread local structure to hold the required data (could be just a pointer to a local structure, or several pointers).

The creator of the closure is the responsible for setting the TLS variables and "saving" the state they had (to allow recursive call).

The user then calls the function normally, it's executed and use the environemnt.

After the call, the creator of the closure "restores" the original values into the TLS variables.

How to efficiently implement closures in LLVM IR?

2 Answers2

Linked