5

I am trying to create a function "add" that can be applied to a single argument, and subsequently another. I can't figure out how to represent this with LLVM IR, as I don't understand how to call a function with a single value, then save the value somewhere in memory and return another function that is applied to that val. I would need some sort of closure mechanism in LLVM.

I have searched for implementations of this in C so that I could view the emitted LLVM through clang, but the solutions I found were wildly complex, so I thought I might just investigate LLVM directly.

This would be the uncurried version

define i8 @add(i8 %a, i8 %b) {
entry:
  %res = add i8 %a, %b
  ret i8 %res
}

And somehow I'd like for add(1) to return an i8 (i8) type. I figure I'll have to split the function up somehow.

ps. I am looking into this because I'm working on a compiler for a small functional language, so I'm looking for any advice concering the implementation of partial application/currying in compiler design in general.

Update: I now have the following code working, but it's a quite complicated and I don't think it'll be easy to generate automatically

declare i32 @printf(i8* noalias nocapture, ...)

define { i8, i8 (i8, i8) * } @add1(i8 %a) {
  ; allocate the struct containing the supplied argument 
  ; and a function ptr to the actual function
  %nextPtr = alloca { i8, i8 (i8, i8) * }
  store { i8, i8 (i8, i8) * } { i8 undef, i8 (i8, i8) * @add2 }, { i8, i8 (i8, i8) * } * %nextPtr
  %next0 = load { i8, i8 (i8, i8) * } * %nextPtr

  ; insert the supplied arg into the struct
  %next1 = insertvalue { i8, i8 (i8, i8) * } %next0, i8 %a, 0
  ret { i8, i8 (i8, i8) * } %next1
}

define i8 @add2(i8 %a, i8 %b) {
  %res = add i8 %a, %b
  ret i8 %res
}

define i8 @main() {
  ; call add(35) resulting in 'fn' of type {35, &add2}
  %res1 = call { i8, i8 (i8, i8) * } @add1(i8 35)

  ; get the arg of the first call, ie element 0 of the resulting struct
  %arg = extractvalue { i8, i8 (i8, i8) * } %res1, 0
  ; similarily get the function ptr
  %fn = extractvalue { i8, i8 (i8, i8) * } %res1, 1

  ; apply the argument to the function
  %res = call i8 %fn(i8 %arg, i8 30)

  ; print result  
  %ptr = alloca i8
  store i8 %res, i8* %ptr
  call i32 (i8*, ...)* @printf(i8* %ptr)

  ret i8 0
}
altschuler
  • 3,694
  • 2
  • 28
  • 55
  • 2
    Are you set on the `i8 (i8)` type or was that an example? To actually get a honest-to-LLVM function pointer you'd have to dynamically generate machine code at run time. It's easier if you create an aggregate that stores already-supplied arguments together with a function pointers. AFAIK that's how Rust implements closures. I'd tell you to look at their output, but in my limited experience that IR is also rather hard to read. –  Apr 12 '14 at 18:24
  • That worked (see above), but would be much easier if I could get the `i8 (i8)` type back from the first call :D – altschuler Apr 12 '14 at 19:36
  • 2
    I really doubt it would be easier. As I said, to get an actual function pointer, you'd need to generate thunks for each partial application. That entails generating code to (1) allocate memory, (2) set permissions accordingly, (3) write out machine code that's not only platform-specific but also argument-type-specific, and possibly some more. Even if you use libffi for that, you still need to communicate with libffi and embed all that into LLVM IR. Mechanically generating fat closure pointers won't be trivial, but should be feasible. –  Apr 12 '14 at 19:42
  • That makes sense, I'll look into using this method. Thanks – altschuler Apr 12 '14 at 19:59
  • I imagine you've solved this in the meantime, just a note to say I just completed this in a clean room and see that I ended up with what @delnan indicated this is how Rust works. Let me know, happy to share. – Frank C. Sep 07 '17 at 18:53
  • @FrankC. I've left this project long ago, but I'd love to see it your work! – altschuler Sep 07 '17 at 22:02
  • 1
    OK, I'll probably setup a gist to go through it with pseudo code or something. It is a combo of compile time constructs along with RTL support but does exploit the notion of a function reference that has storage for arguments which will keep accumulating to the function signature threshold before invoking the function. This is how I get the curry/partial capability. As a side note I use a slight variation for lambda/closures. – Frank C. Sep 07 '17 at 22:21

1 Answers1

4

I've written this example C code to emulate what my functional language compiler supports for 'partial application' (lambda calculus). I don't emit 'C' code but, instead, directly emit to LLVM-IR. You can see the LLVM-IR perspective by just emitting from the sample source:

clang -S -emit-llvm partial.c

The compiler is triggered to emit the partial machinery in LLVM-IR when it passes over the AST node reflecting a parenthetically wrapped (give or take a few additional details) expression.

Hope this helps.

Frank C.
  • 7,758
  • 4
  • 35
  • 45