
I am new to the LLVM compiler infrastructure and have the following thought. Clang is the LLVM front end for C/C++, and rustc plays the same role for the Rust programming language. Both can emit LLVM IR, and the emitted IR can be compiled into an executable.

My question: is it possible to link code written in different programming languages? An example is shown below.

/* Code in C */
#include <stdio.h>

/* Declared here, defined in Rust */
int add(int, int);

int main(void)
{
  printf("%d\n", add(5, 6));
  return 0;
}

The function is defined in Rust, for example:

// Code in Rust
fn main()
{
  println!("{}", add(5, 6));
}

fn add (x: i32, y: i32) -> i32
{
  x + y
}

Once the IR is generated from both the source files, is it possible to link them and create a single application?

I am just curious to know if this works, please let me know.

mcarton
Bharadwaj

2 Answers


Short answer: Yes.


Long answer: Yes, as long as some requirements are fulfilled.

There are two kinds of compatibility: API (Application Programming Interface) and ABI (Application Binary Interface). Essentially, the API dictates whether your program compiles, whereas the ABI dictates whether it links, loads and runs.

Since Rust has a C FFI, it can emit code that follows the C ABI of the target platform and can therefore interoperate with C. This is evident in that a Rust binary can call a C library.
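As a minimal sketch of that FFI direction (Rust calling into a C library), the snippet below declares the C standard library's `abs` and calls it. The wrapper name `c_abs` is made up for illustration. The `extern "C"` block is written in the style accepted by Rust editions up to 2021; the 2024 edition requires it to be spelled `unsafe extern "C"`.

```rust
use std::os::raw::c_int;

// Declaration of the C standard library's `abs`. The C library is already
// linked into ordinary Rust programs on common platforms, so this
// particular function needs no extra linker flags.
extern "C" {
    fn abs(input: c_int) -> c_int;
}

// Hypothetical safe wrapper (the name `c_abs` is ours, for illustration).
pub fn c_abs(x: i32) -> i32 {
    // Calling a foreign function is `unsafe`: the compiler cannot check
    // that the declaration above matches the real C signature.
    unsafe { abs(x) }
}
```

The compiler trusts the declared signature, which is why agreeing on the ABI is the programmer's responsibility rather than something LLVM enforces.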

If you take the LLVM IR of that Rust binary and the LLVM IR of that C library, merge both together, and use LLVM to produce a new binary, then you'll get a single binary (with no external dependency on the library).

So, the "only" requirement is that your two pieces of code must be able to link/load/run independently first.
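For the example in the question, the Rust side might look like the sketch below. It assumes the Rust file provides no `main` (the C file supplies it) and that `add` is exported with the C ABI under an unmangled name. The build steps in the trailing comment are an assumption rather than a tested recipe: the file names are hypothetical, and the LLVM version used by rustc has to be compatible with the one behind clang and `llvm-link`.

```rust
// Rust side of the question's example, exported with the C ABI.
// `extern "C"` selects the platform's C calling convention, and
// `#[no_mangle]` keeps the symbol name `add`, which is what the C
// declaration `int add(int, int);` will look for.
#[no_mangle]
pub extern "C" fn add(x: i32, y: i32) -> i32 {
    // wrapping_add has no overflow panic path, so the generated IR
    // should not need Rust's panic machinery at link time.
    x.wrapping_add(y)
}

// Assumed build steps (file names are hypothetical):
//   rustc --crate-type=lib --emit=llvm-ir add.rs    # writes add.ll
//   clang -S -emit-llvm main.c -o main.ll
//   llvm-link main.ll add.ll -o merged.bc
//   clang merged.bc -o app
```

Without `extern "C"` and `#[no_mangle]`, the C code would most likely fail to link, because the Rust function would carry a mangled name and the Rust calling convention.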


Another way to obtain a single binary, independent of LLVM, is static linking; in Rust you can, for example, statically link against the musl implementation of the C standard library. The main advantage of merging at the LLVM IR level is that you can then run LLVM optimization passes on the merged IR and therefore benefit from cross-language inlining (and other optimizations).

Matthieu M.
  • I think the long answer should be sometimes, not necessarily yes because an ABI is required in order to have interop. An ABI certainly requires some work to implement so you can't simply link two different language's IR outputs. Other than that I think you gave a pretty good answer to the question. – Bennet Leff Jul 07 '16 at 15:22
  • @BennetLeff: That's exactly what the long answer actually says: if both pieces of code can load/run when in separate libraries, then you can merge their IRs and have them run (since you validated the ABI). Or do you mean that I should edit something because it's not clear enough? – Matthieu M. Jul 07 '16 at 15:24
  • I think your answer can remain as is. However, to me it would be more clear if instead of "Long answer: yes..." it said "Long answer: sometimes..." – Bennet Leff Jul 07 '16 at 15:37
  • @BennetLeff: Updated; I kept the yes to make it clear it is possible, but immediately qualified that it was not automatic. – Matthieu M. Jul 07 '16 at 16:18
  • @Matthieu M. But what about languages that do not have an FFI, say linking Go and Rust, which both have LLVM front ends? Also, if possible, can you please show me the above example working? – Bharadwaj Jul 08 '16 at 09:17
  • @Bharadwaj: If the languages do not have FFI, then no, it is not possible. In essence, merging at LLVM IR level is only an optimization compared to building a static library/binary from multiple static libraries coming from different languages. – Matthieu M. Jul 08 '16 at 09:22
  • @MatthieuM.: Thank you for putting it this way. This is correct and concise. – Bennet Leff Jul 08 '16 at 17:13

Firstly, Rust and C can talk, but only through Rust's FFI (Foreign Function Interface). For very basic functions, I imagine it would be possible to compile both languages to LLVM IR and get something working, but we're talking hello-world-length programs (and maybe not even that). In general, there must be some sort of ABI for what you're suggesting. However, even with an ABI, the implementation is done at the front-end level.

Concisely put, LLVM can't represent all language-specific constructs. So you can't just link two programs' LLVM IR and hope it works. There must be some work done at the front end to ensure compatibility between the two languages.
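One illustration of that front-end work, as a sketch: Rust does not specify the layout of an ordinary struct, so a type shared with C has to be annotated so that the Rust front end lays it out the way a C compiler would. The names `Point` and `point_sum` are hypothetical and not taken from any real library.

```rust
// `#[repr(C)]` asks the Rust front end to lay the struct out as a C
// compiler would; without it, field order and padding are unspecified.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

// Hypothetical C-callable function; the matching C prototype would be
//   struct Point { int x; int y; };
//   int point_sum(struct Point p);
#[no_mangle]
pub extern "C" fn point_sum(p: Point) -> i32 {
    p.x.wrapping_add(p.y)
}
```

The annotations are resolved by rustc before any IR is emitted, which is the sense in which compatibility is settled at the front end and not by LLVM itself.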

Bennet Leff