46

I have heard that Google App Engine can run any programming language that can be transformed to Java bytecode via its JVM. I wondered if it would be possible to convert LLVM bytecode to Java bytecode, as it would be interesting to run languages that LLVM supports in the Google App Engine JVM.

dreamcrash
Ben Page
  • AFAIK LLVM is a hardware/OS abstraction layer library rather than a bytecode virtual machine. It provides some of the same advantages but needs to be compiled from source for each target platform. – Peter Lawrey Feb 08 '11 at 15:09
  • 3
    @Peter: No, you can interpret it and JIT-compile it (`lli`). But yes, the instructions are way more low-level and it's not really similar to other virtual machines. –  Feb 08 '11 at 15:15
  • @Ben, please reconsider the accepted answer in light of what I mention in http://stackoverflow.com/a/13540256/304330, thanks. – Big Rich Jul 04 '13 at 15:38

4 Answers

35

It now appears possible to convert LLVM IR (bitcode) to Java bytecode, using LLJVM.

There is an interesting Disqus comment (21/03/11) from Grzegorz of kraytracing.com which explains, along with code, how he has modified LLJVM's Java class output routine to emit non-monolithic Java classes which agree in number with the input C/C++ modules. He suggests that his technique seems to avoid the excessively long 'compound' Java Constructor method argument signatures usually generated by LLJVM, and he provides links to his modifications and examples.

Although LLJVM doesn't look like it's been in active development for a couple of years now, it's still hosted on GitHub, and some documentation can still be found at its former repository at Google Code:

LLJVM @ Github
LLJVM documentation @ GoogleCode

I also came across the 'Proteuscc' project, which likewise utilises LLVM to output Java bytecode (it suggests that this is specifically for C/C++, although I assume the project could be modified or fed LLVM Intermediate Representation (IR)). From http://proteuscc.sourceforge.net:

The general process of producing a Java executable with Proteus then can be summarised as below.

  1. Generate human readable representation of the LLVM intermediate representation (ll file)
  2. Pass this ll file as an argument to the proteus compilation system
  3. The above will produce a Java jar file which can be executed or used as a library

I've extended a bash script to compile the latest versions of LLVM and Clang on Ubuntu; it can be found as a GitHub Gist, here.

[UPDATE 31/03/14] - LLJVM seems to have been dead for some while; however, Howard Chu (https://github.com/hyc) looks to have made LLJVM compatible with the latest version of LLVM (3.3). See Howard's LLJVM-LLVM3.3 branch at GitHub, here.

Big Rich
8

I doubt you can, at least not without significant effort and run-time abstractions (e.g. building half a von Neumann machine to execute certain opcodes). LLVM bitcode allows the full range of low-level unsafe "do what you want but we won't clean up the mess" features, from direct, raw, constructor-free memory allocation up to completely unchecked casts (real casts, not conversions): you can take an i32 and turn it into a %stuff * with `inttoptr` if you wish. Also, JVMs are heavily geared towards objects and methods, while the LLVM guys are lucky they have function pointers and structs.

On the other hand, it seems that C can be compiled to Java bytecode and LLVM bitcode can be compiled to JavaScript (although many features, e.g. dynamic loading and stdlib functions, are lacking), so it should be possible, given enough effort.
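To make the "run-time abstraction" point a bit more concrete, here is a minimal sketch of how a translator might emulate LLVM's flat, untyped memory on the JVM: pointers become plain `int` offsets into one big byte buffer. The `LinearMemory` class and its method names are hypothetical, invented for illustration; this is not how LLJVM or any other named tool is implemented.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class LinearMemoryDemo {
    public static void main(String[] args) {
        LinearMemory mem = new LinearMemory(1024);
        int p = mem.malloc(4);          // roughly what a translated malloc call would return
        mem.storeI32(p, 0x11223344);    // like an LLVM "store i32" through that pointer
        byte low = mem.loadI8(p);       // the same bytes read back through an i8* view
        // prints "11223344 44" (the heap is little-endian)
        System.out.println(Integer.toHexString(mem.loadI32(p)) + " " + Integer.toHexString(low));
    }
}

/** Hypothetical helper: a flat, byte-addressable heap standing in for raw memory. */
final class LinearMemory {
    private final ByteBuffer heap;
    private int next = 8; // bump pointer; address 0 stays unused so it can act as null

    LinearMemory(int sizeInBytes) {
        heap = ByteBuffer.allocate(sizeInBytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    /** Raw, constructor-free allocation: returns an integer "address" into the heap. */
    int malloc(int bytes) {
        if (bytes < 0) throw new IllegalArgumentException("negative size");
        int aligned = (bytes + 7) & ~7; // round up to 8-byte alignment
        if (aligned > heap.capacity() - next) throw new OutOfMemoryError("linear heap exhausted");
        int address = next;
        next += aligned;
        return address;
    }

    // Loads and stores carry no type information: any address can be read as any width.
    void storeI32(int address, int value) { heap.putInt(address, value); }
    int loadI32(int address) { return heap.getInt(address); }
    void storeI8(int address, byte value) { heap.put(address, value); }
    byte loadI8(int address) { return heap.get(address); }
}
```

With this layout an unchecked cast costs nothing (an address is just an `int`), but every load and store goes through the buffer, and none of it maps onto ordinary Java objects, which is the mismatch described above.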

Community
  • 1
  • 1
  • 1
    So basically LLVM bitcode is far closer to assembly than Java Bytecode so I would have to somehow 'reclaim' all the information 'lost' when a program is converted to the lower-level representation if I wanted to run it in a JVM. Which I guess is pretty impossible. – Ben Page Feb 08 '11 at 15:37
  • 2
    @Ben: Yes, it's pretty much portable (well, kind of) assembly... in an even more low-level fashion than C. Not only would you have to do quite a lot of work when reverse-engineering e.g. Ada code compiled with `llvm-gcc`; C and C++, at least, can do many things Java bytecode simply doesn't permit (for better or worse). Likewise, LLVM permits these things but the JVM doesn't. –  Feb 08 '11 at 15:42
  • 1
    The classic example I go to: `char *vga = (char *) 0xB8000`. LLVM can handle that just fine. Pretty sure JVM bytecode cannot. – Qix - MONICA WAS MISTREATED Sep 07 '16 at 09:17
  • You can actually do any raw memory operations (and raw casts) you like in Java via sun.misc.Unsafe. Any bytecode calling Unsafe methods is, of course, accessing the raw memory via native code (JNI), so any LLVM constructs that were translated to bytecode which calls Unsafe methods are essentially doing the memory operations via C. It would be extremely clunky doing some things via Unsafe. But one can imagine a more extensive library of functions than sun.misc.Unsafe specifically designed for supporting LLVM (or other C-like) memory ops, also implemented by cross-platform native calls. – barneypitt Oct 05 '22 at 05:16
6

Late to the discussion: Sulong executes LLVM IR on the JVM. It creates executable nodes (which are Java objects) from the LLVM IR instead of converting the LLVM IR to Java bytecode. These executable nodes form an AST interpreter. You can check out the project at https://github.com/graalvm/sulong or read a paper about it at http://dl.acm.org/citation.cfm?id=2998416. Disclaimer: I'm working on this project.
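As a rough illustration of the "executable nodes forming an AST interpreter" idea, here is a minimal sketch. The class names (`ExecNode`, `ConstNode`, and so on) are hypothetical and chosen for this example only; they are not Sulong's or Truffle's actual API, and a real implementation handles types, memory, and control flow that are left out here.

```java
final class AstInterpreterDemo {
    /** Builds the node tree for: %t = mul i32 %a, %b ; %r = add i32 %t, 5 */
    static ExecNode buildExample() {
        ExecNode product = new MulNode(new ReadLocalNode(0), new ReadLocalNode(1));
        return new AddNode(product, new ConstNode(5));
    }

    public static void main(String[] args) {
        int[] frame = {6, 7}; // slot 0 holds %a, slot 1 holds %b
        System.out.println(buildExample().execute(frame)); // prints 47
    }
}

/** Hypothetical base class: every IR operation becomes an object that can execute itself. */
abstract class ExecNode {
    abstract int execute(int[] frame);
}

final class ConstNode extends ExecNode {
    private final int value;
    ConstNode(int value) { this.value = value; }
    @Override int execute(int[] frame) { return value; }
}

final class ReadLocalNode extends ExecNode {
    private final int slot;
    ReadLocalNode(int slot) { this.slot = slot; }
    @Override int execute(int[] frame) { return frame[slot]; }
}

final class AddNode extends ExecNode {
    private final ExecNode left, right;
    AddNode(ExecNode left, ExecNode right) { this.left = left; this.right = right; }
    @Override int execute(int[] frame) { return left.execute(frame) + right.execute(frame); }
}

final class MulNode extends ExecNode {
    private final ExecNode left, right;
    MulNode(ExecNode left, ExecNode right) { this.left = left; this.right = right; }
    @Override int execute(int[] frame) { return left.execute(frame) * right.execute(frame); }
}
```

No Java bytecode is generated for the input program here: the tree of objects is the program, and running it means calling `execute` on the root.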

box
0

Read this: http://vmkit.llvm.org/. I am not sure that it will help you but it seems to be relevant.

Note: This project is no longer maintained.

adelarsq
AlexR
  • 6
    It's the reverse (allows building LLVM-based VMs that run e.g. Java/JVM languages on LLVM; OP wants to run LLVM languages on the JVM). –  Feb 08 '11 at 15:43
  • Fwiw, following that link: "The VMKit project is retired." – michael Oct 07 '15 at 22:43