82

I don't understand the difference between LLVM bitcode and Java bytecode, what are they?

Edit: by "what are they" I mean the differences between LLVM bitcode and Java bytecode, not what LLVM and Java are.

Hari
  • 24
    On behalf of those of us who actually understood what you asked, I would like to apologize for all the stupid answers you got. :-( – Ken Apr 17 '09 at 17:27
  • 20
    If the question is worded ambiguously, it is the question that is at fault, not the answers. Sorry, but whoever calls the answers below "stupid" should reread everything in this thread. When I answered, your question read like "difference between LLVM and Java". Ken's comment sounds quite arrogant. – Johannes Schaub - litb Apr 18 '09 at 14:23
  • 3
    Love the comment, Ken. Sorry, AcidZombie24, for some of these answers. – user254492 Mar 01 '10 at 10:05
  • More information about LLVM bitcode is given here: https://llvm.org/docs/BitCodeFormat.html – Hari Mar 16 '23 at 13:12

4 Answers

92

Assuming you mean JVM rather than Java:

LLVM is a low-level, register-based virtual machine. It is designed to abstract the underlying hardware and to draw a clean line between a compiler's front-end (parsing, etc.) and its back-end (machine code generation).

The JVM is a much higher-level, stack-based virtual machine. It provides garbage collection and has the notion of objects, virtual method calls, and more. Thus, the JVM provides much higher-level infrastructure for language interoperability (much like Microsoft's CLR).

(It is possible to build these abstractions over LLVM just as it is possible to build them on top of C.)

nimrodm
  • 2
    LLVM has garbage collector support [more here](http://llvm.org/docs/GarbageCollection.html) – Robert Zaremba Mar 16 '11 at 16:31
  • 16
    @Robert Zaremba Have you ever tried to implement garbage collection with LLVM? I have. You basically must do it all yourself (they don't even provide a simple garbage collector, though there is an outdated example floating around). LLVM just provides intrinsics for your code to hook into the GC. As opposed to the JVM, which provides a built-in mandatory garbage collector which automatically works on all objects. – mgiuca May 13 '11 at 06:54
  • Isn't one of the differences too that JVM is almost like an interpreter, in that the user needs to have it installed to execute programs, while LLVM is used to generate the architecture specific executables in advance (I may be very mistaken, just started learning about this)? – User May 12 '20 at 07:57
  • (Probably I'm describing JIT vs. AOT, where JVM is more commonly used for JIT and LLVM AOT?) – User May 12 '20 at 08:00
32

It's too bad this question got off on the wrong foot. I came to it looking for a more detailed comparison.

The biggest difference between JVM bytecode and LLVM bitcode is that JVM instructions are stack-oriented, whereas LLVM bitcode is not. Rather than loading values into registers, JVM bytecode pushes values onto an operand stack and computes results from there. I believe that an advantage of this is that the compiler doesn't have to allocate registers, but I'm not sure.
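
To make the stack-oriented model concrete, here is a minimal sketch of a toy stack machine in Python. The opcode names mimic JVM mnemonics (`iload`, `iadd`, `imul`, `ireturn`), but the interpreter `run_stack` and the tuple encoding are assumptions made up for illustration; this is not the JVM's actual class-file format or interpreter.

```python
def run_stack(program, local_vars):
    """Evaluate a toy stack program; opcode names loosely mimic JVM mnemonics."""
    stack = []
    for op, *args in program:
        if op == "iload":        # push local variable number args[0]
            stack.append(local_vars[args[0]])
        elif op == "iadd":       # pop two operands, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "imul":       # pop two operands, push their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "ireturn":    # pop the result and stop
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise ValueError("program ended without ireturn")


# (a + b) * c with locals a=2, b=3, c=4 (hypothetical example program)
stack_program = [
    ("iload", 0), ("iload", 1), ("iadd",),
    ("iload", 2), ("imul",), ("ireturn",),
]
stack_result = run_stack(stack_program, [2, 3, 4])
```

Note that no instruction names its operands: `iadd` and `imul` implicitly take whatever is on top of the stack, which is why the producer of such code never has to pick registers.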

LLVM bitcode is closer to machine-level code, but isn't bound to a particular architecture. For instance, I think that LLVM bitcode can make use of an arbitrary number of logical registers. Maybe someone more familiar with LLVM can speak up here?
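
As a rough illustration of "an arbitrary number of logical registers", here is a small Python sketch of a register-style evaluator. The `%name` register spelling imitates LLVM's textual IR, but `run_registers`, the tuple encoding, and the two supported operations are hypothetical and chosen only for this example.

```python
def run_registers(program, inputs, result_reg):
    """Evaluate a toy three-address program over named virtual registers."""
    ops = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}
    regs = dict(inputs)  # the register file is just a dict, so it never runs out
    for dest, op, lhs, rhs in program:
        if dest in regs:
            # Each virtual register is written exactly once in this sketch.
            raise ValueError(f"register {dest} assigned twice")
        regs[dest] = ops[op](regs[lhs], regs[rhs])
    return regs[result_reg]


# (a + b) * c, with every intermediate value in its own fresh register
reg_program = [
    ("%t0", "add", "%a", "%b"),
    ("%t1", "mul", "%t0", "%c"),
]
reg_result = run_registers(reg_program, {"%a": 2, "%b": 3, "%c": 4}, "%t1")
```

Every instruction names its operands and its destination explicitly; deciding which of these virtual registers end up in real CPU registers is left to a later stage.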

Jonathan Leffler
Owen
  • 1
    "I believe that an advantage of this is that the compiler doesn't have to allocate registers, but I'm not sure.". Not sure about that. ISTR the advantage is that stack-based is easier to verify. – J D Apr 26 '10 at 18:44
  • 1
    "I believe that an advantage of this is that the compiler doesn't have to allocate registers, but I'm not sure." - An LLVM-based compiler does not have to deal with register allocation either; the IR is more a form of SSA. To run efficiently, both LLVM and the JVM **must** do it eventually, because memory is in general much, much slower than CPU registers. – Maciej Piechotka Nov 08 '10 at 19:39
  • 9
    Loading values onto the stack is a **disadvantage** from a performance point of view. See [this](http://static.usenix.org/events/vee05/full_papers/p153-yunhe.pdf) PDF. – om-nom-nom Apr 08 '12 at 13:55
  • what's the diff between bitcode and bytecode? do they mean the same or is there something else? – asgs Sep 27 '17 at 08:51
  • LLVM IR (intermediate representation) assumes you have infinite registers to work with; the LLVM back end maps those registers to physical registers depending on the architecture you are targeting. – wfbarksdale Mar 13 '18 at 03:44
4

JVM bytecode and LLVM bitcode have similarities and differences. In terms of similarities, both are intermediate program representations. Thus, they can represent programs written in different programming languages. For example, there are front-ends that translate Java, Clojure, Scala, etc. into JVM bytecode, and there are front-ends that translate C, C++, Swift, Julia, Rust, etc. into LLVM bitcode.

That said, JVM bytecode and LLVM bitcode are very different in purpose and in design. Historically, JVM bytecode was designed to be distributed over a network, e.g., the Internet, and interpreted on the local computer by a virtual machine. That's one of the reasons why it is stack-based: stack-based bytecode is usually smaller.

Perhaps, in its early days, LLVM bitcode was also meant to be interpreted, but if so, its purpose has changed over time. LLVM bitcode is a program representation meant to be analyzed and optimized. It is encoded in Static Single Assignment (SSA) form, which is more like a mathematical abstraction of a program than actual, executable assembly. For instance, the LLVM IR has instructions such as phi functions that have no direct equivalent in typical computer architectures. Thus, although it is possible to interpret LLVM bitcode (a tool called lli, part of the LLVM toolchain, does that), that's not the most important way in which the LLVM IR is used.
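
To show what a phi function does, here is a small Python sketch of a toy SSA interpreter. The block layout, the instruction tuples, and the names `run_ssa` and `max_via_phi` are all assumptions invented for this example; real LLVM IR would write the merge as something like `%max = phi i32 [ %a, %then ], [ %b, %else ]`. The sketch also evaluates phi nodes one at a time, which is a simplification that only works because each block here has a single phi.

```python
def run_ssa(blocks, entry, inputs):
    """Interpret a toy SSA control-flow graph that supports phi nodes."""
    env = dict(inputs)
    prev, label = None, entry
    while True:
        for instr in blocks[label]:
            kind = instr[0]
            if kind == "phi":      # ("phi", dest, {predecessor_label: source})
                _, dest, incoming = instr
                # The value chosen depends on which block we arrived from.
                env[dest] = env[incoming[prev]]
            elif kind == "gt":     # ("gt", dest, lhs, rhs)
                _, dest, lhs, rhs = instr
                env[dest] = env[lhs] > env[rhs]
            elif kind == "br":     # ("br", cond, then_label, else_label)
                _, cond, then_label, else_label = instr
                prev, label = label, (then_label if env[cond] else else_label)
                break
            elif kind == "jmp":    # ("jmp", target_label)
                prev, label = label, instr[1]
                break
            elif kind == "ret":    # ("ret", source)
                return env[instr[1]]
            else:
                raise ValueError(f"unknown instruction: {kind}")
        else:
            raise ValueError(f"block {label} has no terminator")


def max_via_phi(a, b):
    """Compute max(a, b) where the two branches are merged by a phi node."""
    blocks = {
        "entry": [("gt", "%cond", "%a", "%b"), ("br", "%cond", "then", "else")],
        "then": [("jmp", "merge")],
        "else": [("jmp", "merge")],
        "merge": [("phi", "%max", {"then": "%a", "else": "%b"}), ("ret", "%max")],
    }
    return run_ssa(blocks, "entry", {"%a": a, "%b": b})
```

The phi node lets `%max` be assigned exactly once even though its value depends on the path taken. No hardware instruction "looks back" at the previous block like this, so a back end has to lower phis into ordinary moves.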

-4

Java is a programming language that uses the JVM as a means of "Just in Time" (JIT) execution, whereas LLVM is a compiler construction kit aimed at developing new languages and front-ends for existing languages. LLVM does have a JIT engine, but you need not use it. Instead of using JIT execution, you could emit LLVM assembly, bitcode, or platform-specific assembly.

Edd Barrett