
First of all, this is not your standard "I want to compile Java code to machine code" question.

I'm working on a compiler, written in Java, that translates a certain language (in my case, Brainfuck) to x86 assembly. After that, I currently plan to use NASM and GCC to produce machine code.

Seeing as the HotSpot JVM can compile Java bytecode to machine code, I assume there is some mechanism available to compile source code of type A to machine code.

Is there any way to use this in a compiler written in Java? My main goal is to explore the possibility of writing a compiler in Java without relying on external programs, for example GCC and NASM, being available on the path. I do need a C Compiler because I'm linking with the cstdlib as I'm using those functions in my x86 Assembly code.

To clarify, I'm doing the following currently:

  1. Write x86 assembly to a `bf.asm` file.
  2. Transform the assembly to object code with `nasm -f win32 bf.asm`.
  3. Link the object code with the Windows OS and cstdlib libraries with `gcc -o bf bf.obj`.
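
The steps above could be driven from Java roughly as follows. This is only a sketch of the current external-tool approach, assuming `nasm` and `gcc` are on the path; the class and method names (`ExternalToolchain`, `build`, and so on) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: runs the three steps with the external tools.
final class ExternalToolchain {

    // Step 2 command line, as given in the list above.
    static List<String> assembleCommand(String base) {
        return List.of("nasm", "-f", "win32", base + ".asm");
    }

    // Step 3 command line, as given in the list above.
    static List<String> linkCommand(String base) {
        return List.of("gcc", "-o", base, base + ".obj");
    }

    // Runs one external tool in workDir and fails on a non-zero exit code.
    static void run(List<String> command, Path workDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .directory(workDir.toFile())
                .inheritIO()
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException(String.join(" ", command)
                    + " exited with code " + exitCode);
        }
    }

    static void build(String asmSource, Path workDir, String base)
            throws IOException, InterruptedException {
        Files.writeString(workDir.resolve(base + ".asm"), asmSource); // step 1
        run(assembleCommand(base), workDir);                           // step 2
        run(linkCommand(base), workDir);                               // step 3
    }
}
```

The two `run` calls are exactly the dependency I would like to get rid of.
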

I'm looking for ways to replace `nasm` and `gcc` in steps 2 and 3 and do that work in Java code instead.

skiwi
  • Can the downvoter please explain why this question does not show any research effort, making it unclear or not useful? – skiwi Nov 18 '16 at 12:20
  • Well @skiwi, some basic research would tell you what a compiler or an assembler does. And from that, the answer to your question would be obvious. – Stephen C Nov 18 '16 at 12:24
  • @StephenC I know what compilers and assemblers do. Please get to the point instead. – skiwi Nov 18 '16 at 12:25
  • @StephenC I think he knows how an assembler works, given he's written [a Brainfuck interpreter in ASM](http://codereview.stackexchange.com/q/147023/52915). – Mast Nov 18 '16 at 12:25
  • I don't understand this point: binary executables, in whatever format (PE, ELF, MACH, raw, ...), are just files with a more or less complex structure. x86 machine code is just a stream of bytes. Each assembly instruction is easily mapped to its opcode bytes (once you have carefully read volume 2 of the Intel manual). Instead of generating a text file you can generate a binary file. So doesn't this question just boil down to *How do I write a binary file in Java?* – Margaret Bloom Nov 18 '16 at 12:31
  • @skiwi I'm not so sure that you do. You're asking how to use a Java bytecode JITer to assemble x86 asm. It makes absolutely no sense. – Jonathon Reinhart Nov 18 '16 at 12:33
  • @JonathonReinhart No, I'm not asking that. I know there is a mechanism available in HotSpot for a source language (Java bytecode) to machine code. I'm wondering if it's of any use to me for trying to not depend on nasm and gcc. – skiwi Nov 18 '16 at 12:42
  • @skiwi - Actually, that is the same thing. Pretty much. But the answer is that it won't work. – Stephen C Nov 18 '16 at 12:57
  • Java 9 will introduce [JEP 243: Java-Level JVM Compiler Interface](http://openjdk.java.net/jeps/243) which suggests that the *long term goal* is to implement the compiler in Java, plugged into the JVM via this interface. Once this has been done (Java 10 or 11?), it would also open the theoretical possibility to invoke its code from other Java code, if you manage to arrange the input in the expected form and care yourself about the persistent form of the code. Don’t expect too much. – Holger Nov 18 '16 at 16:52
  • You can avoid a dependency on NASM if you write asm that the GNU assembler can read. Either `.intel_syntax noprefix` (still using gas directives), or the default AT&T syntax that compilers output by default (e.g. with `gcc -O3 -S`). Using the system compiler to handle assembling and linking is a good idea, otherwise you have to implement all that yourself. But obviously you *can* do so in a single Java program if you want to, like Margaret explained. Note that at no point do you actually need a C compiler, just an assembler and linker. – Peter Cordes Nov 19 '16 at 00:37

2 Answers


> Seeing as the HotSpot JVM can compile Java bytecode to machine code, I assume there is some mechanism available to compile source code of type A to machine code.

This does not follow.

The JIT compiler compiles Java bytecodes to native code. It does not understand anything other than Java bytecodes. And bytecodes are not "source code". (They are actually a form of machine code ... for an abstract computer ... a Java virtual machine.)

In short, there is no mechanism available as part of the JVM for compiling source code to machine code.

And, as it turns out, the JIT compiler is not designed for generating native code in files that something else could use. The native code is in the form of raw machine instructions in blocks of memory. No symbol tables. No relocation information. Probably full of hard-wired calls into other parts of the JVM. Basically it is designed for execution in the currently running JVM, not for anything else.

> Is there any way to use this in a compiler written in Java?

The JIT compiler is not applicable to your problem ... unless you write your compiler to generate valid Java bytecodes. And if you did that, then the JVM could run your code, and the JIT compiler would at some point compile your bytecodes to native code.


Bottom line: if your goal is to generate native code that can be run as or linked to a separate executable,

  • the JIT compiler is of no use to you, but
  • you could use the JVM including the JIT compiler as your execution platform, by generating bytecodes, and
  • you could also use ordinary Java programming to implement your compiler or assembler, including a component that generates and emits native code in a format that is appropriate to your needs.
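
To illustrate the last point, here is a minimal sketch of such a component. It is ordinary Java that produces x86 machine code bytes directly. The class `CodeEmitter` and its methods are hypothetical names, and only two instructions are covered: `mov eax, imm32` (opcode `B8` followed by the little-endian immediate) and `ret` (opcode `C3`).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical emitter: appends raw x86 (32-bit) instruction bytes to a buffer.
final class CodeEmitter {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // mov eax, imm32: opcode B8, then the immediate in little-endian byte order.
    CodeEmitter movEaxImm32(int value) {
        out.write(0xB8);
        for (int shift = 0; shift < 32; shift += 8) {
            out.write((value >>> shift) & 0xFF);
        }
        return this;
    }

    // ret (near return): opcode C3.
    CodeEmitter ret() {
        out.write(0xC3);
        return this;
    }

    byte[] toByteArray() {
        return out.toByteArray();
    }

    // Writes the raw bytes only; no object-file or executable headers are added.
    void writeTo(Path file) throws IOException {
        Files.write(file, toByteArray());
    }
}
```

For example, `new CodeEmitter().movEaxImm32(42).ret().toByteArray()` yields the six bytes `B8 2A 00 00 00 C3`. Note that raw bytes like these are not yet a runnable program: you would still have to wrap them in whatever object or executable format your platform expects.
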
Stephen C
  • Thank you for your answer. I just updated the post to include a list of what I am doing right now, I realized I had forgotten to add this and the question thus may have looked incomplete, sorry for that. – skiwi Nov 18 '16 at 23:13
  • You could implement all of those steps as a pure Java program, but the JIT offers nothing to help you with writing the required code. – Stephen C Nov 18 '16 at 23:46

> Is it possible to compile to machine code in Java without an external program?

Yes. Write an x86 assembler in Java.

If you're generating x86 assembly, the next step is obviously to assemble it.
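
As a rough idea of what that involves, here is a toy sketch that assembles a handful of single-byte 32-bit instructions from text. `TinyAssembler` is a made-up name and the supported subset (`inc`, `dec`, `push`, `pop` on a 32-bit register, plus `nop` and `ret`) is chosen only because those encodings are a fixed opcode plus the register number. A real assembler also needs operands with ModR/M bytes, labels, sections and an object-file writer.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy assembler for a tiny subset of 32-bit x86.
final class TinyAssembler {
    // Register order matches the hardware register numbers 0..7.
    private static final List<String> REGS =
            List.of("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi");

    // Instructions encoded as (base opcode + register number) in 32-bit mode.
    private static final Map<String, Integer> ONE_REG_OPS =
            Map.of("inc", 0x40, "dec", 0x48, "push", 0x50, "pop", 0x58);

    // Instructions without operands.
    private static final Map<String, Integer> NO_OPERAND_OPS =
            Map.of("nop", 0x90, "ret", 0xC3);

    static byte[] assemble(String source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String rawLine : source.split("\n")) {
            // Strip ';' comments and surrounding whitespace.
            String line = rawLine.replaceFirst(";.*", "").trim().toLowerCase(Locale.ROOT);
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+");
            if (parts.length == 1 && NO_OPERAND_OPS.containsKey(parts[0])) {
                out.write(NO_OPERAND_OPS.get(parts[0]));
            } else if (parts.length == 2
                    && ONE_REG_OPS.containsKey(parts[0])
                    && REGS.contains(parts[1])) {
                out.write(ONE_REG_OPS.get(parts[0]) + REGS.indexOf(parts[1]));
            } else {
                throw new IllegalArgumentException(
                        "Unsupported instruction: " + rawLine.trim());
            }
        }
        return out.toByteArray();
    }
}
```

With this sketch, `TinyAssembler.assemble("inc eax\ndec ebx ; comment\npush ecx\nret")` produces the bytes `40 4B 51 C3`.
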

> Seeing as the HotSpot JVM can compile Java bytecode to machine code, I assume there is some mechanism available to compile source code of type A to machine code.

Just because HotSpot can convert Java bytecode to x86 machine code doesn't mean it can convert any other input to the same.

You're essentially asking if one can use a Java JITter to assemble x86 asm. It makes no sense.

> I do need a C Compiler because I'm linking with the cstdlib

No, you need a linker. Nothing about linking necessitates a compiler.
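
To give a feel for the kind of work a linker does, here is a small sketch of one of its core jobs: patching a resolved symbol address into already-assembled code. It assumes a rel32 branch such as `call` (opcode `E8`), whose 4-byte displacement is relative to the address of the next instruction. `Rel32Patcher` and its parameters are hypothetical, and a real linker additionally has to parse object files, lay out sections and build import tables.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: applies one rel32 relocation to a block of machine code.
final class Rel32Patcher {
    // code:        assembled bytes containing a 4-byte placeholder displacement
    // fieldOffset: offset of that displacement field within code
    // codeBase:    address at which code[0] will be loaded
    // target:      resolved address of the symbol being called
    static void patch(byte[] code, int fieldOffset, int codeBase, int target) {
        // The displacement is measured from the end of the 4-byte field,
        // which is the start of the next instruction for call/jmp rel32.
        int nextInstruction = codeBase + fieldOffset + 4;
        ByteBuffer.wrap(code)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(fieldOffset, target - nextInstruction);
    }
}
```

For a `call` at address `0x1000` targeting `0x2000`, the displacement is `0x2000 - 0x1005 = 0x0FFB`, stored as the bytes `FB 0F 00 00`. No compilation is involved at any point.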

Jonathon Reinhart
  • About the `cstdlib` part ... he also needs the library itself; maybe that is what's confusing the OP into thinking he needs a whole compiler? C compilers are usually distributed bundled with all the basic libraries, so that is the common way programmers get hold of them. Plus, if you also use the compiler wrapper to link the final binary, it is easy to miss that the linker exists as a separate standalone tool in the toolchain. – Ped7g Nov 18 '16 at 16:54
  • Thank you for your answer. I just updated the post to include a list of what I am doing right now, I realized I had forgotten to add this and the question thus may have looked incomplete, sorry for that. – skiwi Nov 18 '16 at 23:13