I have started a small project to embed asm for Intel x86 64-bit processors within Java and to compile and run those methods seamlessly (and maybe later emulate those calls as a fallback), so I wonder whether a Java implementation of an x86 compiler already exists. I want support for the advanced instructions as well.

The goal is to end up with something like:

public int myJavaMethod(long value) {
    doSomeJavaStuff();

    int result = 0;
    asm("             push eax",
        "             mov eax, value",
        "             cmp eax, 0",
        "             jne notEqual",
        "             mov result, 1",
        "             jmp done",
        "notEqual:    mov result, 0",
        "done:        pop eax");

    doSomeMoreJavaStuff();
    return result;
}

What I have:

  • The Intel manual for their OPCodes and the Architecture (VOL2)
  • I can officially compile and run an asm("nop") program fragment by now, using JNA

Current challenges:

Due to the nature of the task, I will try to gain access to the underlying JVM stack frames. This is possible (remember, you get the stack pointer esp and the base pointer ebp, and even the addresses that the ret op will place in the eip pointer), but it couples the solution to a particular JVM and build target. I do not have a problem with that, since one can create various target adapters, and poking around in the JVM implementations will be fun and interesting.

One problem with the string-based solution above is the handling of parameters and results. It might be necessary to provide a parameter mapping such as asm(params(result, value, whatever), "opcodes") and to place output variables either directly in the right memory slots or in the registers the JIT used for the variables passed to params. Alternatively, one could do something like result = asmResult(result) to read the changed value from the data section of the asm call and store it in the correct variable.

By applying Unsafe to this information, one can even access object and class variables and navigate those values, obtaining the addresses and types of all references used within the asm code. So asm("mov eax, super.field") would be possible without interfering with Java, unless the JIT does some interesting optimization such as register caching. That seems unlikely, though, since asm(...) is an actual call and caching the contents of object variables across it might violate the concurrency guarantees for objects. But then again, that's a question for the JVM developers at Oracle.
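A minimal sketch of the Unsafe idea, assuming a HotSpot-style JDK where sun.misc.Unsafe is still reachable through reflection (newer JDKs deprecate these methods and may warn or refuse). The class names FieldOffsetDemo and Counter are made up for illustration. Note that objectFieldOffset returns an offset relative to the object, not an absolute address, and the object itself can be moved by the garbage collector.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Hypothetical demo: locate an instance field by offset and modify it through Unsafe. */
final class FieldOffsetDemo {
    /** Illustrative target object; not part of any real API. */
    static final class Counter {
        int hits = 41;
    }

    /** Fetches the Unsafe singleton reflectively (assumption: the JDK still permits this). */
    static Unsafe unsafe() throws ReflectiveOperationException {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        return (Unsafe) f.get(null);
    }

    /** Reads Counter.hits via its offset, increments it in place, and returns the new value. */
    static int readAndBump() {
        try {
            Unsafe u = unsafe();
            Counter c = new Counter();
            // Offset of the field inside the object layout, not an absolute address.
            long offset = u.objectFieldOffset(Counter.class.getDeclaredField("hits"));
            u.putInt(c, offset, u.getInt(c, offset) + 1);
            return c.hits;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible on this JVM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAndBump());
    }
}
```

Generated asm would have to combine such an offset with the object's current address, which is exactly the part that depends on the JVM and GC in use.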

So if you know of any existing Java or C code that can help me with something like this, I would welcome it.

The best help would be a Java implementation of an x86 compiler that can already translate asm constructs into a series of bytes (op codes). I could then simply create a real ASM method from the calls above and the information gathered by inspecting the method's byte code, along with Unsafe and whatever other sources of information are available.
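To make "translate asm constructs into a series of bytes" concrete, here is a minimal sketch that hand-encodes three instructions. TinyX86Encoder is a hypothetical name, not an existing library; the encodings used (0x90 for nop, 0xB8 plus a little-endian imm32 for mov eax, imm32, and 0xC3 for ret) come from the Intel manual. A real assembler also needs operand parsing, ModR/M and REX handling, and label resolution, none of which is attempted here.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical mini-encoder: turns a few fixed x86-64 instructions into raw bytes. */
final class TinyX86Encoder {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** nop: single byte 0x90. */
    TinyX86Encoder nop() {
        out.write(0x90);
        return this;
    }

    /** mov eax, imm32: opcode 0xB8 followed by the immediate in little-endian order. */
    TinyX86Encoder movEaxImm32(int imm) {
        out.write(0xB8);
        for (int shift = 0; shift < 32; shift += 8) {
            out.write((imm >>> shift) & 0xFF);
        }
        return this;
    }

    /** ret (near return): single byte 0xC3. */
    TinyX86Encoder ret() {
        out.write(0xC3);
        return this;
    }

    /** Returns the bytes emitted so far. */
    byte[] toByteArray() {
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] code = new TinyX86Encoder().movEaxImm32(1).ret().toByteArray();
        StringBuilder hex = new StringBuilder();
        for (byte b : code) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim()); // prints "B8 01 00 00 00 C3"
    }
}
```

The resulting byte array is what would be copied into an executable memory block; that step is not shown here because it needs JNA or JNI.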

In the end I just want to write assembler within my Java code without going to the trouble of compiling C/ASM and loading and shipping libraries for different OSes and platforms. That does not make sense. Once this works, one can easily generate "C" function calls on demand and possibly even bypass JNA, getting rid of another dependency.

So if you know of code or projects, or have additional information, it is welcome.

[Update]

This question is not about JNI / JNA. The goal is simply to bypass the use of an external compiler and the loading of a library. The idea is to compile into memory directly from within Java and start the method through a pointer to it, along with some additional data for parameter and value processing.

Think of turning the workflow "write C/ASM, compile using gcc, loadLibrary, call method via JNA" into the workflow "compile (using Java) into memory, call method via JNA". I can already do this; I just want to use more op codes without creating my own compiler front-end. The question is not whether this is possible, but how to do it lazily.

Once I have demonstrated the basic feasibility, I expect people will come along and add more target platforms (I only target Linux and i5 + Intel Xeon v3 processors).

[Update 2]

I am currently investigating NASM, which has been mentioned as a good, actively developed compiler using the Intel notation. I have already asked the developers how a transparent, simple and fast integration (no disk access, for instance) would be possible. If this works, a similar solution for C using a popular compiler would also be possible.

The idea is: edit the Java source in Eclipse, run the program, done. The compilation of the snippets in a different language, their invocation and all the other fun should be completely transparent.

Martin Kersten
  • See http://stackoverflow.com/questions/11632078/code-injecting-assembly-inlining-in-java – childofsoong Apr 06 '15 at 23:34
  • *I wonder if there is a Java implementation of a x86 compiler already existing*. No. – Elliott Frisch Apr 06 '15 at 23:46
  • @soong That question started out similarly, but the solutions discussed there are not what I am aiming for and are not valid in this context. I am not looking for byte code or ahead-of-time compilation. I want to embed full ASM within it. – Martin Kersten Apr 06 '15 at 23:56
  • Yeah, I'm afraid your only option is going to be to write native code with embedded assembly. – childofsoong Apr 07 '15 at 00:29
  • Unfortunately there seems to be no Java solution at all. I approached the NASM developers. Let's hope I can integrate the tool without having to manage files and lose speed due to disk access. Maybe they can expose the compilation process through a single method call in a provided library. – Martin Kersten Apr 07 '15 at 01:12
  • @MartinKersten Both of the answers provided here already told you that there is no Java solution. [NASM is open source](http://www.nasm.us/pub/nasm/snapshots/latest/) but also Intel specifications. – m0skit0 Apr 11 '15 at 21:24
  • There is at least one Java solution now. I will open source it once it works as desired. Currently I am writing an ELF64 loader in Java that will load the code segments, do the relocation and symbol binding, and invoke the code using a JNI / JNA lib. I can load some ASM, but external symbols are still tricky to bind. So yes, there is a Java solution right now: I can use it to access memory, manipulate it, do jumps and all such stuff, but when I want to use external symbols / functions I still have to do some more work. – Martin Kersten Apr 12 '15 at 02:27
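As a rough illustration of the first step of such an ELF64 loader, the sketch below reads a few fields of the ELF64 file header from a byte array. Elf64Header is a hypothetical helper, not the loader mentioned in the comment; the field offsets follow the ELF-64 format, and only little-endian files are handled. Section parsing, relocation and symbol binding are left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: reads selected fields of an ELF64 file header. */
final class Elf64Header {
    final int machine;              // e_machine; 62 (0x3E) means x86-64
    final long entry;               // e_entry
    final long sectionHeaderOffset; // e_shoff
    final int sectionHeaderCount;   // e_shnum

    private Elf64Header(int machine, long entry, long shoff, int shnum) {
        this.machine = machine;
        this.entry = entry;
        this.sectionHeaderOffset = shoff;
        this.sectionHeaderCount = shnum;
    }

    /** Validates the identification bytes and extracts the header fields. */
    static Elf64Header parse(byte[] image) {
        if (image.length < 64) {
            throw new IllegalArgumentException("too short for an ELF64 header");
        }
        if (image[0] != 0x7F || image[1] != 'E' || image[2] != 'L' || image[3] != 'F') {
            throw new IllegalArgumentException("bad ELF magic");
        }
        if (image[4] != 2) { // EI_CLASS: 2 = ELFCLASS64
            throw new IllegalArgumentException("not a 64-bit ELF file");
        }
        if (image[5] != 1) { // EI_DATA: 1 = little-endian
            throw new IllegalArgumentException("only little-endian is handled in this sketch");
        }
        ByteBuffer buf = ByteBuffer.wrap(image).order(ByteOrder.LITTLE_ENDIAN);
        int machine = buf.getShort(0x12) & 0xFFFF;
        long entry = buf.getLong(0x18);
        long shoff = buf.getLong(0x28);
        int shnum = buf.getShort(0x3C) & 0xFFFF;
        return new Elf64Header(machine, entry, shoff, shnum);
    }
}
```

For an object file produced by an assembler, sectionHeaderOffset and sectionHeaderCount are the starting point for locating the code, relocation and symbol table sections.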

1 Answer

I don't see how you can actually do this. I think you're missing a lot of things here. The main ones that come to mind at first thought:

  • Java compiles to bytecode, which is nowhere near Intel's assembly. The JVM is a stack-based machine, while Intel CPUs are (mainly) register-based.
  • Java has no direct access to hardware or system calls since it runs on a VM, whereas most inline assembly is written for exactly this purpose.
  • It won't run on non-Intel CPUs (but I guess you have already accounted for this one), and you would probably also need to target a specific OS if you plan on doing system calls, which defeats one of the main (forgotten) goals of Java.
  • You would probably need a custom JVM to be able to run native instructions directly.

It could be theoretically possible, but definitely not a trivial matter.

If you meant an Intel assembler or emulator in Java, that's another story.

m0skit0
  • As I wrote I can compile into memory and execute the asm function already (JNA). But adding all those nifty op codes myself is ... well ... lots of work. – Martin Kersten Apr 06 '15 at 23:58
  • For compiling and writing you just use JNA (you can also write a C program for that and use JNI) to allocate a memory block, mark it executable and store your op codes in it. Then call another C method, passing it a pointer to your method in memory, and either use jmp or, better, push the pointer onto the stack and use the ret op code. This way you can even return from your asm snippet and do some additional work, like mapping results back to Java. – Martin Kersten Apr 07 '15 at 00:01
  • >It could be theoretically possible, but definitely not a trivial matter. < -- Already did it. I just want to have more possibilities – Martin Kersten Apr 07 '15 at 00:02
  • Also, no custom JVM is necessary. The only thing that comes into play is the inner workings of the JVM, which are comparable across minor versions from the same vendor on the same platform, and of course the OS. – Martin Kersten Apr 07 '15 at 00:04
  • Losing the runs everywhere promise is actually a 'who cares' issue and also I count on people adding more target platforms, OS support and chips on the go. – Martin Kersten Apr 07 '15 at 00:04
  • No, you didn't do it. If you did it, you wouldn't be asking, would you? NOP is trivial, but how would you implement say INT? IN? OUT? What about special memory addresses, like mapped memory? If you're doing it through JNI you're actually not doing it through Java, but through native libraries. Again, I think you should read more about x86 assembly. Not all assembly is as trivial as the example you put. – m0skit0 Apr 07 '15 at 08:05